The document discusses elementary statistics and statistical methodology. It covers descriptive statistics topics like mean, median, mode, variance and standard deviation. It also covers inferential statistics topics like hypothesis testing, t-tests, z-tests, F-tests, ANOVA, correlation, and regression. Examples of applying statistical analyses in Excel are provided, including calculating confidence intervals and performing hypothesis tests to compare sample means.
This document provides information on performing a one-way analysis of variance (ANOVA). It discusses the F-distribution, key terms used in ANOVA like factors and treatments, and how to calculate and interpret an ANOVA test statistic. An example demonstrates how to conduct a one-way ANOVA to determine if three golf clubs produce different average driving distances.
This document provides an overview of one-way analysis of variance (ANOVA) and randomized block ANOVA. It introduces ANOVA as a technique to compare two or more population means by analyzing sample variances. For one-way ANOVA, it demonstrates how to calculate sums of squares for treatments (SST), errors (SSE), and test statistics to determine if there are differences in population means. For randomized block ANOVA, it shows how total variability is partitioned into sums of squares for treatments, blocks, and errors, and how mean squares are used to calculate test statistics to analyze differences between treatments and blocks.
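As a rough illustration of the one-way calculation summarized above, the sketch below computes the treatment sum of squares (SST), the error sum of squares (SSE), and the F statistic in plain Python. The function name `one_way_anova` and the driving distances are hypothetical, made up for this example and not taken from the summarized documents.

```python
from statistics import mean

def one_way_anova(groups):
    """Return (SST, SSE, F) for a list of samples, one list per treatment."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    k = len(groups)        # number of treatments
    n = len(all_values)    # total number of observations

    # SST: variation of the treatment means around the grand mean
    sst = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # SSE: variation of each observation around its own treatment mean
    sse = sum((x - mean(g)) ** 2 for g in groups for x in g)

    mst = sst / (k - 1)    # mean square for treatments
    mse = sse / (n - k)    # mean square for error
    return sst, sse, mst / mse

# Hypothetical driving distances (yards) for three golf clubs
clubs = [[250, 254, 246], [260, 262, 258], [240, 244, 242]]
sst, sse, f_stat = one_way_anova(clubs)
print(f"SST={sst:.1f}  SSE={sse:.1f}  F={f_stat:.2f}")
```

The resulting F would then be compared with the critical value of the F-distribution with k - 1 and n - k degrees of freedom.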
The Mann-Whitney U test is a nonparametric statistical test used to compare two independent groups when the dependent variable is either ordinal or continuous, but not normally distributed. It works by ranking all the observations from both groups together and comparing the sums of the ranks between the two groups. The student conducted traffic counts before and after a new retail development to test if there was a significant difference using the Mann-Whitney U test, calculating U values for each group and comparing them to the critical value at a 0.05 significance level. The test revealed a significant difference, suggesting the development impacted local traffic flows.
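The ranking procedure described above can be sketched as follows. The helper names and the traffic counts are hypothetical, and the sketch assumes the convention U = R - n(n + 1)/2 for each group, where R is that group's rank sum; some textbooks use the complementary formula, which swaps the two U values but leaves the smaller one unchanged.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of the 1-based positions i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

def mann_whitney_u(a, b):
    """Return (U_a, U_b) from the rank sums of the pooled observations."""
    ranks = average_ranks(list(a) + list(b))
    r_a = sum(ranks[:len(a)])
    r_b = sum(ranks[len(a):])
    u_a = r_a - len(a) * (len(a) + 1) / 2
    u_b = r_b - len(b) * (len(b) + 1) / 2
    return u_a, u_b

# Hypothetical traffic counts before and after a development
before = [12, 15, 11, 18, 14]
after = [20, 17, 22, 19, 16]
u_before, u_after = mann_whitney_u(before, after)
print(u_before, u_after, min(u_before, u_after))
```

The smaller of the two U values is the one compared with the tabulated critical value for the chosen significance level.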
The document discusses multiple linear regression analysis to predict gasoline mileage using automobile data. It covers the basics of regression modeling, assessing model fit, and diagnostics. Key steps include fitting a linear regression model of miles per gallon as the response variable against predictors like vehicle weight, engine size, and more. The document also demonstrates how to perform the multiple regression analysis in R using the automobile data set.
Please Subscribe to this Channel for more solutions and lectures
http://www.youtube.com/onlineteaching
Chapter 10: Correlation and Regression
10.1: Correlation
The document provides an overview of correlation, regression, and other statistical methods. It defines correlation as measuring the association between two variables, while regression finds the best fitting line to predict a dependent variable from an independent variable. Simple linear regression uses one predictor variable, while multiple linear regression uses two or more. Logistic regression is used for nominal dependent variables. Nonlinear regression fits curved lines to nonlinear data. The document provides examples and guidelines for choosing the appropriate statistical test based on the type of variables.
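To make the distinction between correlation and simple linear regression concrete, here is a minimal sketch that computes Pearson's r and the least-squares line from the same sums of squares. The function name `pearson_and_line` and the data points are hypothetical.

```python
from math import sqrt
from statistics import mean

def pearson_and_line(x, y):
    """Return (r, slope, intercept) for paired observations x and y."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    r = sxy / sqrt(sxx * syy)     # strength of the linear association
    slope = sxy / sxx             # least-squares slope of y on x
    intercept = my - slope * mx   # the fitted line passes through the means
    return r, slope, intercept

# Hypothetical paired data
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
r, slope, intercept = pearson_and_line(x, y)
print(f"r={r:.3f}  y = {intercept:.2f} + {slope:.2f}x")
```

Correlation is symmetric in x and y, whereas the regression line treats y as the dependent variable, so swapping the arguments changes the slope and intercept but not r.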
This document provides an overview of various statistical tests for comparing variables, including t-tests, ANOVA, MANOVA, ANCOVA, and MANCOVA. It defines each test and provides examples of their proper usage. T-tests are used to compare two groups on a continuous variable, including paired and unpaired, parametric and non-parametric versions. ANOVA and MANOVA are used to compare three or more groups and two or more dependent variables, respectively. ANCOVA and MANCOVA control for covariates/confounding variables in one-way and two-way designs with single or multiple dependent variables. Examples and best practices are given for selecting and conducting each type of test.
Chi-Square test for independence of attributes / Chi-Square test for checking association between two categorical variables, Chi-Square test for goodness of fit
The document discusses goodness-of-fit tests for categorical data. It introduces notation for categorical variables with multiple categories and hypotheses for goodness-of-fit tests. Expected counts are calculated based on hypothesized proportions. The chi-square statistic is used to calculate test statistics and P-values are found using the chi-square distribution. Examples demonstrate applying goodness-of-fit tests to determine if variable categories occur with equal frequency.
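The goodness-of-fit calculation outlined above might look like the sketch below, which derives expected counts from hypothesized proportions and sums (observed - expected)² / expected over the categories. The function name `chi_square_gof` and the observed counts are hypothetical; the null hypothesis assumed here is that all four categories are equally likely.

```python
def chi_square_gof(observed, proportions):
    """Return (expected counts, chi-square statistic, degrees of freedom)."""
    n = sum(observed)
    expected = [n * p for p in proportions]
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return expected, stat, len(observed) - 1

# Hypothetical counts for four categories, tested against equal frequencies
observed = [18, 22, 30, 30]
expected, chi2, df = chi_square_gof(observed, [0.25] * 4)
print(expected, round(chi2, 2), df)
```

The statistic is then referred to the chi-square distribution with the returned degrees of freedom to obtain a P-value.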
The chapter discusses analysis of variance (ANOVA), including one-way and two-way ANOVA tests. It outlines the goals of understanding when to use ANOVA, different ANOVA designs, how to perform single-factor hypothesis tests and interpret results, conduct post-hoc multiple comparisons procedures, and analyze two-factor ANOVA tests. The key aspects covered include partitioning total variation into between-group and within-group variation, calculating sum of squares, mean squares, and F statistics to test for differences between group means. Post-hoc procedures like Tukey-Kramer are also introduced to determine which specific group means are significantly different from each other.
Please Subscribe to this Channel for more solutions and lectures
http://www.youtube.com/onlineteaching
Chapter 12: Analysis of Variance
12.1: One-Way ANOVA
Version 8 of SigmaXL statistical software includes several new features that make multiple comparisons easier. It adds Analysis of Means charts for comparing normal, binomial, and Poisson distributions in one-way and two-way settings. It also improves multiple comparisons procedures for one-way ANOVA, adds tests for equal variances, improves chi-square tests and associations, and includes new descriptive statistics, templates, and calculators.
T test, independent sample, paired sample and ANOVA, by Qasim Raza
The document discusses various statistical analyses that can be performed in SPSS, including t-tests, ANOVA, and post-hoc tests. It provides details on one-sample t-tests, independent t-tests, paired t-tests, one-way ANOVA tests, and evaluating assumptions like normality. Examples are given on how to conduct these tests in SPSS and how to interpret the output. Guidance is provided on follow-up post-hoc tests that can be used after ANOVA to examine differences between specific groups.
The document discusses hypothesis testing methods for comparing two population or treatment means. It covers notation, sampling distributions, large sample hypothesis testing, confidence intervals, and paired t-tests. An example compares the mean fill volumes of two beer can filling machines and constructs a 98% confidence interval for the difference in tensile strengths of two thread types.
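A large-sample confidence interval for a difference in means, of the kind mentioned above, can be sketched as (x̄1 - x̄2) ± z·sqrt(s1²/n1 + s2²/n2). The function name `diff_means_ci` and the summary statistics are hypothetical; they are not the figures from the summarized document.

```python
from math import sqrt
from statistics import NormalDist

def diff_means_ci(mean1, sd1, n1, mean2, sd2, n2, confidence=0.98):
    """Large-sample z interval for the difference of two population means."""
    # Two-sided critical value, e.g. about 2.33 for 98% confidence
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    se = sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)  # standard error of the difference
    diff = mean1 - mean2
    return diff - z * se, diff + z * se

# Hypothetical summary statistics for two samples
low, high = diff_means_ci(50, 4, 64, 48, 3, 36)
print(f"98% CI for the difference: ({low:.3f}, {high:.3f})")
```

If the interval excludes zero, the data are consistent with a real difference between the two means at the corresponding significance level.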
The document provides an overview of regression analysis techniques, including linear regression and logistic regression. It explains that regression analysis is used to understand relationships between variables and can be used for prediction. Linear regression finds relationships when the dependent variable is continuous, while logistic regression is used when the dependent variable is binary. The document also discusses selecting the appropriate regression model and highlights important considerations for linear and logistic regression.
This document discusses various statistical concepts including outliers, transforming data, normalizing data, weighting data, robustness, and homoscedasticity and heteroscedasticity. Outliers are values far from other data points and should be carefully examined before removing. Data can be transformed using logarithms, square roots, or other functions to better fit a normal distribution or equalize variances between groups. Normalizing data puts variables on comparable scales. Weighting data adjusts for under- or over-representation in samples. Robust tests are resistant to violations of assumptions. Homoscedasticity refers to equal variances between groups while heteroscedasticity refers to unequal variances.
The document discusses analysis of variance (ANOVA) which is used to compare the means of three or more groups. It explains that ANOVA avoids the problems of multiple t-tests by providing an omnibus test of differences between groups. The key steps of ANOVA are outlined, including partitioning variation between and within groups to calculate an F-ratio. A large F value indicates more difference between groups than expected by chance alone.
This document discusses quantitative research methods and analysis of variance (ANOVA). It covers one-way ANOVA, which allows comparison of three or more groups, and examples comparing differences between age groups and types of bumpers. Requirements for ANOVA like normality and independence are addressed. Post-hoc tests for identifying specific group differences are also introduced.
1. The document discusses linear correlation and regression between plasma amphetamine levels and amphetamine-induced psychosis scores using data from 10 patients.
2. A positive correlation was found between the two variables, and a linear regression equation was established to predict psychosis scores from amphetamine levels.
3. However, further statistical tests were needed to determine if the correlation and regression model could be generalized to the overall patient population.
The document provides instructions for performing the Mann-Whitney U test and the Chi-squared test. The Mann-Whitney U test can be used to compare two independent groups when the dependent variable is either ordinal or continuous. It involves ranking all observations from both groups together and comparing the sums of the ranks from each group. The Chi-squared test determines if there is a significant association between two categorical variables. It involves calculating expected frequencies and comparing them to observed frequencies using a Chi-squared distribution. Examples are given for performing both tests and interpreting their results.
Correlation & Regression Analysis using SPSS, by Parag Shah
Concepts of correlation, simple linear regression, and multiple linear regression, and their analysis using SPSS. How to check the validity of assumptions in regression.
The document describes how to conduct and interpret a paired samples t-test in SPSS. It explains that a paired samples t-test is used to compare the means of two related variables measured on the same subjects. It provides an example using reaction time data collected from participants before and after drinking a beer. It outlines the steps to check assumptions, run the t-test in SPSS, and interpret the output, finding that participants had significantly slower reaction times after consuming alcohol.
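Outside SPSS, the statistic behind a paired samples t-test is simply the mean of the within-subject differences divided by its standard error. The sketch below shows that calculation; the function name `paired_t` and the reaction times are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev

def paired_t(before, after):
    """Return (t statistic, degrees of freedom) for paired measurements."""
    diffs = [a - b for a, b in zip(after, before)]
    n = len(diffs)
    # stdev() is the sample standard deviation (n - 1 in the denominator)
    t = mean(diffs) / (stdev(diffs) / sqrt(n))
    return t, n - 1

# Hypothetical reaction times (ms) for the same five participants
before = [250, 260, 245, 270, 255]
after = [262, 268, 251, 284, 265]
t_stat, df = paired_t(before, after)
print(f"t={t_stat:.3f}  df={df}")
```

A positive t here means the second measurement is larger on average, which for reaction times corresponds to slower responses.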
The document discusses how to use a chi-squared (χ²) test to examine differences between observed and expected frequencies of categorical data. It provides guidelines for when a chi-squared test is appropriate, how to perform the calculation, and how to interpret the results. A case study example is presented of a student analyzing questionnaire responses about the 2012 Olympics using a chi-squared test to determine if response frequencies differed significantly between demographic groups.
This document provides an overview of elementary statistics topics including descriptive statistics, inferential statistics, probability, different types of data and scales of measurement, common statistical tests like t-tests, z-tests, F-tests, chi-square tests, ANOVA, correlation, and regression. It also includes examples of how to calculate and interpret descriptive statistics like the mean, median, mode, variance, and standard deviation. Examples are provided on how to set up and conduct hypothesis tests using Excel.
This document provides an overview of descriptive statistics, inferential statistics, and regression analysis using PASW Statistics software. It discusses topics such as frequency analysis, measures of central tendency, hypothesis testing, t-tests, ANOVA, chi-square tests, correlation, and linear regression. The document is divided into multiple parts that cover opening and manipulating data files, descriptive statistics, tests of significance, regression analysis, and chi-square/ANOVA. It also discusses importing/exporting data and using scripts in PASW Statistics.
This document provides an overview of key concepts in inferential statistics. Inferential statistics allows researchers to make inferences about populations based on samples. It includes techniques like hypothesis testing, t-tests, analysis of variance (ANOVA), regression analysis, and more. The goal is to determine if observed differences are statistically significant rather than due to chance. Inferential statistics helps estimate parameters and analyze variability using statistical models and software.
The document discusses hypothesis testing and statistical analysis techniques. It covers univariate, bivariate, and multivariate statistical analysis, which involve one, two, or three or more variables, respectively. The key steps of hypothesis testing are outlined, including deriving a null hypothesis from the research objectives, obtaining and measuring a sample, comparing the sample value to the hypothesis, and determining whether to support or not support the hypothesis based on consistency. Type I and Type II errors in hypothesis testing are defined. Common statistical tests like chi-square, t-tests, ANOVA, and correlation are introduced along with concepts like significance levels, p-values, and degrees of freedom.
Statistical tests help justify if sample results can be applied to a population. ANOVA compares group means and is preferred over t-tests for 3+ groups. It calculates variation between and within groups to obtain an F-ratio. If the F-ratio exceeds its critical value, the null hypothesis that group means are equal is rejected, showing group means differ significantly. Two-way ANOVA extends this to consider two factors' influence, computing interaction effects between factors.
This document provides an overview of biostatistics and statistical methods used for medical and biological data. It discusses topics including descriptive statistics, statistical inference through estimation, hypothesis testing, and confidence intervals. Specific statistical tools covered include correlation, regression, chi-square tests, multivariate techniques like PCA and clustering, and time series analysis. Examples are provided for hypothesis testing on means and proportions. The document defines key biostatistical concepts like variables, parameters, statistics, and sampling distributions.
Statistical tests of significance and Student's t-test, by VasundhraKakkar
Statistical tests of significance are explained, along with the steps involved in carrying them out and the types of significance test. Student's t-test is also explained.
LEARNING OUTCOMESKnow what descriptive statistics are an.docx, uploaded by smile790243
LEARNING OUTCOMES
Know what descriptive statistics are and why they are used
Create and interpret tabulation tables
Use cross-tabulations to display relationships
Perform basic data transformations
Understand the basics of testing hypotheses using inferential statistics
Z test
The Nature of Descriptive Analysis
Descriptive Analysis
The elementary transformation of raw data in a way that describes the basic characteristics such as central tendency, distribution, and variability.
Histogram
A graphical way of showing a frequency distribution in which the height of a bar corresponds to the observed frequency of the category.
EXHIBIT 14.1 Levels of Scale Measurement and Suggested Descriptive Statistics
Cross-Tabulation
Cross-Tabulation
Addresses research questions involving relationships among multiple less-than interval variables.
Results in a combined frequency table displaying one variable in rows and another variable in columns.
Contingency Table
A data matrix that displays the frequency of some combination of responses to multiple variables.
Marginals
Row and column totals in a contingency table, which are shown in its margins.
Cross-Tabulation Table
Did you watch the movie Into The Woods? Yes No
What’s your gender? Male Female
(Observed distribution)
          No    Yes   Total
Male      14      3      17
Female    15     17      32
Total     29     20      49
Cross-tab: Project Assignment Thirty respondents were asked if they have the access to the 4G network and if they have used mobile banking services. The results showed that 11 people do not have the access to 4G and have not used mobile banking, 4 people have the access to 4G but have not used mobile banking, 12 people have the access to 4G and have used mobile banking, and 3 people do not have the access to 4G but have used mobile banking (using friends’ smartphone).
Present the results in a cross-tabulation table in Project Assignment.
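One possible way to lay out the table is sketched below, using only the four counts stated in the assignment text. The variable names and the print layout are illustrative choices, not a prescribed format.

```python
from collections import Counter

# Counts stated in the assignment: (has 4G access, has used mobile banking)
counts = Counter({
    ("No", "No"): 11,
    ("Yes", "No"): 4,
    ("Yes", "Yes"): 12,
    ("No", "Yes"): 3,
})

levels = ["No", "Yes"]
# Marginals: row totals (4G access) and column totals (mobile banking)
row_totals = {r: sum(counts[(r, c)] for c in levels) for r in levels}
col_totals = {c: sum(counts[(r, c)] for r in levels) for c in levels}
grand_total = sum(counts.values())

print("Rows: 4G access; columns: used mobile banking")
print(f"{'':<8}{'No':>6}{'Yes':>6}{'Total':>7}")
for r in levels:
    print(f"{r:<8}{counts[(r, 'No')]:>6}{counts[(r, 'Yes')]:>6}{row_totals[r]:>7}")
print(f"{'Total':<8}{col_totals['No']:>6}{col_totals['Yes']:>6}{grand_total:>7}")
```

The grand total should equal the thirty respondents, which is a quick check that no combination was miscounted.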
Cross-Tabulation Table
Convert frequency table to percentage table.
Statistical base – the number of respondents or observations (in a row or column) used as a basis for computing percentages.
What was the percentage of males who watched the movie?
What was the percentage of moviegoers who were male?
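The two questions differ only in the statistical base. A small sketch using the observed gender-by-movie counts quoted earlier makes this explicit; the variable names are illustrative.

```python
# Observed counts from the gender-by-movie table: rows are gender,
# inner keys are the answer to "Did you watch the movie?"
table = {
    "Male": {"No": 14, "Yes": 3},
    "Female": {"No": 15, "Yes": 17},
}

males_total = sum(table["Male"].values())                    # row base
watchers_total = sum(row["Yes"] for row in table.values())   # column base

# Same numerator (males who watched), two different bases
pct_males_who_watched = 100 * table["Male"]["Yes"] / males_total
pct_watchers_who_are_male = 100 * table["Male"]["Yes"] / watchers_total

print(f"{pct_males_who_watched:.1f}% of males watched the movie")
print(f"{pct_watchers_who_are_male:.1f}% of moviegoers were male")
```

Dividing by the row total answers the first question and dividing by the column total answers the second.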
Cross-Tabulation Table
% of males watched the movie.
               No    Yes           Total (base)
Male                 3/17=17.6%
Female
Total
% of moviegoers were male.
               No    Yes           Total
Male                 3/20=15%
Female
Total (base)
Compare these two tables: which one does a better job of displaying the relationship between gender and moviegoing, i.e., whether a person’s gender affects whether they watched the movie?
Cross-Tabulation Table
Percentages are computed in the direction of the “independent” variable, e.g., gender.
          No            Yes           Total
Male      14/29=48%     3/20=15%
Female    15/29=52%     17/20=85%
Total     29/29=100%    20/20=100%
Note that as you have learned from Ch. 9 Experiments and project assignment, gender is NOT an independent variable, because we cannot alter or manipulate part ...
Chi square test social research refer.pptSnehamurali18
This document discusses various statistical tests, including parametric tests that require normally distributed data like t-tests and ANOVA, non-parametric tests that don't require normality like the Mann-Whitney U test, and the chi-square test. It explains that chi-square is used to determine if there is a relationship between two categorical variables by comparing observed and expected frequencies in a contingency table. It provides steps for conducting a chi-square test including stating hypotheses, calculating expected values, determining degrees of freedom, finding the test statistic, and interpreting results. Two examples of applying chi-square to test associations between disease prevalence and other factors are also presented.
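The steps listed above (expected values, degrees of freedom, test statistic) can be sketched for any contingency table. The function name `chi_square_independence` is hypothetical, and the example reuses the observed gender-by-movie counts quoted earlier on this page instead of the disease-prevalence data, which is not reproduced here. No continuity correction is applied.

```python
def chi_square_independence(table):
    """Return (expected counts, chi-square statistic, degrees of freedom)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    # Expected count under independence: row total * column total / n
    expected = [[r * c / n for c in col_totals] for r in row_totals]
    stat = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return expected, stat, df

# Rows: Male, Female; columns: did not watch, watched
observed = [[14, 3], [15, 17]]
expected, chi2, df = chi_square_independence(observed)
print(f"chi-square={chi2:.2f}  df={df}")
```

For this table the statistic comes to about 5.78 with 1 degree of freedom, which exceeds the usual 0.05 critical value of 3.841, so under these assumptions the null hypothesis of no association would be rejected.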
To make tests of hypotheses about more than two population means, we use the:
t distribution
normal distribution
chi-square distribution
analysis of variance distribution
Researchers use several tools and procedures for analyzing quantitative data obtained from different types of experimental designs. Different designs call for different methods of analysis. This presentation focuses on:
T-test
Analysis of variance (F-test), and
Chi-square test
2
You randomly select two households and observe whether or not they own a telephone answering machine. Which of the following is a simple event?
At most one of them owns a telephone answering machine.
This document provides a 100-question practice exam for the QNT 275 final exam. It covers topics in statistics including hypothesis testing, data types, measurement scales, sampling, descriptive statistics, and inferential statistics. Sample questions are multiple choice and cover topics like hypothesis tests, measurement scales, sampling methods, descriptive vs inferential statistics, and data analysis techniques like ANOVA. The practice exam allows students to test their understanding of key statistical concepts.
Here are the key differences between supervised and unsupervised learning:
Supervised Learning:
- Uses labeled examples/data to learn. The labels provide correct answers for the learning algorithm.
- The goal is to build a model that maps inputs to outputs based on example input-output pairs.
- Common algorithms include linear/logistic regression, decision trees, k-nearest neighbors, SVM, neural networks.
- Used for classification and regression predictive problems.
Unsupervised Learning:
- Uses unlabeled data where there are no correct answers provided.
- The goal is to find hidden patterns or grouping in the data.
- Common algorithms include clustering, association rule learning, self-organizing maps.
-
DirectionsSet up your IBM SPSS account and run several statisti.docxjakeomoore75037
Directions:
Set up your IBM SPSS account and run several statistical outputs based on the "SPSS Database" Use "Setting Up My SPSS" to set up your SPSS program on your computer or device. You may also use programs such as Laerd Statistics or Intellectus, if you subscribe to them.
The patient outcome or dependent variables and the level of measurement must be displayed in a comparison table which you will provide as an Appendix to the paper. Refer to the "Comparison Table of the Variable's Level of Measurement."
Submit a 1,000-1,250 word data analysis paper outlining the procedures used to analyze the parametric and non-parametric variables in the mock data, the statistics reported, and a conclusion of the results.
Provide a conclusive result of the data analyses based on the guidelines below for statistical significance.
PAIRED SAMPLE T-TEST: Identify the variables BaselineWeight and InterventionWeight. Using the Analysis menu in SPSS, go to Compare Means, Go to the Paired Sample t-test. Add the BaselineWeight and InterventionWeight in the Pair 1 fields. Click OK. Report the mean weights, standard deviations, t-statistic, degrees of freedom, and p level. Report as t(df)=value, p = value. Report the p level out three digits.
INDEPENDENT SAMPLE T-TEST: Identify the variables InterventionGroups and PatientWeight. Go to the Analysis Menu, go to Compare Means, Go to Independent Samples tT-test. Add InterventionGroups to the Grouping Factor. Define the groups according to codings in the variable view (1=Intervention, 2 =Baseline). Add PatientWeight to the test variable field. Click OK. Report the mean weights, standard deviations, t-statistic, degrees of freedom, and p level. Report t(df)=value, p = value. Report the p level out three digits
CHI-SQUARE (Independent): Identify the variables BaselineReadmission and InterventionReadmission. Go to the Analysis Menu, go to Descriptive Statistics, go to Crosstabs. Add BaselineReadmission to the row and InterventionReadmission to the column. Click the Statistics button and choose Chi-Square. Select eta to report the Effect Size. Click suppress tables. Click OK. Report the frequencies of the total events, the chi-square statistic, degrees of freedom, and p level. Report ꭓ2 (df) =value, p =value. Report the p level out three digits.
MCNEMAR (Paired): Identify the variables BaselineCompliance and InterventionCompliance. Go to the Analysis Menu, go to Descriptive Statistics, go to Crosstabs. Add BaselineCompliance to the row and InterventionCompliance to the column. Click the Statistics button and choose Chi-Square and McNemars. Select eta to report the Effect Size. Click suppress tables. Click OK. Report the frequencies of the events, the Chi-square, and the McNemar’s p level. Report (p =value). Report the p level out three digits.
MANN WHITNEY U: Identify the variables InterventionGroups and PatientSatisfaction. Using the Analysis Menu, go to Non-parametric Statistics, go to LegacyDialogs, go to 2 I.
This document discusses statistical models and inferential statistics. It defines statistical modeling as using mathematical tools and statistical conclusions to understand real-life situations. There are three main types of statistical models: parametric models which have known parameters; nonparametric models which have flexible parameters; and semi-parametric models which are a blend of the two. Inferential statistics are used to draw conclusions about populations based on samples, while descriptive statistics describe sample characteristics. Common inferential statistics techniques include hypothesis testing, regression analysis, z-tests, t-tests, f-tests, and confidence intervals.
The document discusses various statistical tests for analyzing relationships between variables, including tests for statistical independence, chi-square tests, and analysis of variance (ANOVA). It explains that statistical independence is when the probability of two variables occurring together equals the product of their individual probabilities. Chi-square tests compare observed and expected frequencies to test if variables are independent. ANOVA decomposes variance and can test if population means are equal. It distinguishes explained from unexplained variance.
Pampers CaseIn an increasingly competitive diaper market, P&G’.docxbunyansaturnina
Pampers Case
In an increasingly competitive diaper market, P&G’s marketing department wanted to formulate new approaches to the construction and marketing of Pampers to position them effectively against Hugggies without cannibalizing Luvs. They surveyed 300 mothers of infants. Each was given a randomly selected brand of diaper (either Pampers, Luvs, or Huggies) and asked to rate that diaper on nine attributes and to give her overall preference for the brand. Preference was obtained on a 7-point Likert scale (1=not at all preferred, 7=greatly preferred). Diaper ratings on the nine attributes were also obtained on 7-point Likert scale (1=very unfavorable, 7=very favorable). The study was designed so that each of the three brands appeared 100 times. The goal of the study was to learn which attributes of diapers were most important in influencing purchase preference (Y). The nine attributes used in study were:
Variable | Attribute | Marketing options
X1 | count per box | Desire large counts per box?
X2 | price | Pay a premium price?
X3 | value | Promote high value
X4 | skin care | Offer high degree of skin care
X5 | style | Prints/color vs. plain diapers
X6 | absorbency | Regular vs. superabsorbency
X7 | leakage | Narrow/tapered vs. regular crotch
X8 | comfort/size | Extra padding and form-fitting gathers
X9 | taping | Re-sealable tape vs. regular tape
Question (will be discussed in week 8):
If you don’t have SPSS software at home, you may be able to download a trial version (good for 21 days) from spss.com → software → statistics family → PASW Statistics 17.0 → click “free trial” and download.
1. Run a regression analysis for brand preference that includes all independent variables in the model, and describe how meaningful the model is. Interpret the results for management.
6. Correlation and Regression
Statistics Associated with Frequency Distribution: Measures of Location
The mean, or average value, is the most commonly used measure of central tendency. The mean, $\bar{X}$, is given by
$\bar{X} = \sum_{i=1}^{n} X_i / n$
where
$X_i$ = observed values of the variable X
n = number of observations (sample size)
The mode is the value that occurs most frequently. It represents the highest peak of the distribution. The mode is a good measure of location when the variable is inherently categorical or has otherwise been grouped into categories.
The median of a sample is the middle value when the data are arranged in ascending or descending order.
http://www.city-data.com/
Skewness. The tendency of the deviations from the mean to be larger in one direction than in the other. It can be thought of as the tendency for one tail of the distribution to be heavier than the other.
Kurtosis is a measure of the relative peakedness or flatness of the curve defined by the frequency distribution. The kurtosis of a normal distribution is zero. If.
2. Introduction
Statistical methodology
Steps of scientific research
Important parametric tests
Important nonparametric tests
Examples using the Excel program
Using Excel for Statistics in Gateway Cases – Office 2007
Elementary statistics
2
3. Most people become familiar with probability and statistics through radio, television, newspapers and magazines. For example, the following statements were found in newspapers:
Eating 10 grams (g) of fiber a day reduces the risk of heart attack by 14%.
Thirty minutes (of exercise) two or three times each week can raise HDLs 10 to 15%.
4. Statistics is used to analyze the results of surveys and as a tool in scientific research to make decisions based on controlled experiments.
Other uses of statistics include operations research, quality control, estimation and prediction.
6. Statistics, as the basis of data analysis, is concerned with two basic types of problems:
(1) summarizing, describing, and exploring the data. This problem is covered by descriptive statistics.
(2) using sampled data to infer the nature of the process which produced the data. This problem is covered by inferential statistics.
7. Statistics plays an important role in the description of mass phenomena.
Data are organized and summarized for clear presentation and ease of communication.
Data may come from studies of populations or samples.
It offers methods to summarize a collection of data. These methods may be numerical or graphical, both of which have their own advantages and disadvantages.
8. Inferential statistics is used to draw conclusions about a data set. Usually this means drawing inferences about a population from a sample, either by estimating some relationships or by testing some hypothesis.
A Population is the set of all possible states of a random variable. The size of the population may be either infinite or finite.
A Sample is a subset of the population; its size is always finite.
9. Descriptive Statistics
Graphical: arrange data in tables; bar graphs and pie charts
Numerical: percentages; averages; range
Relationships: correlation coefficient; regression analysis
Inferential Statistics
Confidence interval
Compare means of two samples: t Test, F-Test
Compare means from three samples: ANOVA = analysis of variance (F-Test); Pre/post (LSD, DMRT)
10. Another important aspect of data analysis is the data, which can be of two different types:
qualitative data, e.g. sex, color, smell, taste
quantitative data, e.g. height, weight, percentage
Qualitative data does not contain quantitative information.
Qualitative data can be classified into categories.
11. Type of Scale | Possible Statements | Allowed Operators | Examples
nominal scale | identity, countable | =, ≠ | colors, phone numbers, feelings
ordinal scale | identity, less than/greater than relations, countable | =, ≠, <, > | soccer league table, military ranks, energy efficiency classes
interval scale | identity, less than/greater than relations, equality of differences | =, ≠, <, >, +, - | dates (years), temperature in Celsius, IQ scale
ratio scale | identity, less than/greater than relations, equality of differences, equality of ratios, zero point | =, ≠, <, >, +, -, ×, ÷ | velocities, lengths, temperature in Kelvin, age
12. Steps of scientific research:
Collecting the necessary facts
Analyzing the facts (Descriptive Statistics, Inferential Statistics)
Assessing the results
Making decisions
Carrying out decisions
13. Mode = The most frequent value
Median = The value of the middle point of the ordered measurements
Mean = The average (balancing point in the distribution)
Variance = The average of the squared deviations of all the population measurements from the population mean
Standard deviation = The square root of the variance
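The definitions of the mean, variance, and standard deviation above can be written compactly as formulas. The symbols used here ($x_i$ for the individual measurements, $N$ for the population size, $\mu$ and $\sigma$ for the population mean and standard deviation) are notation assumed for illustration; they do not appear on the original slide.

```latex
\mu = \frac{1}{N}\sum_{i=1}^{N} x_i, \qquad
\sigma^2 = \frac{1}{N}\sum_{i=1}^{N} (x_i - \mu)^2, \qquad
\sigma = \sqrt{\sigma^2}
```

When only a sample of n measurements is available, the variance is usually estimated with n - 1 in the denominator instead of N.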
16. Hypothesis = an assumption or supposition to be proved or disproved.
Example: “the automobile A is performing as well as automobile B.”
17. Null hypothesis (H0) = expresses no difference. Often said “H naught”.
H0: μ = 0 (or any number)
Later: H0: μ1 = μ2
Alternative hypothesis (H1, also written HA)
H0: μ = 0; Null Hypothesis
HA: μ ≠ 0; Alternative Hypothesis
18. Type I error (α): reject H0 | H0 true
Type II error (β): accept H0 | H1 true
19. Calculated F value is greater than the critical F value: significant, so reject H0.
Calculated F value is lower than the critical F value: not significant, so accept H0 (that is, fail to reject it).
20. Decision (from the data) versus truth:
                                | Truth: H0 correct     | Truth: HA correct
Decide H0 (“fail to reject H0”) | 1 - α (True Negative) | β (False Negative)
Decide HA (“reject H0”)         | α (False Positive)    | 1 - β (True Positive)
α = significance level
1 - β = power
22. Z-test is based on the normal probability distribution and is used for judging the significance of several statistical measures, particularly the mean.
The z-test is generally used for comparing the mean of a sample to some hypothesized mean for the population in the case of a large sample (n > 30).
23. T-test is based on the t-distribution and is considered an appropriate test for judging the significance of a sample mean, or for judging the significance of the difference between the means of two samples, in the case of small sample(s) when the population variance is not known (in which case we use the variance of the sample as an estimate of the population variance).
The t-test applies only in the case of small sample(s) when the population variance is unknown.
Unknown variance, under H0: $t = \dfrac{\bar{X} - \mu_0}{s/\sqrt{n}} \sim t_{(n-1)}$
Critical values: statistics books or computer.
The t-distribution is approximately normal for degrees of freedom (df) > 30.
24. F-test is based on the F-distribution and is used to compare the variances of two independent samples. This test is also used in the context of analysis of variance (ANOVA) for judging the significance of more than two sample means at one and the same time.
The test statistic, F, is calculated and compared with its probable value (to be seen in the F-ratio tables for different degrees of freedom for greater and smaller variances at a specified level of significance) for accepting or rejecting the null hypothesis.
25. ANOVA tables: for a one-way ANOVA with N observations and T treatments.
Source    | df             | SS                     | MS            | F
Treatment | T-1            | SStrt                  | = SStrt/(T-1) | MStrt/MSerr
Error     | by subtraction | SSerr (by subtraction) | = SSerr/dferr |
Total     | N-1            |                        |               |
Finally, you (or the PC) consult tables or otherwise obtain the probability of obtaining this F value given the df for treatment and error.
26. 1: Calculate N, Σx, Σx² for the whole dataset.
2: Find the correction factor: CF = (Σx * Σx) / N
3: Find the total sum of squares for the data: SStotal = Σ(xi²) - CF
4: Add up the totals for each treatment in turn (Xt.), then calculate the treatment sum of squares: SStrt = Σt(Xt. * Xt.)/r - CF, where Xt. = sum of all values within treatment t, and r is the number of observations that went into that total.
5: Draw up the ANOVA table, getting the error terms by subtraction.
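The correction-factor steps can be sketched in a few lines of Python. This is only an illustration, not the deck's own method (the deck uses Excel); the function name `one_way_anova` is hypothetical, and the data are the orange weights from the three suppliers used in the single-factor ANOVA case study of this deck.

```python
# One-way ANOVA by the correction-factor method (steps 1-5 above).
# Data: orange weights for suppliers A, B, C from the deck's case study.
groups = {
    "A": [150, 151, 152, 153, 154],
    "B": [148, 150, 152, 154, 156],
    "C": [146, 148, 150, 152, 154],
}

def one_way_anova(groups):
    """Return (ss_trt, ss_err, df_trt, df_err, f_value) for a dict of samples."""
    all_values = [x for values in groups.values() for x in values]
    n_total = len(all_values)                        # step 1: N
    cf = sum(all_values) ** 2 / n_total              # step 2: correction factor
    ss_total = sum(x * x for x in all_values) - cf   # step 3: total SS
    # step 4: treatment SS from the per-treatment totals
    ss_trt = sum(sum(v) ** 2 / len(v) for v in groups.values()) - cf
    # step 5: error terms by subtraction
    ss_err = ss_total - ss_trt
    df_trt = len(groups) - 1
    df_err = (n_total - 1) - df_trt
    f_value = (ss_trt / df_trt) / (ss_err / df_err)
    return ss_trt, ss_err, df_trt, df_err, f_value

ss_trt, ss_err, df_trt, df_err, f_value = one_way_anova(groups)
print(f"SStrt={ss_trt:.3f} SSerr={ss_err:.3f} df=({df_trt},{df_err}) F={f_value:.3f}")
```

The F value obtained this way (about 0.889) agrees with the value the deck reports from Excel for the same data. The critical F value still has to be looked up in a table or obtained from software.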
27. Experimental designs:
Complete Randomize Design (CRD)
Randomize Complete Block Design (RBD)
Latin Square (LQ)
Mean comparison tests:
LSD: Least Significant Difference
DMRT: Duncan’s New Multiple Range Test
Terms: Treatments, Replication, Degree of freedom (df)
28. Most people have difficulties in determining whether a model is linear or nonlinear.
Before discussing the issues of linear vs. nonlinear systems, let's have a short look at some examples, displaying several types of discrimination lines between two classes:
[Figure: example discrimination lines between two classes, labeled nonlinear and linear]
29. Here's the answer: linear models are linear in the parameters which have to be estimated, but not necessarily in the independent variables.
This explains why the middle of the three figures above shows a linear discrimination line between the two classes, although the line is not linear in the sense of a straight line.
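A minimal sketch of what "linear in the parameters" means: the model y = a + b·x² is a curve in x, yet a and b can be estimated with ordinary linear least squares after replacing x by z = x². The function name and the data are hypothetical and chosen only for illustration; they are not taken from the deck.

```python
# Fit y = a + b * x**2 by ordinary least squares.
# The curve is not a straight line in x, but the model is linear in a and b.

def fit_linear_in_parameters(xs, ys):
    """Least-squares estimates (a, b) for y = a + b * z, where z = x**2."""
    zs = [x * x for x in xs]                 # transform the independent variable
    n = len(zs)
    z_mean = sum(zs) / n
    y_mean = sum(ys) / n
    s_zy = sum((z - z_mean) * (y - y_mean) for z, y in zip(zs, ys))
    s_zz = sum((z - z_mean) ** 2 for z in zs)
    b = s_zy / s_zz                          # slope with respect to z
    a = y_mean - b * z_mean                  # intercept
    return a, b

xs = [0.0, 1.0, 2.0, 3.0, 4.0]               # hypothetical inputs
ys = [2.0 + 3.0 * x * x for x in xs]         # hypothetical noise-free responses
a, b = fit_linear_in_parameters(xs, ys)
print(round(a, 6), round(b, 6))
```

Because the data were generated without noise from a = 2 and b = 3, the fit recovers those two parameters.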
30. When calculating a regression model, we are interested in a measure of the usefulness of the model.
There are several ways to do this, one of them being the coefficient of determination (also sometimes called goodness of fit).
The concept behind this coefficient is to calculate the reduction of the error of prediction when the information provided by the x values is included in the calculation.
31. Thus the coefficient of determination specifies the amount of sample variation in y explained by x.
For simple linear regression the coefficient of determination is simply the square of the correlation coefficient between Y and X.
[Figure: correlation scale running from -1 (strong negative linear relationship) through 0 (no linear relationship) to +1 (strong positive linear relationship)]
32. The correlation coefficient, also called Pearson's product moment correlation after Karl Pearson, is calculated by
$r = \dfrac{\sum (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum (x_i - \bar{x})^2 \, \sum (y_i - \bar{y})^2}}$
The correlation coefficient may take any value between -1.0 and +1.0.
Assumptions:
linear relationship between x and y
continuous random variables
both variables must be normally distributed
the observations (x, y pairs) must be independent of each other
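As a hedged illustration of the correlation coefficient and of the coefficient of determination for simple linear regression (the square of r), the sketch below computes both from deviations about the means. The function name `pearson_r` and the five data pairs are hypothetical, not values from the deck.

```python
import math

def pearson_r(xs, ys):
    """Pearson product moment correlation of two equally long sequences."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    s_xy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    s_xx = sum((x - x_mean) ** 2 for x in xs)
    s_yy = sum((y - y_mean) ** 2 for y in ys)
    return s_xy / math.sqrt(s_xx * s_yy)

xs = [1, 2, 3, 4, 5]          # hypothetical x values
ys = [2, 4, 5, 4, 5]          # hypothetical y values
r = pearson_r(xs, ys)
r_squared = r ** 2            # coefficient of determination for simple linear regression
print(f"r = {r:.4f}, r^2 = {r_squared:.2f}")
```

For these made-up pairs, r² is 0.6, meaning 60% of the sample variation in y is explained by x.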
34. χ² test is based on the chi-square distribution and, as a parametric test, is used for comparing a sample variance to a theoretical population variance:
$\chi^2 = \dfrac{\sigma_s^2}{\sigma_p^2}\,(n - 1)$
where
$\sigma_s^2$ = variance of the sample;
$\sigma_p^2$ = variance of the population;
(n - 1) = degrees of freedom, n being the number of items in the sample.
36. In quality control, there are situations when we need to know whether a sample mean lies within the confidence limits of the entire population. This can be accomplished by using the t-distribution to determine confidence limits for a population mean using a selected probability.
We will use the Excel function TINV( ) to obtain the critical value of the t-distribution.
EXAMPLE I
37. Ten cans of sliced pineapple were removed at random from a population of 1000 cans. The drained weights of the contents were measured as 410.5, 411.4, 410.4, 412.6, 411.9, 411.5, 412.5, 411.4, 411.5, 410.1 g. Determine the 95% confidence limits for the entire population.
38. We will first calculate the average of the ten data values using the AVERAGE() function. Next we will determine the standard deviation of the sample using the STDEV() function. Then we will use the following expression to estimate the lower and upper limits of the population mean:
$\bar{X} \pm t \cdot s/\sqrt{n}$, where t is the critical value of the t-distribution with n - 1 degrees of freedom.
39. Discussion:
The results show that the 95% confidence lower and upper limits for the population mean are 410.78 and 411.98, respectively.
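The same confidence limits can be reproduced outside Excel. The sketch below is an illustration only; it hard-codes the two-tailed critical t value for α = 0.05 and 9 degrees of freedom (about 2.262, the value Excel's TINV(0.05, 9) returns) as an assumption, to avoid depending on a statistics library.

```python
import math
import statistics

# Drained weights (g) of the ten cans from Example I.
weights = [410.5, 411.4, 410.4, 412.6, 411.9, 411.5, 412.5, 411.4, 411.5, 410.1]

# Assumption: two-tailed critical t for alpha = 0.05 and df = 9,
# hard-coded instead of being computed from the t-distribution.
T_CRITICAL = 2.262

n = len(weights)
mean = statistics.mean(weights)          # counterpart of Excel AVERAGE()
s = statistics.stdev(weights)            # counterpart of Excel STDEV() (sample SD)
margin = T_CRITICAL * s / math.sqrt(n)   # t * s / sqrt(n)
lower, upper = mean - margin, mean + margin
print(f"95% confidence limits: {lower:.2f} to {upper:.2f}")
```

The limits agree with the 410.78 and 411.98 reported in the discussion slide.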
40. When a sample is taken from a large population and analyzed for selected data, statistical analysis is helpful in obtaining estimates for the total population from which the sample was obtained. In this worksheet, we will use Excel's built-in data analysis techniques to determine various statistical descriptors for the sample and the population.
EXAMPLE II
41. Case study: Color Data
A sample of 10 breads is obtained from a conveyor belt exiting a baking oven. The breads are analyzed for color by comparing them with a standard color chart. The values recorded, in customized color units, are as follows:
34, 33, 36, 37, 31, 32, 38, 33, 34, and 35.
Estimate the mean, variance, and standard deviation of the population.
42. We will use the Data Analysis capability of Excel in determining the descriptive statistics for the given data. First, you should make sure that Data Analysis... is available under the menu command Data. If it is not available, then see the next slide for details on how to add this analysis package.
43. Click the Microsoft Office Button, and then click Excel Options.
Click Add-ins. In the Manage box, select Excel Add-ins.
Click Go.
In the Add-Ins Available box, select the Analysis ToolPak check box and click OK. (If ToolPak is not listed, click Browse to locate it.)
44. Step 1 Open a new worksheet expanded to full size.
Step 2 In cells A2:A11, type the text labels and data values.
45. Step 3 Choose the menu items Data, Data Analysis .... A dialog box will open as shown.
Step 4 Double click on Descriptive Statistics.
46. Step 5 In the edit box for Input Range:, type the range of cells as $A$2:$A$11.
Step 6 Select the radio button Columns.
Step 7 In output range type A13. Click OK.
Step 8 Excel will calculate the descriptive statistics and display results in cells A13:B28.
The results indicate that the sample mean is 34.3.
The sample standard deviation, which estimates the population standard deviation, is 2.214, and
the sample variance, which estimates the population variance, is 4.9.
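For readers without the Analysis ToolPak, a rough equivalent of part of the Descriptive Statistics output can be sketched with Python's standard `statistics` module. The dictionary keys are illustrative names, not Excel's labels, and only a subset of Excel's descriptors is computed.

```python
import statistics

# Bread color values from Example II (customized color units).
color = [34, 33, 36, 37, 31, 32, 38, 33, 34, 35]

summary = {
    "mean": statistics.mean(color),
    "median": statistics.median(color),
    "modes": statistics.multimode(color),             # every most-frequent value
    "sample_variance": statistics.variance(color),    # n - 1 in the denominator
    "sample_std_dev": statistics.stdev(color),
    "range": max(color) - min(color),
}
for name, value in summary.items():
    print(name, value)
```

The mean, sample variance, and sample standard deviation computed this way agree with the 34.3, 4.9, and 2.214 shown on the slide.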
47. t = (difference between samples) / (variability)
Excel will automatically calculate t-values to compare:
Means of two datasets with equal variances
Means of two datasets with unequal variances
Two sets of paired data
abs(t-score) < abs(t-critical): accept H0 (that is, fail to reject it). There is insufficient evidence that the observed differences reflect real, significant differences.
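To make the ratio "difference between samples / variability" concrete, here is a minimal sketch of the equal-variance (pooled) two-sample t statistic. The function name `pooled_t` and the two tiny samples are hypothetical; the critical value and p value are not computed here and would still come from tables or software such as Excel.

```python
import math
import statistics

def pooled_t(sample1, sample2):
    """Return (t, df) for the two-sample t-test assuming equal variances."""
    n1, n2 = len(sample1), len(sample2)
    diff = statistics.mean(sample1) - statistics.mean(sample2)   # difference between samples
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * statistics.variance(sample1)
                  + (n2 - 1) * statistics.variance(sample2)) / df
    std_err = math.sqrt(pooled_var * (1 / n1 + 1 / n2))          # variability
    return diff / std_err, df

t, df = pooled_t([1, 2, 3], [2, 3, 4])   # hypothetical data
print(round(t, 4), df)
```

The absolute value of t is then compared with the critical t for df degrees of freedom, as described above.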
48. A researcher wishes to test whether the mean heavy metal content of soil differs after a war threat versus before the war threat. The research hypothesis is that the mean after the war threat will exceed the mean before the war threat.
Use Excel to help test the hypothesis for the difference in population means.
EXAMPLE III
49. Step 1 Open a new worksheet expanded to full size.
Step 2 In cells B5:C19, type the text labels and data values.
The null and alternative hypotheses to be tested are:
H0: μ1 - μ2 = 0.0
HA: μ1 - μ2 ≠ 0.0 (two-tailed) or μ1 - μ2 > 0.0 (one-tailed)
50. Step 3 Choose the menu items Data, Data Analysis .... A dialog box will open as shown.
Step 4 Double click on t-Test: Two-Sample Assuming Equal Variances.
52. Notes on the Excel output:
Hypothesized mean difference: change this if you want to know whether the means of the two samples differ by at least some specified amount.
t > tcritical (one-tail), so the mean of sample #1 is significantly larger than the mean of sample #2.
The p value for the one-tailed test is .003, which is less than .05, so we reject the null hypothesis.
t > tcritical (two-tail), so the mean of sample #1 is significantly different from the mean of sample #2.
The p value for the two-tailed test is .007, which is less than .05, so we reject the null hypothesis.
53. In hypothesis testing, it is sometimes not possible to use the same judges for testing different treatments, although it would be desirable to use the same judges to evaluate samples obtained from different treatments.
In such cases, we have a completely randomized design. Using single-factor ANOVA, we can test whether the treatments had any influence on the judges' scores; in other words, do the treatment means differ?
EXAMPLE IV
54. Case study: Weight of oranges data
Consider the weights of oranges from three different suppliers A, B, and C. Five oranges were randomly sampled from each supplier and weighed. The following weights were obtained:
A   | B   | C
150 | 148 | 146
151 | 150 | 148
152 | 152 | 150
153 | 154 | 152
154 | 156 | 154
55. For each treatment (supplier), five oranges were weighed. Therefore, the design was completely randomized. Calculate the F value to determine whether the means of the three treatments are significantly different.
56. We will use the single-factor analysis of variance available in Excel. We will determine the F value at a probability of 0.95 (the 5% significance level).
These computations will allow us to determine whether the means of the three different treatments are significantly different.
First make sure that the Data Analysis... command is available under the menu item Data.
57. Step 1 Open a new worksheet expanded to full size.
Step 2 In cells A4:C8, type the text labels and data values.
58. Step 3 Choose the menu items Data, Data Analysis .... A dialog box will open as shown.
Step 4 Double click on Anova: Single Factor.
59. The results show that the F value is 0.889. The critical F value
at the 5% level is 3.885.
For the example problem, the calculated F value is therefore lower than
the critical value at the 5% level. Thus, we can say that there is no
significant difference among the mean weights of the three suppliers (p > 0.05).
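The F value reported by Excel can be cross-checked by hand. The following is a minimal Python sketch, not part of the original Excel workflow: it computes the one-way ANOVA F statistic for the orange weights from the between-supplier and within-supplier sums of squares. The function name `one_way_anova` is an illustrative choice, and the critical value 3.885 is simply copied from the slide rather than computed.

```python
# One-way (single-factor) ANOVA computed by hand for the orange weights.
weights = {
    "A": [150, 151, 152, 153, 154],
    "B": [148, 150, 152, 154, 156],
    "C": [146, 148, 150, 152, 154],
}


def one_way_anova(groups):
    """Return (F, df_between, df_within) for a list of samples."""
    all_values = [v for g in groups for v in g]
    grand_mean = sum(all_values) / len(all_values)
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
    # Variation of each observation around its own group mean.
    ss_within = sum((v - sum(g) / len(g)) ** 2 for g in groups for v in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_value = (ss_between / df_between) / (ss_within / df_within)
    return f_value, df_between, df_within


f_value, df_b, df_w = one_way_anova(list(weights.values()))
F_CRITICAL_5PCT = 3.885  # critical F at the 5% level, as quoted in the slides

print(f"F = {f_value:.3f} with ({df_b}, {df_w}) degrees of freedom")
print("significant at 5%" if f_value > F_CRITICAL_5PCT else "not significant at 5%")
```

With these data the sketch gives the same F value as the Excel output (about 0.889), which is below the critical value.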
60. When we are interested in evaluating samples for sensory characteristics
using the same judges with samples obtained from multiple treatments,
analysis of variance for a two-factor design without replication is useful.
This analysis helps in determining whether there are significant differences
among the various treatments, as well as whether any significant
differences exist among the judges themselves.
EXAMPLE V
61. Three types of ice cream were evaluated by 11 judges. The judges
assigned the following scores.
Judge   Ice Cream A   Ice Cream B   Ice Cream C
A       16            14            15
B       17            15            17
C       16            16            16
D       18            14            16
E       16            14            14
F       17            16            17
G       18            14            15
H       16            15            16
I       17            14            14
J       18            13            16
K       17            15            15
62. We will use the built-in Analysis ToolPak, available through the Excel
command called Data Analysis....
Three sets of results will be obtained for the 5% level.
63. Step 1 Open a new worksheet expanded to full size.
Step 2. In cells A3:D13, type the text labels and data values.
64. Step 3 Choose the menu items Data, Data Analysis ....
A dialog box will open.
Step 4 Double click on Anova: Two-Factor Without
Replication. A new dialog box will open.
Step 5 Type entries in edit boxes as shown.
Step 6. The results will be displayed in cells
65. For judges, the calculated F value is 1.36. This value is lower than
the critical F value of 2.35 at the 5% level.
The difference among ice cream types is determined by examining the F
values. The F value is calculated as 19.73. This value is greater than
3.49 for the 5% level.
66. The difference among ice cream types is determined by examining the F
values. The F value is calculated as 19.73. This value is greater than the
critical value of 3.49 for the 5% level.
The ice cream types are significantly different at p<0.001.
For judges, the calculated F value is 1.36. This value is lower than the
critical F value of 2.35 at the 5% level.
The judges showed no significant difference in their mean scores.
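The two F values can also be reproduced outside Excel. The following is a minimal Python sketch, assuming the standard partition for a two-factor design without replication (total sum of squares = judges + ice cream types + error). The function name `two_factor_anova` is illustrative, and the critical values are not computed here; they are the ones quoted in the slides.

```python
# Two-factor ANOVA without replication for the ice cream scores.
# Each row is one judge; the columns are ice cream A, B and C.
scores = {
    "A": (16, 14, 15), "B": (17, 15, 17), "C": (16, 16, 16),
    "D": (18, 14, 16), "E": (16, 14, 14), "F": (17, 16, 17),
    "G": (18, 14, 15), "H": (16, 15, 16), "I": (17, 14, 14),
    "J": (18, 13, 16), "K": (17, 15, 15),
}


def two_factor_anova(rows):
    """Return (F_rows, F_columns) for a table with one value per cell."""
    n_rows, n_cols = len(rows), len(rows[0])
    grand_mean = sum(sum(r) for r in rows) / (n_rows * n_cols)
    row_means = [sum(r) / n_cols for r in rows]
    col_means = [sum(r[j] for r in rows) / n_rows for j in range(n_cols)]

    ss_rows = n_cols * sum((m - grand_mean) ** 2 for m in row_means)
    ss_cols = n_rows * sum((m - grand_mean) ** 2 for m in col_means)
    ss_total = sum((v - grand_mean) ** 2 for r in rows for v in r)
    ss_error = ss_total - ss_rows - ss_cols  # what neither factor explains

    df_rows, df_cols = n_rows - 1, n_cols - 1
    ms_error = ss_error / (df_rows * df_cols)
    return (ss_rows / df_rows) / ms_error, (ss_cols / df_cols) / ms_error


f_judges, f_ice_cream = two_factor_anova(list(scores.values()))
print(f"F (judges)          = {f_judges:.2f}")     # compare with 2.35
print(f"F (ice cream types) = {f_ice_cream:.2f}")  # compare with 3.49
```

The sketch agrees with the Excel output: about 1.36 for judges and about 19.73 for ice cream types.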
67. Simple regression analysis involves determining the statistical
relationship between two variables. One of the uses of such analysis is in
predicting one variable on the basis of the other.
We will use the regression analysis available in the Add-in package in
Excel to determine the linear regression between two variables.
EXAMPLE VI
68. Case study: Sensory scores
Data on off-flavor with storage time in a frozen vegetable. Sensory scores
obtained at 0, 1, 2, 3, 4 and 6 months were 1.5, 2, 2, 3, 2.5, and 3.5,
respectively. Assuming that these data can be linearly correlated,
determine the regression coefficient and predict the off-flavor score at
5 months of storage.
69. We will use the package Regression available as an Add-in item in Excel.
We will use this package to obtain the required statistical relationships.
We assume that a linear relationship exists between the off-flavor score
and time (in months) with the equation
y = mx + b,
where y is the off-flavor score, x is time in months, m is the slope and
b is the intercept.
70. Step 1 Open a new worksheet expanded to full size.
Step 2 In cells A4:B9, enter the text labels and data values.
71. Step 3 Choose the menu items Data, Data Analysis .... A dialog box will
open.
Step 4 Double click on Regression.
Step 5 A new dialog box will open. Enter the range of cells for Y and X as
shown. Check boxes for Residuals and Line Fit Plots. Click OK.
72. The results will be displayed. The annotations on the regression output read:
R Square: ~85% of the variation in y is explained by variation in x. The
remainder may be random error, or may be explained by some factor other than x.
F: Ratio of variability explained by model to leftover variability. High
number means model explains most variation in data.
Significance F: Probability of getting this value of F by randomly sampling
from a normally distributed population. Low value means model (rather than
random variability) explains most variation in data.
P-value: Probability of getting a slope or intercept this much different
from zero by randomly sampling from a normally-distributed population.
Lower 95% and Upper 95%: Confidence limits on slope and intercept.
Fitted line: y = 0.31x + 1.58
73. The r² value is calculated as 0.85 and the standard error is 0.318.
The intercept is 1.5786 and the slope is 0.3143.
The linear equation is y = 0.31x + 1.58. The residual output gives the
predicted values for the off-flavor score at different time intervals.
These data are also shown in the chart.
The predicted and calculated values are shown.
The predicted value at 5 months of storage is calculated as 3.13 from the
rounded equation (3.15 with the unrounded slope and intercept).
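The slope, intercept and r² can be checked with the ordinary least-squares formulas. The following is a minimal Python sketch under that assumption; the function name `linear_fit` is illustrative and is not something Excel exposes.

```python
# Simple linear regression (least squares) for the off-flavor scores.
months = [0, 1, 2, 3, 4, 6]
scores = [1.5, 2, 2, 3, 2.5, 3.5]


def linear_fit(x, y):
    """Return (slope, intercept, r_squared) for y = slope * x + intercept."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = sxy ** 2 / (sxx * syy)
    return slope, intercept, r_squared


slope, intercept, r_squared = linear_fit(months, scores)
predicted_5 = slope * 5 + intercept  # off-flavor score predicted at 5 months

print(f"y = {slope:.4f} x + {intercept:.4f}, r^2 = {r_squared:.2f}")
print(f"predicted score at 5 months = {predicted_5:.2f}")
```

The coefficients match the Excel output (slope 0.3143, intercept 1.5786, r² about 0.85). With the unrounded coefficients the prediction at 5 months is 3.15; rounding the coefficients to two decimals first gives 3.13.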
77. Click Microsoft Office Button, and Then Click Excel Options.
Click Add-ins. In Manage Box, Select Excel Add-ins.
Click Go.
In the Add-Ins Available Box, Select Analysis ToolPak Check Box and Click OK.
(If ToolPak Is Not Listed, Click Browse to Locate It.)
78. Click Data/Data Analysis (Far Right)/Descriptive Statistics & OK.
Put Checkmarks on Summary Statistics, 95% or 99% Confidence Interval, &
Labels in First Row Boxes.
Move Cursor to Input Range Window, Highlight Data to Analyze including
Labels, & Click OK.
Your Data will Appear on New Worksheet.
Widen Columns by Clicking Home/Format/AutoFit Column Width.
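For readers who want to check a few of the summary statistics without Excel, the following is a minimal Python sketch using the standard `statistics` module. It reuses the supplier A orange weights from the earlier case study as illustrative input. It covers only the mean, median, sample variance, sample standard deviation and standard error, not the full Descriptive Statistics output or the confidence interval.

```python
import math
import statistics

# Illustrative input: supplier A orange weights from the earlier case study.
weights_a = [150, 151, 152, 153, 154]

mean = statistics.mean(weights_a)
median = statistics.median(weights_a)
variance = statistics.variance(weights_a)  # sample variance (n - 1 denominator)
std_dev = statistics.stdev(weights_a)      # sample standard deviation
std_error = std_dev / math.sqrt(len(weights_a))  # standard error of the mean

print(f"mean = {mean}, median = {median}")
print(f"sample variance = {variance}, standard deviation = {std_dev:.4f}")
print(f"standard error = {std_error:.4f}")
```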
79. Click Data/Data Analysis/Histogram & OK.
Put Checkmarks on Chart Output & New Worksheet Boxes.
Move Cursor to Input Range Window, Highlight Data Going into Histogram.
Move Cursor to Input Bin Range, Highlight Data Showing Upper Value of Each
Bin & Click OK.
Histogram will be on New Worksheet. You May Lengthen it by Clicking Blank
Space in Window, Moving Cursor to Window Bottom Line & Holding Down Mouse
Button as You Pull Down Window.
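The role of the bin range can be illustrated with a short Python sketch. It assumes the convention that each bin counts the values greater than the previous upper value and less than or equal to its own upper value, with anything above the last upper value collected in a final "More" bucket. The data are the Ice Cream B scores from the earlier example; the bin upper values and the function name `histogram_counts` are hypothetical choices for illustration.

```python
from bisect import bisect_left


def histogram_counts(values, upper_bounds):
    """Count values per bin, where each bin is defined by its upper value."""
    bounds = sorted(upper_bounds)
    counts = [0] * len(bounds)
    more = 0  # values above the last upper bound
    for v in values:
        # Index of the first bin whose upper value is >= v.
        i = bisect_left(bounds, v)
        if i == len(bounds):
            more += 1
        else:
            counts[i] += 1
    return counts, more


scores_b = [14, 15, 16, 14, 14, 16, 14, 15, 14, 13, 15]  # Ice Cream B scores
bin_upper_values = [13, 14, 15, 16]                      # hypothetical bins

counts, more = histogram_counts(scores_b, bin_upper_values)
for upper, count in zip(bin_upper_values, counts):
    print(f"<= {upper}: {count}")
print(f"More: {more}")
```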
80. Go to Sheet One.
Click Data/Data Analysis/ and the Appropriate Statistical Test. Then Click OK.
On New Window Check Labels Box and Put Cursor on Variable 1 Range.
Highlight Variable 1 Data Including Label.
Put Cursor on Variable 2 Range & Highlight Variable 2 Data (Including
Label). Then Click OK.
Click Home/Format/AutoFit Column Width.
81. Go to Sheet One.
Highlight Data (Be Sure X Values are in Left Column and Y Values are in
Right Column).
Click Insert/Scatter. Pull down menu and click Upper Left Icon.
Click a Data Point on Chart with Right Mouse Key, Add Trendline, & Click Linear.
82. Go to Sheet One.
Click Data/Data Analysis (On Far Right)/Regression & Click OK.
On New Window Check Labels Box and Put Cursor on X Range.
Highlight X Data Including Label.
Put Cursor on Y Range & Highlight Y Data (Including Label), Then Click OK.
Click Home/Format/AutoFit Column Width.