The document discusses the chi-square test of goodness of fit, which compares observed data to expected data to determine if the differences are statistically significant. It provides examples of using the chi-square formula to calculate the test statistic and compare it to critical values to evaluate the null hypothesis that the observed and expected data are from the same distribution. A high chi-square value indicates the observed data does not fit the expected distribution well.
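The goodness-of-fit statistic described above can be computed directly. The counts below are made up for illustration (three categories against a roughly uniform expectation), not data from the original slides.

```python
# Chi-square goodness-of-fit: sum over categories of (observed - expected)^2 / expected.
observed = [45, 35, 20]
expected = [33.3, 33.3, 33.4]  # hypothetical uniform expectation over 3 categories

chi_sq = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
df = len(observed) - 1  # degrees of freedom = number of categories - 1

print(f"chi-square = {chi_sq:.2f} with {df} degrees of freedom")
```

The resulting value would then be compared against the chi-square critical value for the chosen significance level and degrees of freedom.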
1. The document discusses key concepts in inferential statistics including point estimation, interval estimation, hypothesis testing, types of errors, p-values, power, and one-tailed and two-tailed tests.
2. It explains that inferential statistics allows generalization from a sample to a population and includes estimation of parameters and hypothesis testing.
3. Common statistical techniques covered are confidence intervals, which provide a range of values that likely contain the true population parameter, and hypothesis testing, which evaluates theories about populations.
This document discusses measures of central tendency and dispersion. It begins by defining measures of central tendency as statistical measures that describe the position of a distribution. The most commonly used measures of central tendency for a univariate context are the mean, median, and mode. The document then discusses the arithmetic mean in detail, including how to calculate the mean for individual, discrete, and continuous data series using direct and shortcut methods. It also covers the geometric mean and how to calculate it using logarithms for individual, discrete, and continuous data series. Various examples and practice problems are provided.
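The two means discussed above can be computed for an individual series in a few lines; the geometric mean uses the logarithm method the document describes (mean of the logs, then the antilog). The values are illustrative.

```python
import math

values = [2, 4, 8, 16]  # hypothetical individual data series

# Arithmetic mean: sum of values divided by their count.
arithmetic_mean = sum(values) / len(values)

# Geometric mean via logarithms: antilog of the mean of the logs.
geometric_mean = math.exp(sum(math.log(v) for v in values) / len(values))

print(arithmetic_mean, round(geometric_mean, 4))
```

For these values the geometric mean equals (2 × 4 × 8 × 16)^(1/4) ≈ 5.657, below the arithmetic mean of 7.5, as it always is for non-constant positive data.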
ANOVA, ANCOVA, MANOVA, and MANCOVA are statistical analyses used to test differences between groups.
ANOVA tests for differences between 2 or more means and partitions variances into sums of squares between and within groups. ANCOVA controls for additional factors, called covariates, to reduce error and increase power.
MANOVA assesses the effect of independent variables on multiple dependent variables simultaneously, accounting for correlation between variables. It tests for overall differences using a multivariate F value. Univariate follow-ups can then examine differences on each individual dependent variable.
MANCOVA extends MANOVA to include covariates, allowing evaluation of differences in multiple dependent variables while controlling for additional continuous factors.

Multiple regression analysis is a powerful technique used for predicting the unknown value of a variable from the known value of two or more variables.
This presentation explains the concepts of ANOVA, ANCOVA, MANOVA, and MANCOVA. It also covers the procedures for running ANOVA, ANCOVA, and MANOVA in SPSS.
Factor analysis is a statistical technique used to identify underlying factors that explain the pattern of correlations within a set of observed variables. It groups variables that are highly correlated with each other into factors to reduce data dimensionality. The key steps are extracting factors with eigenvalues greater than 1, evaluating factor loadings to interpret the grouping of variables, and rotating factors to maximize interpretability of the results. SPSS output includes correlation coefficients, KMO/Bartlett's tests of sampling adequacy, eigenvalues, communalities, scree plots, and rotated component matrices.
This document discusses point estimation and the criteria for a good point estimator. It defines point estimation, estimators, and estimates. The key criteria for a good point estimator are discussed as unbiasedness, consistency, efficiency, and sufficiency. Unbiasedness means the expected value of the estimator is equal to the true parameter value. Consistency means the estimator approaches the true value as the sample size increases. Efficiency refers to the estimator having the minimum possible variance. Sufficiency means the estimator uses all the information in the sample. Examples are provided for each concept.
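Unbiasedness and consistency of the sample mean, two of the criteria above, can be illustrated with a small simulation. The population parameters here (mean 10, standard deviation 2) and the sample sizes are arbitrary choices for the sketch.

```python
import random

random.seed(42)
true_mean = 10.0  # hypothetical population mean

def sample_mean(n):
    """Draw n values from the population and return their mean (the estimator)."""
    return sum(random.gauss(true_mean, 2.0) for _ in range(n)) / n

# Unbiasedness: the average of many independent estimates is close to the true value.
estimates = [sample_mean(30) for _ in range(2000)]
avg_estimate = sum(estimates) / len(estimates)

# Consistency: a single estimate from a much larger sample lands close to the truth.
big_sample_estimate = sample_mean(100_000)

print(round(avg_estimate, 3), round(big_sample_estimate, 3))
```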
The document discusses correlation analysis and the different types of correlation. It defines correlation as the linear association between two random variables. Correlation can be classified in three main ways:
1) Positive vs negative vs no correlation based on the relationship between two variables as one increases or decreases.
2) Linear vs non-linear correlation based on the shape of the relationship when plotted on a graph.
3) Simple vs multiple vs partial correlation based on the number of variables.
The document also discusses methods for studying correlation including scatter plots, Karl Pearson's coefficient of correlation r, and Spearman's rank correlation coefficient. It provides interpretations of the correlation coefficient r and coefficient of determination r2.
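Both coefficients mentioned above can be computed without any external library, exploiting the fact that Spearman's coefficient is simply Pearson's r applied to the ranks (when there are no ties). The data are illustrative.

```python
x = [1, 2, 3, 4, 5, 6]
y = [2, 1, 4, 3, 7, 8]  # hypothetical paired observations

def pearson_r(a, b):
    """Karl Pearson's coefficient of correlation."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b))
    var_a = sum((ai - ma) ** 2 for ai in a)
    var_b = sum((bi - mb) ** 2 for bi in b)
    return cov / (var_a * var_b) ** 0.5

def ranks(a):
    """Rank each value from 1 (smallest) upward; assumes no ties."""
    order = sorted(range(len(a)), key=lambda i: a[i])
    r = [0] * len(a)
    for rank, i in enumerate(order, start=1):
        r[i] = rank
    return r

r = pearson_r(x, y)
rho = pearson_r(ranks(x), ranks(y))  # Spearman = Pearson on the ranks
r_squared = r ** 2                   # coefficient of determination

print(round(r, 3), round(rho, 3), round(r_squared, 3))
```

r² then reads as the proportion of variance in y accounted for by its linear relationship with x.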
Trade-Based Money Laundering: What Compliance Professionals Need to Know (by Alessa)
WATCH WEBINAR - https://www.caseware.com/alessa/webinars/trade-based-money-laundering/
Hundreds of billions of dollars are laundered every year through trade-based money laundering (TBML). Its sophisticated techniques allow criminals to use legitimate trade to disguise the source of illegal proceeds and transfer value across borders without the use of traditional money movement methods.
Laurie Kelly, CAMS shares her knowledge and experiences gained from 20 years leading the AML, fraud, and sanctions compliance functions for a $145 billion U.S. financial institution that provided extensive trade finance services for global exports of U.S. agricultural products. Attendees learn the fundamentals of foreign trade and trade finance, and why these long-established processes are so vulnerable to TBML.
We break down the most common TBML techniques, including the Black Market Peso Exchange, over & under invoicing, and others, using real world case studies. Finally, we review the red flags for these activities and how to incorporate transaction monitoring, sanctioned/restricted party screening, and enhanced customer due diligence to mitigate TBML risks.
The document discusses Stanley Smith Stevens' theory of measurement scales, which proposes that there are four types of measurement scales - nominal, ordinal, interval, and ratio - that differ in their ability to determine relationships between values and perform mathematical operations. Nominal scales only categorize data, ordinal scales can rank order data, interval scales have equal intervals between values, and ratio scales have a true zero point. Proper selection of a measurement scale depends on research objectives, response types, data properties, and other factors.
This document discusses descriptive statistics and analysis. It provides definitions of key terms like data, variable, statistic, and parameter. It also describes common measures of central tendency like mean, median and mode. Additionally, it covers measures of variability such as range, variance and standard deviation. Various graphical and numerical methods for summarizing and presenting sample data are presented, including tables, charts and distributions.
Regression analysis is a statistical technique for predicting a dependent variable based on one or more independent variables. Simple linear regression fits a straight line to the data to predict a continuous dependent variable (y) from a single independent variable (x). The output is an equation of the form y = b0 + b1x + ε, where b0 is the y-intercept, b1 is the slope, and ε is the error. Multiple linear regression extends this to include more than one independent variable. Regression analysis calculates the "best fit" line that minimizes the residuals, or differences between predicted and observed y values.
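The least-squares line described above has a closed-form solution: b1 is the covariance of x and y divided by the variance of x, and b0 follows from the means. A sketch with made-up data:

```python
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]  # hypothetical observations, roughly y = 2x

n = len(x)
mean_x, mean_y = sum(x) / n, sum(y) / n

# Slope: sum of cross-deviations over sum of squared x-deviations.
b1 = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / \
     sum((xi - mean_x) ** 2 for xi in x)
b0 = mean_y - b1 * mean_x  # intercept passes through the point of means

predictions = [b0 + b1 * xi for xi in x]
residuals = [yi - pi for yi, pi in zip(y, predictions)]

print(f"y = {b0:.3f} + {b1:.3f}x")
```

A property worth checking: the least-squares residuals always sum to zero when an intercept is included.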
The document provides an overview of multiple linear regression (MLR). MLR allows predicting a dependent variable from multiple independent variables. It extends simple linear regression by incorporating additional predictors. Key points covered include: purposes of MLR for explanation and prediction; assumptions of the method; interpreting R-squared values; comparing unstandardized and standardized regression coefficients; and testing the statistical significance of predictors.
This document discusses various measures of central tendency including the mean, median, and mode. It provides definitions and formulas for calculating each measure. The mean is the sum of all values divided by the number of values and is the most widely used measure. The median is the middle value when data is arranged from lowest to highest. The mode is the value that occurs most frequently. Examples are given demonstrating how to calculate each measure for both individual values and grouped data.
The range is the simplest measure of variability, defined as the difference between the highest and lowest values in a data set. It is quick to calculate but does not provide a full picture of the data distribution and can be strongly influenced by outliers. Other measures of variability include the average deviation, which calculates the average amount each score deviates from the mean, and the interquartile range, which is less influenced by outliers than the range. The interquartile range only considers data between the first and third quartiles and ignores half the data points.
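The outlier-sensitivity contrast described above is easy to demonstrate: in the illustrative data below, a single extreme value (99) inflates the range but barely moves the interquartile range.

```python
import statistics

data = [4, 5, 6, 7, 8, 9, 10, 11, 99]  # hypothetical scores with one outlier

data_range = max(data) - min(data)  # simplest measure: highest minus lowest

# statistics.quantiles with n=4 returns the three quartile cut points.
q1, q2, q3 = statistics.quantiles(data, n=4)
iqr = q3 - q1  # spread of the middle half of the data

print(data_range, iqr)
```

Here the range is 95 while the IQR stays at 5, reflecting that the IQR ignores the top and bottom quarters of the data.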
Here are the steps I would take to analyze this data using exploratory factor analysis:
1. Check assumptions
- Sample size of 300 is adequate
- Most correlations are between .3 and .8
2. Extract initial factors using principal axis factoring
- Kaiser's criterion suggests 4 factors with eigenvalues > 1
3. Rotate factors orthogonally using varimax rotation
- This will make the factor structure more interpretable
4. Interpret the factors based on which items have strong loadings
- Factor 1 relates to anxiety about learning SPSS
- Factor 2 relates to anxiety about using computers
- Factors 3 and 4 may reflect other aspects of statistics anxiety
5. Compute factor scores if desired, for use in further analyses
This document discusses exploratory factor analysis (EFA). EFA is used to identify underlying factors that explain the pattern of correlations within a set of observed variables. The document outlines the steps of EFA including testing assumptions, constructing a correlation matrix, determining the number of factors, rotating factors, and interpreting the factor loadings. It provides an example of running EFA on a dataset with 11 physical performance and anthropometric variables from 21 participants. The analysis extracts 3 factors that explain over 80% of the total variance.
Credit Card Fraudulent Transaction Detection Research Paper (by Garvit Burad)
A research paper on credit card fraud detection using machine learning techniques such as logistic regression, random forests, and feature engineering, along with methods for handling a highly skewed dataset.
This document defines and provides the formula for calculating mean deviation, which is a measure of variation that uses all the scores in a distribution. It is more reliable than range. Mean deviation is calculated by finding the absolute difference between each score and the mean, summing the absolute differences, and dividing by the number of observations. Two examples of calculating mean deviation for sets of data are provided, along with exercises asking students to find the mean deviation of additional data sets and define standard deviation.
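The mean deviation formula described above translates directly into code. The scores are illustrative.

```python
scores = [4, 6, 8, 10, 12]  # hypothetical distribution

mean = sum(scores) / len(scores)

# Mean deviation: average of the absolute differences from the mean.
mean_deviation = sum(abs(s - mean) for s in scores) / len(scores)

print(mean, mean_deviation)
```

For these scores the absolute deviations are 4, 2, 0, 2, 4, giving a mean deviation of 2.4.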
The document discusses statistical significance, types of errors, and key statistical terms. It defines statistical significance as the strength of evidence needed to reject the null hypothesis, determined before conducting an experiment. There are two types of errors: type I errors reject a true null hypothesis, type II errors accept a false null hypothesis. Key terms discussed include population, parameter, sample, and statistic.
The document discusses the normal distribution, which produces a symmetrical bell-shaped curve. It has two key parameters - the mean and standard deviation. According to the empirical rule, about 68% of values in a normal distribution fall within one standard deviation of the mean, 95% within two standard deviations, and 99.7% within three standard deviations. The normal distribution is commonly used to model naturally occurring phenomena that tend to cluster around an average value, such as heights or test scores.
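The empirical rule quoted above can be checked by simulation. The mean and standard deviation below (100 and 15, roughly an IQ-style scale) are arbitrary choices for the sketch.

```python
import random

random.seed(0)
mu, sigma = 100, 15  # hypothetical population parameters
draws = [random.gauss(mu, sigma) for _ in range(100_000)]

def within(k):
    """Fraction of draws within k standard deviations of the mean."""
    return sum(1 for d in draws if abs(d - mu) <= k * sigma) / len(draws)

# Should land near 0.68, 0.95, and 0.997 respectively.
print(round(within(1), 3), round(within(2), 3), round(within(3), 3))
```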
Residuals represent variation in the data that cannot be explained by the model.
Residual plots are useful for discovering patterns, outliers, or misspecifications of the model. Systematic patterns, where discovered, may suggest how to reformulate the model.
If the residuals exhibit no pattern, then this is a good indication that the model is appropriate for the particular data.
Understanding data types is an important concept in statistics. When designing an experiment, you want to know what type of data you are dealing with, because that determines which statistical analyses, visualizations, and prediction algorithms can be used.
This document discusses multiple regression analysis. It begins by introducing multiple regression as an extension of simple linear regression that allows for modeling relationships between a response variable and multiple explanatory variables. It then covers topics such as examining variable distributions, building regression models, estimating model parameters, and assessing overall model fit and significance of individual predictors. An example demonstrates using multiple regression to build a model for predicting cable television subscribers based on advertising rates, station power, number of local families, and number of competing stations.
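Parameter estimation for a multiple regression model reduces to least squares on a design matrix with an intercept column. The sketch below uses two made-up predictors, not the cable-television example from the slides.

```python
import numpy as np

x1 = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])   # hypothetical predictor 1
x2 = np.array([2.0, 1.0, 4.0, 3.0, 6.0, 5.0])   # hypothetical predictor 2
# Response generated from known coefficients (3, 2, -1) plus small noise.
y = 3.0 + 2.0 * x1 - 1.0 * x2 + np.array([0.1, -0.1, 0.05, -0.05, 0.0, 0.0])

# Design matrix: a leading column of ones estimates the intercept.
X = np.column_stack([np.ones_like(x1), x1, x2])
coeffs, *_ = np.linalg.lstsq(X, y, rcond=None)

b0, b1, b2 = coeffs
print(np.round(coeffs, 2))
```

The fitted coefficients recover values close to the ones used to generate the response, which is the sanity check one would run before interpreting a real model.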
This document discusses the chi-square test and how to calculate expected frequencies. It provides an example of using a chi-square test to analyze observed vs expected frequencies of blood types in children with one parent of type A and one of type B. It also gives another example using a chi-square test to analyze the effectiveness of vaccination in preventing smallpox attacks using a 2x2 contingency table.
Logistic regression allows prediction of discrete outcomes from continuous and discrete variables. It addresses questions like discriminant analysis and multiple regression but without distributional assumptions. There are two main types: binary logistic regression for dichotomous dependent variables, and multinomial logistic regression for variables with more than two categories. Binary logistic regression expresses the log odds of the dependent variable as a function of the independent variables. Logistic regression assesses the effects of multiple explanatory variables on a binary outcome variable. It is useful when the dependent variable is non-parametric, there is no homoscedasticity, or normality and linearity are suspect.
This document provides an overview of descriptive statistics techniques for summarizing categorical and quantitative data. It discusses frequency distributions, measures of central tendency (mean, median, mode), measures of variability (range, variance, standard deviation), and methods for visualizing data through charts, graphs, and other displays. The goal of descriptive statistics is to organize and describe the characteristics of data through counts, averages, and other summaries.
Logistic regression is a statistical method used to predict a binary or categorical dependent variable from continuous or categorical independent variables. It generates coefficients to predict the log odds of an outcome being present or absent. The method assumes a linear relationship between the log odds and independent variables. Multinomial logistic regression extends this to dependent variables with more than two categories. An example analyzes high school student program choices using writing scores and socioeconomic status as predictors. The model fits significantly better than an intercept-only model. Increases in writing score decrease the log odds of general versus academic programs.
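The log-odds formulation above can be sketched without any statistics library: a minimal binary logistic regression fit by gradient descent, with the log odds of y = 1 modeled as b0 + b1·x. The data and learning settings are illustrative, not from the document's student-program example.

```python
import math

x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]  # hypothetical predictor
y = [0, 0, 0, 1, 0, 1, 1, 1]                   # binary outcome, likelier at high x

b0, b1 = 0.0, 0.0
lr = 0.1
for _ in range(5000):
    grad0 = grad1 = 0.0
    for xi, yi in zip(x, y):
        # Sigmoid of the log odds gives the predicted probability of y = 1.
        p = 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))
        grad0 += (p - yi)
        grad1 += (p - yi) * xi
    b0 -= lr * grad0 / len(x)
    b1 -= lr * grad1 / len(x)

def predict_prob(xi):
    return 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))

print(round(b1, 3), round(predict_prob(8.0), 3))
```

A positive fitted b1 means each unit increase in x raises the log odds of the outcome by b1, the interpretation the document applies to writing scores.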
The document discusses the chi-square test of goodness of fit, which compares observed data to expected data to determine if an observed pattern fits a hypothesized or expected pattern. An example is provided of expected student enrollment numbers across 3 statistics professors' classes versus the actual observed enrollments. The chi-square test statistic is calculated by summing the squared differences between observed and expected values divided by the expected values. The calculated chi-square value is then compared to a critical value to determine if the null hypothesis that there is no difference between observed and expected can be rejected. In this example, the calculated chi-square value exceeds the critical value, so the null hypothesis is rejected.
This document provides an overview of chi-square tests, including chi-square goodness of fit and chi-square test of independence. It uses examples from a hypothetical New York City mayoral election poll to demonstrate how to perform each test. The chi-square goodness of fit test determines if the distribution of proportions in a sample fits the expected distribution. The chi-square test of independence determines if there is a relationship between two categorical variables, like voter gender and candidate preference. Both tests use a chi-square calculation and degrees of freedom to obtain a p-value, and Cramér's V can estimate effect size.
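For the test of independence, expected counts come from the row and column totals: expected = (row total × column total) / grand total. A sketch on a made-up 2×2 table (the counts are illustrative, not the poll or vaccination data referenced above):

```python
# 2x2 contingency table, e.g. rows = two groups, columns = two outcomes.
table = [[10, 90],
         [26, 74]]

row_totals = [sum(row) for row in table]
col_totals = [sum(col) for col in zip(*table)]
grand_total = sum(row_totals)

chi_sq = 0.0
for i, row in enumerate(table):
    for j, observed in enumerate(row):
        # Expected count under independence of rows and columns.
        expected = row_totals[i] * col_totals[j] / grand_total
        chi_sq += (observed - expected) ** 2 / expected

df = (len(table) - 1) * (len(table[0]) - 1)  # (rows-1) * (cols-1) = 1 here
print(round(chi_sq, 3), df)
```

The statistic is then compared to the chi-square distribution with 1 degree of freedom to obtain the p-value.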
Trade-Based Money Laundering: What Compliance Professionals Need to KnowAlessa
ย
WATCH WEBINAR - https://www.caseware.com/alessa/webinars/trade-based-money-laundering/
Hundreds of billions of dollars are laundered every year through trade-based money laundering (TBML). Its sophisticated techniques allow criminals to use legitimate trade to disguise the source of illegal proceeds and transfer value across borders without the use of traditional money movement methods.
Laurie Kelly, CAMS shares her knowledge and experiences gained from 20 years in leading the AML, fraud, and sanctions compliance functions for a $145 billion U.S. financial institution that provided extensive trade finance services for global exports of U.S. agricultural products. Attendees learn the fundamentals of foreign trade and trade finance, and why these long-established processes make it so vulnerable to TBML.
We break down the most common TBML techniques, including the Black Market Peso Exchange, over & under invoicing, and others, using real world case studies. Finally, we review the red flags for these activities and how to incorporate transaction monitoring, sanctioned/restricted party screening, and enhanced customer due diligence to mitigate TBML risks.
About Alessa, a CaseWare RCM product:
Alessa is a financial crime detection, prevention and management solution offered by CaseWare RCM Inc. With deployments in more than 20 countries in banking, insurance, FinTech, gaming, manufacturing, retail and more, Alessa is the only platform organizations need to identify high-risk activities and stay ahead of compliance. To learn more about how Alessa can help your organization ensure compliance, detect complex fraud schemes, and prevent waste, abuse and misuse, visit us at caseware.com/alessa.
Connect with us online:
Visit the Alessa WEBSITE: https://www.caseware.com/alessa/
Follow Alessa on LINKEDIN: https://www.linkedin.com/caseware-alessa
Follow Alessa on TWITTER: https://twitter.com/casewarealessa
SUBSCRIBE to Alessa on YouTube: http://tiny.cc/Alessa
The document discusses Stanley Smith Stevens' theory of measurement scales, which proposes that there are four types of measurement scales - nominal, ordinal, interval, and ratio - that differ in their ability to determine relationships between values and perform mathematical operations. Nominal scales only categorize data, ordinal scales can rank order data, interval scales have equal intervals between values, and ratio scales have a true zero point. Proper selection of a measurement scale depends on research objectives, response types, data properties, and other factors.
This document discusses descriptive statistics and analysis. It provides definitions of key terms like data, variable, statistic, and parameter. It also describes common measures of central tendency like mean, median and mode. Additionally, it covers measures of variability such as range, variance and standard deviation. Various graphical and numerical methods for summarizing and presenting sample data are presented, including tables, charts and distributions.
Regression analysis is a statistical technique for predicting a dependent variable based on one or more independent variables. Simple linear regression fits a straight line to the data to predict a continuous dependent variable (y) from a single independent variable (x). The output is an equation of the form y= b0 + b1x + ฮต, where b0 is the y-intercept, b1 is the slope, and ฮต is the error. Multiple linear regression extends this to include more than one independent variable. Regression analysis calculates the "best fit" line that minimizes the residuals, or differences between predicted and observed y values.
The document provides an overview of multiple linear regression (MLR). MLR allows predicting a dependent variable from multiple independent variables. It extends simple linear regression by incorporating additional predictors. Key points covered include: purposes of MLR for explanation and prediction; assumptions of the method; interpreting R-squared values; comparing unstandardized and standardized regression coefficients; and testing the statistical significance of predictors.
This document discusses various measures of central tendency including the mean, median, and mode. It provides definitions and formulas for calculating each measure. The mean is the sum of all values divided by the number of values and is the most widely used measure. The median is the middle value when data is arranged from lowest to highest. The mode is the value that occurs most frequently. Examples are given demonstrating how to calculate each measure for both individual values and grouped data.
The range is the simplest measure of variability, defined as the difference between the highest and lowest values in a data set. It is quick to calculate but does not provide a full picture of the data distribution and can be strongly influenced by outliers. Other measures of variability include the average deviation, which calculates the average amount each score deviates from the mean, and the interquartile range, which is less influenced by outliers than the range. The interquartile range only considers data between the first and third quartiles and ignores half the data points.
Here are the steps I would take to analyze this data using exploratory factor analysis:
1. Check assumptions
- Sample size of 300 is adequate
- Most correlations are between .3 and .8
2. Extract initial factors using principal axis factoring
- Kaiser's criterion suggests 4 factors with eigenvalues > 1
3. Rotate factors orthogonally using varimax rotation
- This will make the factor structure more interpretable
4. Interpret the factors based on which items have strong loadings
- Factor 1 relates to anxiety about learning SPSS
- Factor 2 relates to anxiety about using computers
- Factors 3 and 4 may reflect other aspects of statistics anxiety
5. Compute factor scores if desired to use in further
This document discusses exploratory factor analysis (EFA). EFA is used to identify underlying factors that explain the pattern of correlations within a set of observed variables. The document outlines the steps of EFA including testing assumptions, constructing a correlation matrix, determining the number of factors, rotating factors, and interpreting the factor loadings. It provides an example of running EFA on a dataset with 11 physical performance and anthropometric variables from 21 participants. The analysis extracts 3 factors that explain over 80% of the total variance.
Credit Card Fraudulent Transaction Detection Research PaperGarvit Burad
ย
Credit Card Fraudulent Transaction Detection Research Paper using Machine Learning technologies like Logistic Regression, Random Forrest, Feature Engineering and various techniques to deal with highly skewed dataset
This document defines and provides the formula for calculating mean deviation, which is a measure of variation that uses all the scores in a distribution. It is more reliable than range. Mean deviation is calculated by finding the absolute difference between each score and the mean, summing the absolute differences, and dividing by the number of observations. Two examples of calculating mean deviation for sets of data are provided, along with exercises asking students to find the mean deviation of additional data sets and define standard deviation.
The document discusses statistical significance, types of errors, and key statistical terms. It defines statistical significance as the strength of evidence needed to reject the null hypothesis, determined before conducting an experiment. There are two types of errors: type I errors reject a true null hypothesis, type II errors accept a false null hypothesis. Key terms discussed include population, parameter, sample, and statistic.
The document discusses the normal distribution, which produces a symmetrical bell-shaped curve. It has two key parameters - the mean and standard deviation. According to the empirical rule, about 68% of values in a normal distribution fall within one standard deviation of the mean, 95% within two standard deviations, and 99.7% within three standard deviations. The normal distribution is commonly used to model naturally occurring phenomena that tend to cluster around an average value, such as heights or test scores.
Residuals represent variation in the data that cannot be explained by the model.
Residual plots useful for discovering patterns, outliers or misspecifications of the model. Systematic patterns discovered may suggest how to reformulate the model.
If the residuals exhibit no pattern, then this is a good indication that the model is appropriate for the particular data.
Understanding data type is an important concept in statistics, when you are designing an experiment, you want to know what type of data you are dealing with, that will decide what type of statistical analysis, visualizations and prediction algorithms could be used.
#data #data types #ai #machine learning #statistics #data science #data analytics #artificial intelligence
This document discusses multiple regression analysis. It begins by introducing multiple regression as an extension of simple linear regression that allows for modeling relationships between a response variable and multiple explanatory variables. It then covers topics such as examining variable distributions, building regression models, estimating model parameters, and assessing overall model fit and significance of individual predictors. An example demonstrates using multiple regression to build a model for predicting cable television subscribers based on advertising rates, station power, number of local families, and number of competing stations.
This document discusses the chi-square test and how to calculate expected frequencies. It provides an example of using a chi-square test to analyze observed vs expected frequencies of blood types in children with one parent of type A and one of type B. It also gives another example using a chi-square test to analyze the effectiveness of vaccination in preventing smallpox attacks using a 2x2 contingency table.
Logistic regression allows prediction of discrete outcomes from continuous and discrete variables. It addresses questions like discriminant analysis and multiple regression but without distributional assumptions. There are two main types: binary logistic regression for dichotomous dependent variables, and multinomial logistic regression for variables with more than two categories. Binary logistic regression expresses the log odds of the dependent variable as a function of the independent variables. Logistic regression assesses the effects of multiple explanatory variables on a binary outcome variable. It is useful when the dependent variable is non-parametric, there is no homoscedasticity, or normality and linearity are suspect.
This document provides an overview of descriptive statistics techniques for summarizing categorical and quantitative data. It discusses frequency distributions, measures of central tendency (mean, median, mode), measures of variability (range, variance, standard deviation), and methods for visualizing data through charts, graphs, and other displays. The goal of descriptive statistics is to organize and describe the characteristics of data through counts, averages, and other summaries.
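A minimal sketch of these summaries using Python's standard `statistics` module; the exam scores are hypothetical:

```python
import statistics

# Hypothetical sample of exam scores.
scores = [72, 85, 85, 90, 64, 78, 85, 70]

print(statistics.mean(scores))    # arithmetic mean
print(statistics.median(scores))  # middle value of the sorted data
print(statistics.mode(scores))    # most frequent value
print(statistics.pstdev(scores))  # population standard deviation
print(max(scores) - min(scores))  # range
```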
Logistic regression is a statistical method used to predict a binary or categorical dependent variable from continuous or categorical independent variables. It generates coefficients to predict the log odds of an outcome being present or absent. The method assumes a linear relationship between the log odds and independent variables. Multinomial logistic regression extends this to dependent variables with more than two categories. An example analyzes high school student program choices using writing scores and socioeconomic status as predictors. The model fits significantly better than an intercept-only model. Increases in writing score decrease the log odds of general versus academic programs.
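The log-odds link described above can be sketched in a few lines; the coefficients `b0` and `b1` below are hypothetical, not fitted values from the document:

```python
import math

# The logistic model links probability p to a linear predictor via log odds:
#   log(p / (1 - p)) = b0 + b1 * x
# Inverting the log odds gives the logistic (sigmoid) function.

def log_odds(p):
    return math.log(p / (1 - p))

def sigmoid(z):
    return 1 / (1 + math.exp(-z))

# Hypothetical coefficients: intercept -3.0, slope 0.1 per unit of x.
b0, b1 = -3.0, 0.1

x = 40
p = sigmoid(b0 + b1 * x)
print(round(p, 3))  # predicted probability of the outcome at x = 40
```

The two functions are inverses: converting a probability to log odds and back recovers the original value.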
The document discusses the chi-square test of goodness of fit, which compares observed data to expected data to determine if an observed pattern fits a hypothesized or expected pattern. An example is provided of expected student enrollment numbers across 3 statistics professors' classes versus the actual observed enrollments. The chi-square test statistic is calculated by summing the squared differences between observed and expected values divided by the expected values. The calculated chi-square value is then compared to a critical value to determine if the null hypothesis that there is no difference between observed and expected can be rejected. In this example, the calculated chi-square value exceeds the critical value, so the null hypothesis is rejected.
This document provides an overview of chi-square tests, including chi-square goodness of fit and chi-square test of independence. It uses examples from a hypothetical New York City mayoral election poll to demonstrate how to perform each test. The chi-square goodness of fit test determines if the distribution of proportions in a sample fits the expected distribution. The chi-square test of independence determines if there is a relationship between two categorical variables, like voter gender and candidate preference. Both tests use a chi-square calculation and degrees of freedom to obtain a p-value, and Cramér's V can estimate effect size.
08 test of hypothesis large sample.ppt (Pooja Sakhla)
The document discusses quantitative methods and hypothesis testing. It covers key concepts like null and alternative hypotheses, types of hypothesis tests, test statistics, rejection regions, and p-values. Examples are provided to illustrate hypothesis testing for population means, proportions, differences between means and proportions. The goal is to introduce the mechanics and general procedure of hypothesis testing and how to report results through p-values.
09 test of hypothesis small sample.ppt (Pooja Sakhla)
The document provides an overview of quantitative methods for small sample inferences. It discusses the student's t-distribution and its properties for small samples from normal populations when the population variance is unknown. It covers small sample inferences about a single population mean, the difference between two population means for independent and paired samples, and inferences about a single population variance and comparing two population variances. Examples are provided to illustrate hypothesis testing techniques for each quantitative method.
Stat 130 chi-square goodness-of-fit test (Aldrin Lozano)
- The chi-square goodness-of-fit test can be used to determine if a frequency distribution fits a specific pattern or theoretical distribution. It compares observed frequencies to expected frequencies.
- To perform the test, the chi-square statistic is calculated as the sum of (O - E)^2 / E over all categories, where O is the observed frequency and E is the expected frequency. This value is then compared to a critical value from the chi-square distribution based on the degrees of freedom.
- If the chi-square statistic exceeds the critical value, the null hypothesis that the observed and expected frequencies are the same is rejected, indicating a poor fit between the observed and expected distributions.
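The (O - E)^2 / E calculation can be sketched as follows; the observed counts are hypothetical, and 5.991 is the standard chi-square critical value for df = 2 at alpha = 0.05:

```python
# Chi-square goodness-of-fit statistic: sum of (O - E)^2 / E.
# The observed counts below are hypothetical; the null hypothesis
# expects equal frequencies across the three categories.
observed = [18, 25, 17]
expected = [20, 20, 20]

chi_square = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
print(round(chi_square, 2))  # 1.9

# Critical value for df = 3 - 1 = 2 at alpha = 0.05 is 5.991.
reject_null = chi_square > 5.991
print(reject_null)  # False: the observed data fit the expected pattern
```

Because 1.9 does not exceed 5.991, the null hypothesis is retained: the fit is good.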
Null hypothesis for a chi-square goodness of fit test (Ken Plummer)
The document discusses how to write a null hypothesis for a chi-square goodness of fit test. It provides an example of a poll that surveyed voters in Connecticut on their party affiliation (Republican or Democrat). The expected distribution was 40% Republican and 60% Democrat. The null hypothesis is stated as: Republican and Democrat party affiliations occur with probabilities of 0.4 and 0.6, respectively, in Connecticut.
This document describes how to conduct a chi-square goodness of fit test. The test involves:
1) Stating the null and alternative hypotheses. The null hypothesis specifies the expected probabilities, while the alternative is that at least one expected probability is incorrect.
2) Developing an analysis plan specifying the significance level and test to be used.
3) Analyzing sample data to calculate degrees of freedom, expected frequencies, the test statistic, and p-value.
4) Interpreting the results by comparing the p-value to the significance level and rejecting or failing to reject the null hypothesis. An example problem demonstrates applying the test to determine if observed outcomes match a casino's claimed probabilities.
The document discusses the chi-square test, which is used to determine if an observed frequency distribution differs from an expected theoretical distribution. It can be used as a test of independence to determine if two variables are associated, and as a test of goodness of fit to assess how well an expected distribution fits observed data. The steps of the chi-square test are outlined, including calculating the test statistic, determining degrees of freedom, and comparing the statistic to critical values to determine if the null hypothesis can be rejected. An example of a chi-square test of independence is shown to test if perceptions of fairness of performance evaluation methods are independent of each other.
This document provides information about statistical tests and data analysis presented by Dr. Muhammedirfan H. Momin. It discusses the different types of statistical data, such as qualitative vs quantitative and continuous vs discrete data. It also covers topics like sample data sets, frequency distributions, risk factors for diseases, hypothesis testing, and tests for comparing proportions and means. Specific statistical tests discussed include the z-test and how to calculate test statistics and compare them to critical values to determine statistical significance. Examples are provided to illustrate how to perform these tests to analyze differences between data sets.
Here are the steps to solve this problem:
1. State the hypotheses:
H0: μ = 100
H1: μ ≠ 100
2. The critical values are ±1.96 (two-tailed test, α = 0.05)
3. Compute the test statistic:
z = (140 - 100)/(15/√40) ≈ 40/2.37 ≈ 16.87
4. The test statistic is in the critical region, so reject the null hypothesis.
5. There is strong evidence that the medication affected intelligence since the sample mean is much higher than the population mean.
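As a sketch, the z statistic for the stated values (population mean 100, SD 15, n = 40, sample mean 140) can be recomputed directly:

```python
from math import sqrt
from statistics import NormalDist

# Values from the worked problem above.
mu, sigma, n, xbar = 100, 15, 40, 140

# One-sample z statistic: (sample mean - population mean) / (sigma / sqrt(n)).
z = (xbar - mu) / (sigma / sqrt(n))
print(round(z, 2))

# Two-tailed p-value from the standard normal distribution.
p_value = 2 * (1 - NormalDist().cdf(abs(z)))
print(p_value < 0.05)  # True -> reject H0 at alpha = 0.05
```

The statistic is far beyond the ±1.96 critical values, so the null hypothesis is rejected, consistent with the conclusion above.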
This document provides an overview of hypothesis testing including:
- Defining null and alternative hypotheses
- Types of errors like Type I and Type II
- Test statistics and significance levels for comparing means, proportions, and standard deviations of one and two populations
- Examples are given for hypothesis tests on population means, proportions, and comparing two population means.
This document provides an overview of hypothesis testing in inferential statistics. It defines a hypothesis as a statement or assumption about relationships between variables or tentative explanations for events. There are two main types of hypotheses: the null hypothesis (H0), which is the default position that is tested, and the alternative hypothesis (Ha or H1). Steps in hypothesis testing include establishing the null and alternative hypotheses, selecting a suitable test of significance or test statistic based on sample characteristics, formulating a decision rule to either accept or reject the null hypothesis based on where the test statistic value falls, and understanding the potential for errors. Key criteria for constructing hypotheses and selecting appropriate statistical tests are also outlined.
The t-test is used to compare the means of two groups and has three main applications:
1) Compare a sample mean to a population mean.
2) Compare the means of two independent samples.
3) Compare the values of one sample at two different time points.
There are two main types: the independent-measures t-test for samples not matched, and the matched-pair t-test for samples in pairs. The t-test assumes normal distributions and equal variances between groups. Examples are provided to demonstrate hypothesis testing for each application.
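The first application (comparing a sample mean to a population mean) can be sketched as follows; the sample values and the hypothesized mean of 50 are hypothetical:

```python
from math import sqrt
from statistics import mean, stdev

# Hypothetical sample and a hypothesized population mean of 50.
sample = [52, 48, 55, 51, 49, 53, 54, 50]
mu0 = 50

# One-sample t statistic: (sample mean - mu0) / (s / sqrt(n)).
n = len(sample)
t = (mean(sample) - mu0) / (stdev(sample) / sqrt(n))
df = n - 1
print(round(t, 3), df)

# Compare |t| with the critical value from a t-table
# (two-tailed, alpha = 0.05, df = 7: about 2.365).
print(abs(t) > 2.365)  # False -> fail to reject H0
```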
The hypothesis is usually considered the principal instrument in research and quality control. Its main function is to suggest new experiments and observations; in fact, many experiments are carried out with the deliberate object of testing hypotheses. Decision makers often face situations in which they are interested in testing hypotheses on the basis of available information and then taking decisions on the basis of such testing. In Six Sigma methodology, hypothesis testing is a tool of substance used in the analysis phase of a Six Sigma project so that improvement can be made in the right direction.
This document discusses methods, content, pedagogy, and partnerships related to geoscience education. It outlines the scientific method, examines what is covered in K-12 geoscience standards, and explores formative assessment and collaboration as proven teaching practices. It emphasizes that field experiences for teachers and cross-disciplinary partnerships are important for sustainable education reform.
1) The document discusses modifying Fick's law of diffusion using fractional derivatives to account for memory effects not captured by ordinary derivatives.
2) A scaling similarity approach is used to reduce the fractional PDE to an ODE to find an analytical solution.
3) The solution obtained is a function involving parameters determined from the invariance conditions imposed during the similarity transformation to maintain the form of the original PDE.
On generating functions of biorthogonal polynomials (eSAT Journals)
Abstract: In this paper, we have obtained some novel generating functions (both bilateral and mixed trilateral) involving modified biorthogonal polynomials Y_n^(α)(x; k) by a group-theoretic method. As particular cases, we obtain the corresponding results on generalized Laguerre polynomials. Keywords: AMS-2000 Classification Codes 33C45, 33C47; biorthogonal polynomials; Laguerre polynomials; generating functions.
Module1 flexibility-2-problems - Rajesh sir (SHAMJITH KM)
This document discusses the flexibility method for analyzing structures. It provides the definitions of flexibility and stiffness influence coefficients and describes how to develop flexibility matrices for truss, beam, and frame elements using the physical and energy approaches. It then shows how to assemble the total flexibility matrix of a structure and use it to analyze simple structures like plane trusses, continuous beams, and plane frames. The document includes an example problem of a two-member structure to illustrate the flexibility method steps, such as determining static indeterminacy, developing member and system flexibility matrices, evaluating joint displacements and member end actions.
Using Semantics of Textbook Highlights to Predict Student Comprehension and K... (Sergey Sosnovsky)
The document presents a framework for using the semantics of student textbook highlights to predict comprehension and knowledge retention. It uses semantic embeddings to encode highlighted sentences, compares them to questions, and uses the match scores in a model. It finds that augmenting a baseline model with highlighting features improves predictions of question accuracy, especially for held-out students. A semantic encoding of highlights performed better than a positional encoding. The approach works well across different levels of conceptual difficulty as defined by Bloom's taxonomy.
This document provides a student support material for Class XII Biology prepared by Kendriya Vidyalaya Sangathan, New Delhi. It aims to support students in their exam preparation and revision. The material includes lessons in point form, mind maps, flowcharts, diagrams, crossword puzzles, sample test questions from each chapter, and previous year board exam question papers. It is meant to supplement, not replace, the NCERT textbook. A team of experienced biology teachers coordinated by the Additional Commissioner (Academics) of KVS developed this material keeping the CBSE curriculum and question paper format in mind. The Commissioner of KVS hopes this material will help students perform well in their exams.
This document discusses the direct stiffness method for structural analysis. It begins by introducing the direct stiffness method and its key aspects, including using member stiffness matrices to express actions and displacements at both ends of each member. It then provides examples of applying the direct stiffness method to analyze a plane truss member and plane frame member. This involves deriving the member stiffness matrices in local coordinates, and transforming displacement, load, and stiffness matrices between local and global coordinate systems using rotation matrices.
Root Locus: what root locus is and the concept behind it; the angle and magnitude conditions; rules for constructing the root locus, including its symmetry, its starting and termination points, the angle of asymptotes, the centroid, breakaway points, and the angles of departure and arrival; general steps to draw the root locus; and problems on root locus.
This document contains examples and explanations of key concepts in statistics and probability such as hypothesis testing, significance levels, test statistics, rejection regions, and one-tailed and two-tailed tests. It provides sample problems and questions for students to practice these statistical techniques, including one asking students to interpret the conclusions that can be drawn from a given hypothesis test at the 0.01 level of significance.
The importance of exploring the effect of individual behaviour change techniq... (Tracy Epton)
A recent taxonomy has identified 93 different behaviour change techniques (BCTs). However, the taxonomy does not indicate which of these BCTs are effective. Moreover, as many behaviour change interventions include multiple BCTs, it is difficult to determine which BCT is the 'active ingredient' in successful interventions. One way of identifying effective BCTs is to locate all studies that compare an intervention that uses a chosen BCT with a comparison condition that is identical apart from the BCT of interest. Using meta-analysis it is possible to (a) quantify the effect of individual BCTs, (b) identify the circumstances under which the BCT is most effective, and (c) identify for whom the BCT is most effective. In addition to these practical issues, this method also allows theory relating to the BCT to be tested and highlights gaps in the literature. The example of the BCT of goal setting will be used to illustrate this process.
This document outlines a blended learning plan for a Grade 11 Math 2 class covering statistics and probability topics related to the normal distribution over one week. The plan includes anticipatory sets, instructional inputs, modeling examples, checking for understanding activities, and independent practice to illustrate a normal random variable and its characteristics, identify regions under the normal curve corresponding to different standard normal values, and convert between normal and standard normal variables. Daily activities include both synchronous in-person and asynchronous online components.
1. Experimental design refers to how experiments are structured in order to ensure validity and reliability of results.
2. There are several types of experimental designs including true experimental, quasi-experimental, pre-experimental, ex post facto, and factorial designs.
3. True experimental designs use random assignment and control/experimental groups to establish causation. Quasi-experimental designs lack random assignment so can only suggest relationships between variables. Ex post facto designs study pre-existing groups and cannot prove causation. Factorial designs study effects of multiple independent variables.
About testing the hypothesis of equality of two Bernoulli (Alexander Decker)
This document discusses testing the hypothesis of equality between two Bernoulli regression curves based on two independent samples. It presents a criterion for testing both simple and composite hypotheses about the equality of two regression functions. It establishes the limiting distribution of a statistic measuring the integral square deviation between two kernel-type estimators of the regression functions. It also investigates the consistency and asymptotic power of the test against some close alternatives.
The document is a lecture on partial fraction decomposition. It begins by defining partial fraction decomposition as expressing a rational function as a sum of simpler fractions. It then provides examples of decomposing fractions with non-repeated linear factors, a repeated linear factor, and a fraction with a quadratic factor. The examples show setting up the partial fraction decomposition equation and solving for the coefficients by considering the zeros of the factors.
These slide discuss the extending of the concept of correlation and show it can be used in prediction. The statistical test used is called regression. This is the process of using one variable to predict another when the two are correlated.
The document discusses identifying covariates when examining the effect of independent variables on dependent variables. It provides two examples: examining the effect of gender on handwriting scores while controlling for age, and examining the effect of listening to country music on truck driver drowsiness while controlling for years of trucking experience. A covariate is a variable that can affect the dependent variable, so it must be controlled for to isolate the effect of the independent variable.
The document discusses how covariates, or additional factors, can be used to better understand the effect of an independent variable on a dependent variable. It provides an example where handwriting neatness scores are analyzed by gender, but then age is added as a covariate to see if it influences the relationship between gender and scores. Controlling for the covariate of age through statistical methods allows researchers to determine if gender truly has an independent effect on scores or if age is a confounding factor.
Chi square test of independence (conceptual) (CTLTLA)
This document discusses using the chi square test of independence to determine if two variables are independent of each other, such as whether college admissions decisions are made independently of an applicant's majority/minority status. Questions of independence are questions about bias or relationships between variables. If admissions are independent of status, the proportions of admitted majority and minority students will mirror the proportions in the local population; failure to be independent would indicate bias in admissions.
The document discusses central tendency and skewness. In Demo #1, it explains that the median is the best measure of central tendency for a positively skewed distribution because it is not influenced by outliers. In Demo #2, it states the mode is best for a multimodal distribution because it indicates the most frequent values. Demo #3 explains that if the mean is lower than the median, the distribution is negatively skewed.
Null hypothesis for Pearson correlation (conceptual) (CTLTLA)
The document discusses the null hypothesis for a Pearson correlation. It explains that the null hypothesis states that there is no statistically significant relationship between the independent and dependent variables. It provides examples of writing the null hypothesis for different problems, such as determining the relationship between student ACT scores and GPAs, and between depression scores and a sense of belonging. The null hypothesis would state that there is no significant relationship between the two variables in each case.
Null hypothesis for point biserial (conceptual) (CTLTLA)
The document discusses null hypotheses for point-biserial correlations. It states that with hypothesis testing, a null hypothesis of no effect or relationship is set up. A point-biserial correlation can statistically test the relationship between a dichotomous variable and a continuous variable. It provides a template for writing a null hypothesis as "There is no statistically significant relationship between [variable 1] and [variable 2]." Two examples of null hypotheses are given for relationships between height and college graduation rates, and head circumference and political affiliation.
The document discusses the Pearson Product Moment Correlation, which measures the strength and direction of the linear relationship between two continuous variables. A correlation of +1 means a perfect positive relationship, -1 means a perfect negative relationship, and 0 means no relationship. The document provides examples of how different correlation values would appear when the variables are rank ordered from highest to lowest values.
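A minimal sketch of the Pearson r computation, using hypothetical data that illustrate the +1 and -1 extremes described above:

```python
from math import sqrt

# Hypothetical pairs of continuous variables.
x = [1, 2, 3, 4, 5]
y = [2, 4, 6, 8, 10]       # perfect positive linear relationship
y_neg = [10, 8, 6, 4, 2]   # perfect negative linear relationship

def pearson_r(a, b):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b))
    sa = sqrt(sum((ai - ma) ** 2 for ai in a))
    sb = sqrt(sum((bi - mb) ** 2 for bi in b))
    return cov / (sa * sb)

print(pearson_r(x, y))      # 1.0
print(pearson_r(x, y_neg))  # -1.0
```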
This document describes how to perform a one-sample z-test to compare a sample proportion to a population proportion. It provides an example where a survey claims 90% of doctors recommend aspirin, and a sample of 100 doctors found 82% recommend aspirin. It outlines calculating the z-statistic to determine if this difference is statistically significant using a 95% confidence level. The z-statistic is calculated to be -1.08, which falls within the acceptable range so the null hypothesis that the population and sample proportions are the same is retained.
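A generic sketch of the one-sample proportion z-test described above; the proportions and sample size below are hypothetical rather than the survey's figures:

```python
from math import sqrt

# Hypothetical numbers: claimed population proportion 0.50,
# observed sample proportion 0.60 in a sample of 200.
p0, p_hat, n = 0.50, 0.60, 200

# Standard error of the proportion under the null hypothesis.
se = sqrt(p0 * (1 - p0) / n)
z = (p_hat - p0) / se
print(round(z, 2))  # ~2.83

# Two-tailed decision at the 95% confidence level.
reject = abs(z) > 1.96
print(reject)  # True -> the difference is statistically significant
```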
Philippine Edukasyong Pantahanan at Pangkabuhayan (EPP) Curriculum (MJDuyan)
(EPP TLE) (Lesson 1) - Prelims
Discuss the EPP Curriculum in the Philippines:
- Understand the goals and objectives of the Edukasyong Pantahanan at Pangkabuhayan (EPP) curriculum, recognizing its importance in fostering practical life skills and values among students. Students will also be able to identify the key components and subjects covered, such as agriculture, home economics, industrial arts, and information and communication technology.
Explain the Nature and Scope of an Entrepreneur:
-Define entrepreneurship, distinguishing it from general business activities by emphasizing its focus on innovation, risk-taking, and value creation. Students will describe the characteristics and traits of successful entrepreneurs, including their roles and responsibilities, and discuss the broader economic and social impacts of entrepreneurial activities on both local and global scales.
ISO/IEC 27001, ISO/IEC 42001, and GDPR: Best Practices for Implementation and... (PECB)
Denis is a dynamic and results-driven Chief Information Officer (CIO) with a distinguished career spanning information systems analysis and technical project management. With a proven track record of spearheading the design and delivery of cutting-edge Information Management solutions, he has consistently elevated business operations, streamlined reporting functions, and maximized process efficiency.
Certified as an ISO/IEC 27001: Information Security Management Systems (ISMS) Lead Implementer, Data Protection Officer, and Cyber Risks Analyst, Denis brings a heightened focus on data security, privacy, and cyber resilience to every endeavor.
His expertise extends across a diverse spectrum of reporting, database, and web development applications, underpinned by an exceptional grasp of data storage and virtualization technologies. His proficiency in application testing, database administration, and data cleansing ensures seamless execution of complex projects.
What sets Denis apart is his comprehensive understanding of Business and Systems Analysis technologies, honed through involvement in all phases of the Software Development Lifecycle (SDLC). From meticulous requirements gathering to precise analysis, innovative design, rigorous development, thorough testing, and successful implementation, he has consistently delivered exceptional results.
Throughout his career, he has taken on multifaceted roles, from leading technical project management teams to owning solutions that drive operational excellence. His conscientious and proactive approach is unwavering, whether he is working independently or collaboratively within a team. His ability to connect with colleagues on a personal level underscores his commitment to fostering a harmonious and productive workplace environment.
Date: May 29, 2024
Tags: Information Security, ISO/IEC 27001, ISO/IEC 42001, Artificial Intelligence, GDPR
Walmart Business+ and Spark Good for Nonprofits.pdf (TechSoup)
Learn about all the ways Walmart supports nonprofit organizations.
You will hear from Liz Willett, the Head of Nonprofits, about what Walmart is doing to help nonprofits, including Walmart Business and Spark Good. Walmart Business+ is a new offer for nonprofits that offers discounts and also streamlines nonprofits' order and expense tracking, saving time and money.
The webinar may also give some examples of how nonprofits can best leverage Walmart Business+.
The event will cover the following:
Walmart Business+ (https://business.walmart.com/plus) is a new shopping experience for nonprofits, schools, and local business customers that connects an exclusive online shopping experience to stores. Benefits include free delivery and shipping, a 'Spend Analytics' feature, special discounts, deals, and tax-exempt shopping.
A special TechSoup offer for a free 180-day membership, and up to $150 in discounts on eligible orders.
Spark Good (walmart.com/sparkgood) is a charitable platform that enables nonprofits to receive donations directly from customers and associates.
Answers about how you can do more with Walmart!
How to Make a Field Mandatory in Odoo 17 (Celine George)
In Odoo, making a field required can be done through both Python code and XML views. When you set the required attribute to True in Python code, it makes the field required across all views where it's used. Conversely, when you set the required attribute in XML views, it makes the field required only in the context of that particular view.
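A sketch of the two approaches the summary describes, assuming a standard Odoo 17 model; the inherited model and the `x_nickname` field are hypothetical, and the snippet is not runnable outside an Odoo installation:

```python
# Hypothetical example based on the summary above (assumes Odoo 17).
from odoo import fields, models

class ResPartner(models.Model):
    _inherit = "res.partner"

    # required=True in Python makes the field mandatory in ALL views
    # where it appears.
    x_nickname = fields.Char(string="Nickname", required=True)
```

To make the field mandatory only in one particular view, set the attribute in that view's XML instead: `<field name="x_nickname" required="1"/>`.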
Gender and Mental Health - Counselling and Family Therapy Applications and In... (PsychoTech Services)
A proprietary approach developed by bringing together the best of learning theories from Psychology, design principles from the world of visualization, and pedagogical methods from over a decade of training experience, that enables you to: Learn better, faster!
3. Question of Goodness of Fit
Questions of goodness of fit have become
increasingly important in modern statistics.
4. Question of Goodness of Fit
Questions of goodness of fit juxtapose complex
observed patterns against hypothesized or
previously observed patterns to test overall and
specific differences among them.
9. If the difference is small then the FIT IS GOOD.
For example:
Observed       Hypothesized   Difference
51% Females    50% Females    1%
14. If the difference is BIG then the FIT IS NOT GOOD.
For example:
Observed       Hypothesized   Difference
50% Females    22% Females    28%
16. Here is an example:
We want to know if a sample we have selected
matches the national percentage of a certain
ethnic group.
Observed: 2% of the sample is made up of members of this ethnic group.
Hypothesized: 10% of the population is made up of this ethnic group.
Difference: 8%
18. You will use certain statistical methods
to determine if the goodness of fit is
significant or not.
Here is an example:
Problem – The chair of a statistics department
suspects that some of her faculty are more
popular with students than others.
21. There are three sections of introductory stats
that are taught at the same time in the morning
by Professors Cauforek, Kerr, and Rector.
66 students are planning on enrolling in one of
the three classes.
23. What would you expect the number of enrollees
to be in each class if popularity were not an
issue?
Professor Cauforek   Professor Kerr   Professor Rector
22                   22               22
These are our expected values.
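Under the null hypothesis of equal popularity, each class expects an equal share of the 66 students. A minimal sketch of that expectation (variable names are my own):

```python
# Under the null hypothesis, enrollment splits evenly across the sections.
total_students = 66
num_classes = 3

expected_per_class = total_students / num_classes
print(expected_per_class)  # 22.0
```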
27. Now let's see what was observed.
The number who enrolled in each class was:
Professor Cauforek   Professor Kerr   Professor Rector
31                   25               10
32. We will test the degree to which the observed
data (31, 25, 10) fits the expected
enrollments (22, 22, 22).
51. Here is the null-hypothesis:
There is no significant difference between the
expected and the observed number of students
enrolled in three stats professorsโ classes.
52. Now we will compute the χ² value and compare
it with the critical χ² value.
• If the value exceeds the critical value, then
we will reject the null hypothesis.
• If the value DOES NOT exceed the critical
value, then we will fail to reject the null
hypothesis.
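The χ² statistic is the sum of (observed − expected)² / expected over the groups. A minimal sketch of that computation for this example:

```python
# Chi-square goodness of fit: sum of (O - E)^2 / E over the groups.
observed = [31, 25, 10]
expected = [22, 22, 22]

chi_sq = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
print(round(chi_sq, 1))  # 10.6
```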
85. As a contrasting example, note what the χ²
value would be if the observed and expected
values were more similar:
Professor Cauforek   Professor Kerr   Professor Rector
Expected   22   22   22
Observed   24   22   20
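Running the same sum of (O − E)² / E on these closer values gives a much smaller statistic (the deck does not state this value; it is computed here):

```python
# Same chi-square computation, but with observed values close to expected.
observed = [24, 22, 20]
expected = [22, 22, 22]

chi_sq = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
print(round(chi_sq, 2))  # 0.36
```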
93. So the moral of the story is that the closer the
expected and observed values are to one
another, the smaller the Chi-square value and the
greater the goodness of fit (as seen below).
Professor Cauforek   Professor Kerr   Professor Rector
Expected   22   22   22
Observed   24   22   20
χ² ≈ 0.4
96. On the other hand, the farther the expected and
observed values are from one another, the
larger the Chi-square value and the poorer the
goodness of fit (as seen below).
Professor Cauforek   Professor Kerr   Professor Rector
Expected   22   22   22
Observed   31   25   10
χ² = 10.6
99. Now we determine if a χ² of 10.6 exceeds the
critical χ².
100. To calculate the critical χ² we first must
determine the degrees of freedom as well as set
the probability level.
The probability or alpha level is the
probability of a Type I error we are willing to live
with (i.e., the probability of being wrong
when we reject the null hypothesis). Generally
this value is 0.05, which is like saying we are
willing to be wrong 5 out of 100 times
before we will reject the null hypothesis.
103. Degrees of freedom are calculated as the
number of groups minus 1.
(Three groups minus 1 = 2)
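For 2 degrees of freedom the χ² distribution has a closed-form CDF, 1 − e^(−x/2), so the critical value at a given alpha is −2·ln(alpha). A small sketch that checks the table lookup without any external library (this shortcut holds only for df = 2):

```python
import math

# For df = 2, the chi-square CDF is 1 - exp(-x/2),
# so the critical value at level alpha is -2 * ln(alpha).
alpha = 0.05
critical = -2 * math.log(alpha)
print(round(critical, 2))  # 5.99
```

This matches the 5.99 entry in the table at df = 2 and alpha = 0.050.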
104. We now have all of the information we need to
determine the critical χ².
We go to the Chi-Square Distribution Table and
locate the degrees of freedom.
And then we locate the probability or alpha level.
Where these two values intersect in the table we
find the critical χ².
df    0.100   0.050   0.025
1     2.71    3.84    5.02
2     4.61    5.99    7.38
3     6.25    7.82    9.35
4     7.78    9.49    11.14
5     9.24    11.07   12.83
6     10.64   12.59   14.45
7     12.02   14.07   16.10
8     13.36   15.51   17.54
9     14.68   16.92   19.20
…     …       …       …
112. Since the chi-square goodness of fit value (10.6)
exceeds the critical χ² (5.99), we will reject the
null hypothesis:
There is no significant difference between the
expected and the observed number of students
enrolled in three stats professors' classes.
There actually is a significant difference.
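The whole test can be sketched end to end. For df = 2 the p-value (the survival function) has the closed form e^(−χ²/2), so no table or external library is needed (that shortcut is an assumption that holds only for df = 2):

```python
import math

observed = [31, 25, 10]
expected = [22, 22, 22]

# Chi-square statistic: sum of (O - E)^2 / E.
chi_sq = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# For df = 2 the chi-square survival function is exp(-x/2).
p_value = math.exp(-chi_sq / 2)

alpha = 0.05
print(round(chi_sq, 1))  # 10.6
print(p_value < alpha)   # True -> reject the null hypothesis
```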
117. In summary,
Questions of goodness of fit juxtapose observed
patterns against hypothesized patterns to test
overall and specific differences among them.
Observed   Hypothesized   Difference