This document contains the final assessment for a Quantitative Methods and Data Analysis module. It includes 5 tasks:
Task 1 defines variables and their properties in an SPSS data view.
Task 2 analyzes consumer characteristics at a retail center using cross-tabulation and bar charts, finding most consumers are married and employed.
Task 3 examines consumer spending based on characteristics, finding outliers in some groups and differences between occupations and appreciation levels.
Task 4 analyzes the distribution of monthly spending and distance traveled; several tests indicate that neither is normally distributed.
Task 5 proposes using a t-test to examine differences in age and gender, stating the null and alternative hypotheses.
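As a rough illustration of the kind of test Task 5 proposes, the sketch below computes an equal-variance two-sample t statistic in plain Python. The summary does not say which t-test variant the assessment uses, so the pooled form is an assumption, and the age values and the function name `pooled_t_statistic` are hypothetical. The null hypothesis here would be that the two groups have the same mean age; the alternative is that the means differ.

```python
from math import sqrt
from statistics import mean, variance


def pooled_t_statistic(group_a, group_b):
    """Return (t, degrees_of_freedom) for an equal-variance two-sample t-test."""
    n_a, n_b = len(group_a), len(group_b)
    # Pooled variance weights each group's sample variance by its degrees of freedom.
    pooled_var = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    standard_error = sqrt(pooled_var * (1 / n_a + 1 / n_b))
    t = (mean(group_a) - mean(group_b)) / standard_error
    return t, n_a + n_b - 2


# Hypothetical ages for two gender groups (illustrative values only).
ages_group_a = [23, 31, 35, 42, 29]
ages_group_b = [27, 38, 44, 51, 40]

t_stat, dof = pooled_t_statistic(ages_group_a, ages_group_b)
print(f"t = {t_stat:.3f}, df = {dof}")
```

The statistic would then be compared with a critical value from the t distribution with `dof` degrees of freedom; that lookup is left out because the standard library has no t distribution.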
B409 W11 Sas Collaborative Stats Guide V4.2 - marshalkalra
This document provides an overview of numerical summaries and variation within data. It defines key terms like mean, median, mode, range, standard deviation, and variance. It also discusses sources of variation within data like process inputs and conditions versus random temporary events. The document demonstrates how to use SAS software to analyze a cars dataset and create reports and bar charts to describe the data and identify trends and variation.
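The numerical summaries named above can be sketched with Python's standard `statistics` module. This is only an illustration: the guide itself uses SAS, and the values below are hypothetical rather than taken from the cars dataset.

```python
import statistics

# Hypothetical sample values (not from the cars dataset mentioned above).
values = [18, 21, 21, 24, 30, 36]

summary = {
    "mean": statistics.mean(values),
    "median": statistics.median(values),
    "mode": statistics.mode(values),
    "range": max(values) - min(values),
    # Sample variance and standard deviation (divisor n - 1).
    "variance": statistics.variance(values),
    "stdev": statistics.stdev(values),
}

for name, value in summary.items():
    print(f"{name}: {value}")
```

`statistics.pvariance` and `statistics.pstdev` would give the population versions (divisor n) instead.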
This document provides an overview of multinomial logistic regression. It discusses how multinomial logistic regression is used when the dependent variable has more than two nominal categories. An example is presented where voting behavior is predicted based on age, gender, economic beliefs, and religious beliefs, with the dependent variable having four categories for different candidates. The document walks through setting up and interpreting the results of a multinomial logistic regression analysis in SPSS for this example. Key results shown include the regression coefficients, odds ratios, goodness of fit statistics, and classification accuracy for each category of the dependent variable.
This document discusses the statistical analysis carried out on survey data to estimate the willingness to pay (WTP) for improved water quality using multilevel modeling (MLM). It describes:
1) Conducting a conventional logistic regression analysis on the single-bound dichotomous choice (SBDC) responses before using MLM to account for the hierarchical structure of the data.
2) Estimating WTP from the double-bound dichotomous choice (DBDC) data using MLM, which models the natural hierarchy in responses nested within individuals.
3) Estimating the incidence of benefits across income groups using the WTP estimates from a linear regression of stated WTP responses. This found WTP generally
Graphical Analysis of Simulated Financial Data Using R - IRJET Journal
This document summarizes research analyzing simulated financial data using the R programming language. The researchers developed tools to calculate basic financial ratios from publicly available company data to help investors determine a company's true market value. They performed DuPont analysis on real data from a London company and ANOVA on a large volume of simulated data. The document describes the analytical techniques used, including the assumptions and methodology for two-way ANOVA testing. The researchers concluded that their study of simulated data and DuPont analysis helped develop an understanding of how financial ratios behave in both efficient and inefficient markets.
This document discusses identifying the determinants of stock price movements. It summarizes previous literature that found stock prices were too volatile to be explained solely by changes in expected future dividends. The authors argue that the data cannot show whether expectations of future dividend growth or of future excess returns are the primary driver of stock price movements. They show that stock prices exhibit long-run persistence, but neither dividend growth nor excess returns exhibit low-frequency movements. As a result, the data cannot distinguish between models where one or the other is the main determinant of stock prices. The relative importance assigned to dividends versus excess returns in explaining stock price volatility depends on the assumptions made about which variable is stationary.
1) The document analyzes the relationship between China's Purchasing Managers Index (PMI) and two U.S. stock market indexes, the S&P 500 and Dow Jones Industrial Average, over two time periods: 2006-2015 and 2010-2015.
2) For the period 2006-2015, there was no significant correlation found between China's PMI and the two U.S. indexes.
3) For the period 2010-2015, there were significant negative correlations found - as China's PMI increased, the two U.S. indexes tended to decline, and vice versa. The analysis estimates specific impacts of a one-point change in China's PMI.
This document provides summaries of statistical analysis techniques covered in chapters 16-21 of a textbook. It summarizes the key steps and purposes of chi square analysis (testing expected vs. actual distributions), one-way ANOVA (comparing means across groups), regression analysis (determining relationships between dependent and independent variables), discriminant analysis (differentiating between groups based on parameters), factor analysis (reducing correlated variables into factors), and cluster analysis (forming homogeneous groups of objects based on interdependence). The summaries highlight when each technique is used, key assumptions, and how to interpret results.
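To make the "expected vs. actual distributions" idea behind chi-square analysis concrete, here is a minimal sketch of the goodness-of-fit statistic, the sum of (observed - expected)^2 / expected over the categories. The counts and the function name `chi_square_statistic` are hypothetical and do not come from the textbook chapters.

```python
def chi_square_statistic(observed, expected):
    """Sum of (O - E)^2 / E over all categories."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical counts: 60 respondents spread over three categories,
# compared against an even split.
observed_counts = [18, 22, 20]
expected_counts = [20, 20, 20]

chi_sq = chi_square_statistic(observed_counts, expected_counts)
# Degrees of freedom for a goodness-of-fit test with no estimated parameters.
dof = len(observed_counts) - 1
print(f"chi-square = {chi_sq:.2f}, df = {dof}")
```

The statistic is then compared with a chi-square critical value for `dof` degrees of freedom at the chosen significance level.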
The document discusses cluster analysis techniques for market segmentation. Cluster analysis groups similar objects together to identify patterns in data. Hierarchical and non-hierarchical clustering procedures are described. As an example, a paint company conducted a survey with statements to segment audiences. Responses were analyzed using hierarchical clustering to determine the optimal number of clusters, followed by k-means clustering to classify respondents. This resulted in 4 clusters with different characteristics that provide insights into the target audiences.
This document summarizes an analysis of economic and salary data for Iowa, with a focus on Dubuque. It uses statistical analysis and data visualization tools to:
1) Examine trends in median salaries over time for different occupations and regions in Iowa, finding some occupations and areas have had declining salaries recently.
2) Categorize occupations into professional, manual labor, and personal services to view high-level trends and compare median incomes between categories.
3) Analyze relationships between different parts of the salary distribution to understand how salaries may be changing for high- and low-income workers within occupations.
4) Create interactive visualizations to explore changes in employment levels and salaries for various occupations over multiple
This document provides an overview of key concepts for describing numerical data, including measures of central tendency (such as the mean, median, mode, weighted mean, and geometric mean) and measures of dispersion (such as the range, mean deviation, variance and standard deviation). It defines each measure and provides examples to demonstrate how to calculate and interpret the measures. The learning objectives cover explaining the concept of central tendency, identifying and computing various measures of central tendency and dispersion, and applying the measures to analyze datasets.
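Three of the less common measures listed above (weighted mean, geometric mean, and mean deviation) can be sketched as follows. The helper names and all data values are hypothetical; `statistics.geometric_mean` requires Python 3.8 or later.

```python
import statistics


def weighted_mean(values, weights):
    """Sum of value * weight divided by the sum of the weights."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


def mean_deviation(values):
    """Average absolute distance of each value from the arithmetic mean."""
    centre = statistics.fmean(values)
    return sum(abs(v - centre) for v in values) / len(values)


# Hypothetical exam scores where the second score counts three times as much.
course_score = weighted_mean([80, 90], [1, 3])

# Hypothetical growth factors; the geometric mean is their average rate.
average_growth = statistics.geometric_mean([2, 8])

# Hypothetical observations for the mean deviation.
spread = mean_deviation([2, 4, 6, 8])

print(course_score, average_growth, spread)
```

The geometric mean is the natural choice for averaging ratios or growth rates, while the mean deviation is a dispersion measure that, unlike the variance, stays in the original units without squaring.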
This document presents a statistical analysis of survey data from 65 respondents. The analysis includes descriptive statistics on respondent demographics like age, gender, work category, education level, and more. Inferential statistics like t-tests, ANOVA, correlation, and regression are used to compare motivation scores across these demographic groups and examine the relationship between motivation and experience. The results show no significant differences in motivation between groups defined by gender, organization type, marital status, age, or work category. Regression analysis finds that demographic variables together explain only 9% of variance in motivation.
This chapter discusses two-sample hypothesis tests for comparing means and proportions between two independent populations or between paired/dependent samples. It provides examples of hypothesis tests to compare the means of two independent samples using the z-test if populations are normal and sample sizes are large, or the t-test if populations are normal but sample sizes are small. Tests are also shown to compare proportions between two independent populations using the z-test, and to compare means between paired samples using the t-test.
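As one concrete case of the tests described above, the following sketch runs a two-sided z-test for the difference between two independent proportions, using the pooled proportion under the null hypothesis of equal proportions. The counts and the function name are hypothetical, and the normal approximation assumes reasonably large samples.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z_test(successes_1, n_1, successes_2, n_2):
    """Return (z, two_sided_p_value) for H0: p1 == p2."""
    p_1, p_2 = successes_1 / n_1, successes_2 / n_2
    # Under H0 both samples share one proportion, estimated by pooling.
    pooled = (successes_1 + successes_2) / (n_1 + n_2)
    standard_error = sqrt(pooled * (1 - pooled) * (1 / n_1 + 1 / n_2))
    z = (p_1 - p_2) / standard_error
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Hypothetical data: 60 of 100 successes in one sample, 40 of 100 in the other.
z_stat, p_val = two_proportion_z_test(60, 100, 40, 100)
print(f"z = {z_stat:.3f}, p = {p_val:.4f}")
```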
The document provides an overview of multiple regression and logistic regression analyses conducted on gender inequality data. For multiple regression, five factors were examined as predictors of the gender inequality index. The analysis found the factors of maternal mortality ratio, adolescent birth rate, and labor force participation rate to be statistically significant predictors. For logistic regression, employment rate was predicted based on gender, age, country, and year, with the full model accounting for 37.7% of variability in employment rate.
This chapter discusses important discrete probability distributions used in business statistics. It introduces discrete random variables and their probability distributions. It defines the binomial distribution and explains how to calculate probabilities using the binomial formula. Examples are provided to demonstrate calculating the mean, variance, and covariance of discrete random variables, as well as the expected value and risk of investment portfolios. Counting techniques like combinations are also discussed for calculating binomial probabilities.
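The binomial formula mentioned above, P(X = k) = C(n, k) p^k (1 - p)^(n - k), can be sketched in a few lines. The number of trials and success probability below are hypothetical; the mean and variance are computed directly from the distribution so they can be checked against the closed forms np and np(1 - p).

```python
from math import comb


def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)


# Hypothetical experiment: 4 trials, success probability 0.5.
n_trials, p_success = 4, 0.5
distribution = [binomial_pmf(k, n_trials, p_success) for k in range(n_trials + 1)]

# Expected value and variance of a discrete random variable.
mean = sum(k * prob for k, prob in enumerate(distribution))
variance = sum((k - mean) ** 2 * prob for k, prob in enumerate(distribution))

print(distribution)
print(mean, variance)
```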
This document outlines the steps for hypothesis testing, including:
1. Defining the null and alternative hypotheses (H0 and H1). H0 is presumed true while H1 has the burden of proof.
2. Conducting a 5-step hypothesis testing procedure: state hypotheses, select significance level, select test statistic, formulate decision rule, make decision and interpret.
3. Distinguishing between one-tailed and two-tailed tests. Keywords in the problem statement determine if it is left-tailed, right-tailed, or two-tailed.
4. Examples are provided for testing hypotheses about population means when the population standard deviation is known or unknown, and for testing hypotheses about
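The five-step procedure listed above can be sketched for the simplest case, a two-tailed test of a population mean with known population standard deviation. All numbers and the function name are hypothetical; the comments map each line to a step.

```python
from math import sqrt
from statistics import NormalDist


def one_sample_z_test(sample_mean, mu_0, sigma, n, alpha=0.05):
    """Two-tailed z-test of H0: mu == mu_0 against H1: mu != mu_0."""
    # Steps 1-2: hypotheses are fixed by mu_0; alpha is the significance level.
    # Step 3: test statistic for a known population standard deviation.
    z = (sample_mean - mu_0) / (sigma / sqrt(n))
    # Step 4: decision rule, reject H0 if |z| exceeds the critical value.
    critical_value = NormalDist().inv_cdf(1 - alpha / 2)
    # Step 5: make the decision.
    reject_h0 = abs(z) > critical_value
    return z, critical_value, reject_h0


# Hypothetical sample: mean 52 from n = 100, testing mu_0 = 50 with sigma = 10.
z_stat, critical, reject = one_sample_z_test(52, 50, 10, 100)
print(f"z = {z_stat:.2f}, critical = {critical:.3f}, reject H0: {reject}")
```

For a one-tailed test the critical value would use `1 - alpha` instead of `1 - alpha / 2`, and the comparison would look at only one side.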
This document provides information on statistics and probability sampling methods. It defines statistics as the science of collecting, organizing, summarizing, analyzing, and interpreting data. It describes the four main components of statistics as data collection, presentation, analysis, and interpretation. It also lists seven key characteristics of statistics. The document then discusses probability concepts like probability, experiments, outcomes, and definitions. It provides an example to calculate probabilities. Finally, it describes various probability sampling methods like simple random sampling, stratified random sampling, systematic sampling, cluster sampling, and multi-stage sampling as well as non-probability sampling methods like judgment sampling, convenience sampling, and quota sampling.
This document discusses the normal distribution and other continuous probability distributions. It begins by listing the learning objectives, which are to compute probabilities from the normal, uniform, exponential, and binomial distributions. It then defines continuous random variables and describes key properties of the normal distribution, including its bell shape, equal mean, median and mode, and symmetry. Several examples are provided to illustrate how to compute probabilities using the normal distribution and standardized normal table. The empirical rules for the normal distribution are also discussed.
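Normal probabilities of the kind described above are usually read from a standardized normal table; the sketch below computes the same quantities with `statistics.NormalDist` (Python 3.8 or later). The mean of 100 and standard deviation of 15 are hypothetical values chosen for illustration.

```python
from statistics import NormalDist

# Hypothetical population: mean 100, standard deviation 15.
scores = NormalDist(mu=100, sigma=15)

# P(X <= 115): cumulative probability one standard deviation above the mean.
p_below_115 = scores.cdf(115)

# P(85 <= X <= 130) as a difference of two cumulative probabilities.
p_between = scores.cdf(130) - scores.cdf(85)

# Empirical rule: share of a normal distribution within 1, 2 and 3
# standard deviations of the mean.
standard = NormalDist()
empirical_rule = {k: standard.cdf(k) - standard.cdf(-k) for k in (1, 2, 3)}

print(round(p_below_115, 4), round(p_between, 4))
print({k: round(v, 4) for k, v in empirical_rule.items()})
```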
Case Study on Placement Solution to AS Business School (Biswadeep Ghosh Hazra... - Biswadeep Ghosh Hazra
We had to make a relevant placement strategy for a B-school taking into account a multitude of factors. The data set includes secondary and higher secondary school percentages and specialisations of unique students from the past. It also includes their UG specialisation, work experience and the salary offered to the placed students.
The strategy was arrived at by performing relevant data analysis on the given dataset.
Team name and details: "Art of War", comprising myself, Devark Chauhan and Shivam Arora.
This document introduces key concepts in statistics. It discusses descriptive statistics, which organizes and summarizes data, and inferential statistics, which makes estimates about populations based on samples. Variables can be qualitative, involving categories, or quantitative, involving numbers. Quantitative variables can be discrete, with separate values, or continuous, able to assume any value. Variables are also classified by their level of measurement - nominal involves categories, ordinal involves ranking, interval allows comparing differences, and ratio has a true zero point. Statistics is used across many fields to help make effective decisions based on numerical data.
This document presents a simultaneous equation system analyzing the labor market. It acknowledges that some economic variables are jointly determined rather than having a strictly unidirectional relationship. The system includes two equations: a labor supply equation relating hours to average wage and other factors, and a labor demand equation relating quantity demanded to average wage and factor costs. These equations represent the behavior of workers and employers in aggregate and are solved in equilibrium when quantity supplied equals quantity demanded. Estimating either equation via OLS would be inconsistent since the wage is correlated with the error term. The system can be solved into reduced form equations showing that outcomes depend on exogenous variables and structural errors. Separate explanatory factors are needed in each equation to allow unique identification of parameters.
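The reduced-form step described above can be illustrated with a small worked derivation. The notation is an assumption, since the summary gives no symbols: h is hours, w the average wage, z_1 and z_2 the exogenous supply and demand shifters, and u_1 and u_2 the structural errors.

```latex
\[
\begin{aligned}
\text{Supply:}\quad h &= \alpha_1 w + \beta_1 z_1 + u_1 \\
\text{Demand:}\quad h &= \alpha_2 w + \beta_2 z_2 + u_2 \\
\text{Equilibrium:}\quad (\alpha_1 - \alpha_2)\, w &= \beta_2 z_2 - \beta_1 z_1 + u_2 - u_1 \\
\text{Reduced form:}\quad w &= \frac{\beta_2 z_2 - \beta_1 z_1 + u_2 - u_1}{\alpha_1 - \alpha_2},
\qquad \alpha_1 \neq \alpha_2
\end{aligned}
\]
```

Because the reduced-form wage contains both u_1 and u_2, w is correlated with the error in each structural equation, which is why OLS on either equation is inconsistent. The supply equation can be identified only if z_2 is excluded from it, and the demand equation only if z_1 is excluded from it.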
This document provides an analysis of Procter & Gamble (P&G) and Coca-Cola's entry into the Russian market following the collapse of the Soviet Union in the early 1990s. It discusses the benefits and risks of the Russian market at that time. Both P&G and Coca-Cola recognized the large potential of the Russian market but also faced significant political and economic risks and a lack of infrastructure. P&G established a joint venture and distribution network to overcome risks, while Coca-Cola used foreign direct investment and alliances with local companies to gain government support and negotiate tax exemptions as it worked to establish production and distribution. The document examines the strategies and structures used by each company to successfully enter
This dissertation explores the potential for mass customization in the online fashion industry from both producer and consumer perspectives. It analyzes the advantages of technology, threats from fast fashion, and barriers to consumer adoption of customization. Through a literature review and empirical study including an industry expert interview and consumer survey, the dissertation aims to determine consumers' willingness to customize and buy clothes online and the conditions for a successful customization strategy. Key findings include an assessment of opportunities and risks for startups pursuing customization and a new consumer-focused framework for customization as a business strategy.
The document is a 3,000-word assessment of Nespresso's brand identity. It analyzes Nespresso using Keller's Customer-Based Brand Equity (CBBE) model, discussing each element: (1) Salience - Nespresso has strong global presence and awareness, especially among high-income consumers; (2) Performance - It offers superior quality coffee machines, capsules and service; (3) Imagery - The brand portrays an upmarket, sophisticated lifestyle; (4) Judgments - It is seen as reliable and innovative; (5) Feelings - The brand evokes prestige and satisfaction; (6) Resonance - Members feel a sense of belonging to an exclusive club. Overall
The document discusses the benefits of exercise for mental health. Regular physical activity can help reduce anxiety and depression and improve mood and cognitive function. Exercise causes chemical changes in the brain that may help protect against mental illness and improve symptoms for those who already suffer from conditions like depression and anxiety.
The document discusses the benefits of exercise for mental health. Regular physical activity can help reduce anxiety and depression and improve mood and cognitive functioning. Exercise causes chemical changes in the brain that may help boost feelings of calmness, happiness and focus.
Mohammed Ziyad T. is seeking a suitable position in sales and marketing to utilize his 14 years of experience in those fields in the UAE. He has held several roles with increasing responsibility, including national sales manager, area sales manager, and sales supervisor. He has experience in market research, sales strategy, budgeting, and people management. He has a post-graduate degree in business administration with a specialization in marketing and sales.
GEOSEIS provides surveying, permitting, inspection, and GIS analysis services to the oil and gas industry. It offers seismic surveying, construction surveying, seismic permitting, and GIS analysis including spatial analysis techniques. Based in Cluj-Napoca, Romania, GEOSEIS works on projects around the world using up-to-date technology and experienced staff.
RH Summit 2015 - Using RH Management Tools In A Hybrid Cloud - Matthew Mariani
Using Red Hat systems management tools, organizations can manage hybrid cloud environments with both on-premise and public cloud resources. With Red Hat Satellite, organizations can extend configuration management begun on-premise to public cloud instances. Red Hat CloudForms provides additional capabilities for hybrid cloud management including governance, orchestration, and metering across platforms. Leveraging existing RHEL subscriptions via Red Hat Cloud Access and using Red Hat Certified Cloud Service Providers allows for consistent management between on-premise and public cloud assets.
The document discusses the benefits of exercise for mental health. Regular physical activity can help reduce anxiety and depression and improve mood and cognitive function. Exercise causes chemical changes in the brain that may help protect against mental illness and improve symptoms for those who already suffer from conditions like depression and anxiety.
Mohammed Ziyad T. is seeking a suitable position in sales and marketing to utilize his 14 years of experience in those fields in the UAE. He has held several roles with increasing responsibility, including national sales manager, area sales manager, and sales supervisor. He has experience in market research, sales strategy, budgeting, and people management. He has a post-graduate degree in business administration with a specialization in marketing and sales.
GEOSEIS provides surveying, permitting, inspection, and GIS analysis services to the oil and gas industry. It offers seismic surveying, construction surveying, seismic permitting, and GIS analysis including spatial analysis techniques. Based in Cluj-Napoca, Romania, GEOSEIS works on projects around the world using up-to-date technology and experienced staff.
RH Summit 2015 - Using RH Management Tools In A Hybrid Cloud Matthew Mariani
Using Red Hat systems management tools, organizations can manage hybrid cloud environments with both on-premise and public cloud resources. With Red Hat Satellite, organizations can extend configuration management begun on-premise to public cloud instances. Red Hat CloudForms provides additional capabilities for hybrid cloud management including governance, orchestration, and metering across platforms. Leveraging existing RHEL subscriptions via Red Hat Cloud Access and using Red Hat Certified Cloud Service Providers allows for consistent management between on-premise and public cloud assets.
Opening of itSMF Ukraine:
- On the Forum as a form of interaction among professionals, Roman Kolos
- The Forum in Russia, Ilya Khaet / Ilya Savichev
- From words to action, Ilya Savichev
- Announcement of the FORUM's events for 2014, Vasily Vladimirov
Awakening consciousness is a process that arises naturally for some people, while others need an external motivation.
Generally, at the end of the year people tend to be more reflective about what they did over the year and more inclined to rethink their dreams, plans and goals for the new year.
With that in mind, we were inspired to create the ebook "Desperte para 2016". It is our end-of-year gift to you, our readers, who have followed us so closely and brought positive experiences to our lives as authors.
This ebook is the fruit of everything we lived through this year, which for us was one of intense emotions, transformations and experiences, along with thoughts and tips so that you live life ever more intensely in 2016, awakening to life.
Share it with family, friends, and anyone else you like and believe will be important to have by your side in 2016, making the most of and enjoying the best that life can offer
Thesis Writing Tips for College Students jayjames12
This presentation addresses the questions that commonly trouble students about thesis writing. It offers tips to help you identify areas where improvements can be made.
http://www.papersonthedot.co.uk/dissertation/
http://www.papersonthedot.co.uk/thesis/
BUS308 – Week 1 Lecture 2 Describing Data Expected Out.docx curwenmichaela
BUS308 – Week 1 Lecture 2
Describing Data
Expected Outcomes
After reading this lecture, the student should be familiar with:
1. Basic descriptive statistics for data location
2. Basic descriptive statistics for data consistency
3. Basic descriptive statistics for data position
4. Basic approaches for describing likelihood
5. Difference between descriptive and inferential statistics
What this lecture covers
This lecture focuses on describing data and how these descriptions can be used in an
analysis. It also introduces and defines some specific descriptive statistical tools and results.
Even if we never become a data detective or do statistical tests, we will be exposed to and
bombarded with statistics and statistical outcomes. We need to understand what they are telling
us and how they help uncover what the data mean for the “crime,” AKA the research question/issue.
How we obtain these results will be covered in lecture 1-3.
Detecting
In our favorite detective shows, starting out always seems difficult. They have a crime,
but no real clues or suspects, no idea of what happened, no “theory of the crime,” etc. Much as
we are at this point with our question on equal pay for equal work.
The process followed is remarkably similar across the different shows. First, a case or
situation presents itself. The heroes start by understanding the background of the situation and
those involved. They move on to collecting clues and following hints, some of which do not pan
out to be helpful. They then start to build relationships between and among clues and facts,
tossing out ideas that seemed good but lead to dead-ends or non-helpful insights (false leads,
etc.). Finally, a conclusion is reached and the initial question of “who done it” is solved.
Data analysis, and specifically statistical analysis, is done quite the same way as we will
see.
Descriptive Statistics
Week 1 Clues
We are interested in whether or not males and females are paid the same for doing equal
work. So, how do we go about answering this question? The “victim” in this question could be
considered the difference in pay between males and females, specifically when they are doing
equal work. An initial examination (Doc, was it murder or an accident?) involves obtaining
basic information to see if we even have cause to worry.
The first action in any analysis involves collecting the data. This generally involves
conducting a random sample from the population of employees so that we have a manageable
data set to operate from. In this case, our sample, presented in Lecture 1, gave us 25 males and
25 females spread throughout the company. A quick look at the sample by HR provided us with
assurance that the group looked representative of the company workforce we are concerned with
as a whole. Now we can confidently collect clues to see if we should be concerned or not.
As with any detective, the first issue is to understand the.
Between Black and White Population 1. Comparing annual percent .docx jasoninnes20
Between Black and White Population
1. Comparing annual percent of Medicare enrollees having at least one ambulatory visit between B and W
2. Comparing average annual percent of diabetic Medicare enrollees age 65-75 having hemoglobin A1c between B and W
3. Comparing average annual percent of diabetic Medicare enrollees age 65-75 having eye examination between B and W
4. Comparing average annual percent of diabetic Medicare enrollees age 65-75 having
Students will develop an analysis report, in five main sections, including introduction, research method (research questions/objective, data set, research method, and analysis), results, conclusion and health policy recommendations. This is a 5-6 page individual project report.
Here are the main steps for this assignment.
Step 1: Students are required to submit the topic using the topic selection discussion forum by the end of week 1 and wait for instructor approval.
Step 2: Develop the research question and
Step 3: Run the analysis using EXCEL (RStudio for BONUS points) and report the findings using the assignment instruction.
The Report Structure:
Start with the
1.Cover page (1 page, including running head).
Please look at the example http://www.apastyle.org/manual/related/sample-experiment-paper-1.pdf (you can download the file from the class) and http://www.umuc.edu/library/libhow/apa_tutorial.cfm to learn more about the APA style.
In the title page include:
· Title, this is the approved topic by your instructor.
· Student name
· Class name
· Instructor name
· Date
2.Introduction
Introduce the problem or topic being investigated. Include relevant background information, for example:
· Indicate why this is an issue or topic worth researching;
· Highlight how others have researched this topic or issue (whether quantitatively or qualitatively), and
· Specify how others have operationalized this concept and measured these phenomena
Note: Introduction should not be more than one or two paragraphs.
Literature Review
There is no need for a literature review in this assignment
3.Research Question or Research Hypothesis
What is the Research Question or Research Hypothesis?
***Just in time information: Here are a few points for Research Question or Research Hypothesis
There are basically two kinds of research questions: testable and non-testable. Neither is better than the other, and both have a place in applied research.
Examples of non-testable questions are:
How do managers feel about the reorganization?
What do residents feel are the most important problems facing the community?
Respondents' answers to these questions could be summarized in descriptive tables and the results might be extremely valuable to administrators and planners. Business and social science researchers often ask non-testable research questions. The shortcoming with these types of questions is that they do not provide objective cut-off points for decision-makers.
In order to overcome this problem, researchers often seek to answer o ...
Statistical Processes
Can descriptive statistical processes be used in determining relationships, differences, or effects in your research question and testable null hypothesis? Why or why not? Also, address the value of descriptive statistics for the forensic psychology research problem that you have identified for your course project. Read an article for additional information on descriptive statistics and pictorial data presentations.
300 words; follow APA rules for attributing sources.
Computing Descriptive Statistics
Computing Descriptive Statistics: “Ever Wonder What Secrets They Hold?” The Mean, Mode, Median, Variability, and Standard Deviation
Introduction
Before gaining an appreciation for the value of descriptive statistics in behavioral science environments, one must first become familiar with the type of measurement data these statistical processes use. Knowing the types of measurement data will aid the decision maker in making sure that the chosen statistical method will, indeed, produce the results needed and expected. Using the wrong type of measurement data with a selected statistical tool will lead to erroneous results and ineffective decision making.
Measurement, or numerical, data is divided into four types: nominal, ordinal, interval, and ratio. The businessperson, because of administering questionnaires, taking polls, conducting surveys, administering tests, and counting events, products, and a host of other numerical data instrumentations, garners all the numerical values associated with these four types.
Nominal Data
Nominal data is the simplest of all four forms of numerical data. The mathematical values are assigned to that which is being assessed simply by arbitrarily assigning numerical values to a characteristic, event, occasion, or phenomenon. For example, a human resources (HR) manager wishes to determine the differences in leadership styles between managers who are at different geographical regions. To compute the differences, the HR manager might assign the following values: 1 = West, 2 = Midwest, 3 = North, and so on. The numerical values are not descriptive of anything other than the location and are not indicative of quantity.
Ordinal Data
In terms of ordinal data, the variables contained within the measurement instrument are ranked in order of importance. For example, a product-marketing specialist might be interested in how a consumer group would respond to a new product. To garner the information, the questionnaire administered to a group of consumers would include questions scaled as follows: 1 = Not Likely, 2 = Somewhat Likely, 3 = Likely, 4 = More Than Likely, and 5 = Most Likely. This creates a scale rank order from Not Likely to Most Likely with respect to acceptance of the new consumer product.
Interval Data
Oftentimes, in addition to being ordered, the differences (or intervals) between two adjacent measurement values on a measurement scale are identical. For example, the di ...
The document analyzes data from the 2009 ISSP survey on social inequality in Switzerland to examine factors influencing income levels. A structural equation model is used with income as the dependent variable, and factors like parents' jobs, education levels, and gender as predictors. The model finds the predictors have little significant effect on income. Most fit indexes show the model is not a good match for the data. The hypotheses and relationships between variables are rejected due to lack of evidence.
This document discusses measurement scales and scaling techniques used in marketing research. It describes four primary scales of measurement: nominal, ordinal, interval, and ratio scales. Nominal scales involve classifying objects into categories while ordinal scales involve ranking objects. Interval and ratio scales are metric and allow comparing distances. The document also explains various scaling techniques used in research like paired comparisons, ranking, constant sum, Q-sort, rating scales like Likert and semantic differential. It concludes with a brief discussion of reliability and validity in measurement.
Non-wage income is a big component of total income in America, yet is almost never analyzed in terms of inequality and discrimination. Here we use the Tobit method to determine the likelihood of a person earning Non-Wage income.
This document discusses descriptive statistics and exploratory data analysis. It defines descriptive statistics as procedures for summarizing quantitative data in a clear way, while exploratory data analysis involves examining data to understand its characteristics. The document outlines common descriptive statistics like the mean, median, mode, standard deviation, and frequency distributions. It also discusses examining distributions, central tendency, dispersion, and using SPSS to calculate descriptive statistics.
A teacher calculated the standard deviation of test scores to see how close students scored to the mean grade of 65%. She found the standard deviation was high, indicating outliers pulled the mean down. An employer also calculated standard deviation to analyze salary fairness, finding it slightly high due to long-time employees making more. Standard deviation measures dispersion from the mean, with low values showing close grouping and high values showing a wider spread. It is calculated as the square root of the variance, where the variance is the sum of the squared differences from the mean divided by the number of values.
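As a minimal sketch of the calculation described above, the following Python snippet computes a population standard deviation by hand and checks it against the standard library. The scores are hypothetical values chosen so that the mean is 65; they are not taken from the summarised document.

```python
import math
import statistics

# Hypothetical test scores (percent); an assumption for illustration only.
scores = [40, 55, 65, 70, 95]


def population_std(values):
    """Square root of the mean squared deviation from the mean."""
    mean = sum(values) / len(values)
    variance = sum((v - mean) ** 2 for v in values) / len(values)
    return math.sqrt(variance)


mean_score = statistics.mean(scores)
std_score = population_std(scores)

# statistics.pstdev implements the same population formula.
print(mean_score, round(std_score, 2), round(statistics.pstdev(scores), 2))
```

Dividing by the number of values gives the population form. For a sample, the usual choice is to divide by one less than the number of values, which is what `statistics.stdev` does.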
The 2012 American Community Survey (ACS), Table B22010, shows that approx. 7 million of the approx. 16 million households receiving Food Stamps/SNAPS nationwide have 1 or more family member living with at least 1 disability. Yup- that's approx. 44% (43.7%)
CHAPTER5 Analyzing Performance Measures Think of data as.docx tiffanyd4
CHAPTER 5
Analyzing Performance Measures
Think of data as the raw materials that you convert into information. Data by themselves, however, are not likely to be very useful. Column after column of numbers mean very little. To make the data meaningful you need to organize and present them, that is, the data need to become information. As you skim a newspaper, listen to a presentation, or read a report you may mistakenly assume that creating the graphs and statistics took little work. This is not true. Someone thought through how to present the data so that you and others could quickly understand and interpret them.
One of your tasks as a program manager is to decide how to organize and present data. There is no single best way to graph or analyze a set of data. You may create several graphs and try different statistics as you search for patterns that make sense. In this chapter we focus on the basic tasks for organizing performance data: entering data into a spreadsheet, creating tables and graphs, and describing variations in individual variables. The same skills apply to surveys, program evaluations, and community assessment, which we cover in later chapters. Also note that in Chapter 8 we will discuss analyzing relationships between and among variables. First, however, we will cover the terminology of measurement scales. Familiarity with these terms will facilitate our discussion of various statistics in this chapter and later.
MEASUREMENT SCALES
Measurement scales or levels of measurement describe the relationship among the values of a variable. You will find the terminology associated with measurement scales useful as you decide what statistics to use. The basic scales are nominal, ordinal, interval, and ratio scales.
Nominal scales identify and label the values of a variable. You cannot place the values of a nominal variable along a continuum; nor can you rank individual cases according to their values. Even though numbers are sometimes assigned, these numbers have no particular importance beyond allowing you to classify and count how many cases belong in each category. For example, imagine an organization, the Happy Housing Center, records why people seek its services. The variable Reason for Seeking Services has four values: “Laid off or lost job,” “Rental housing needs repairs,” “Rent increased,” “Eviction.” A nominal scale reports how many requests for services are in each category:
1 = Laid off or lost job
2 = Rental housing needs repairs
3 = Rent increased
4 = Eviction
The numbers are simply a device to identify categories; letters of the alphabet or other symbols could replace the numbers and the meaning of the scale would be unchanged. Remember, too, that values of nominal scales are not ranked. Thus, the numbering system in our example does not imply that an eviction has a greater or lesser value than being laid off.
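A short Python sketch can illustrate why the codes on a nominal scale carry no quantity. The category labels come from the example above; the list of recorded requests is hypothetical.

```python
from collections import Counter

# Category codes from the example; the codes are labels, not quantities.
REASON_LABELS = {
    1: "Laid off or lost job",
    2: "Rental housing needs repairs",
    3: "Rent increased",
    4: "Eviction",
}

# Hypothetical service requests recorded by code (an assumption).
requests = [1, 4, 2, 1, 3, 1, 4, 2]

# Counting cases per category is the only summary a nominal scale supports.
counts = Counter(requests)
for code, label in REASON_LABELS.items():
    print(f"{label}: {counts[code]}")

# Replacing the numbers with letters leaves the counts unchanged.
letters = {1: "A", 2: "B", 3: "C", 4: "D"}
letter_counts = Counter(letters[code] for code in requests)
```

Averaging the codes would produce a number, but it would have no meaning, since the assignment of 1 to 4 is arbitrary.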
Ordinal scales identify and categorize values of a variable and put the values in rank order. Ordinal scales r.
This document describes a lesson on measures of variation. The lesson introduces concepts like standard deviation and variance as measures of risk. Students will analyze stock return data for two stocks (A and B) and calculate summary statistics. They will discover that investing half in each stock reduces risk compared to investing fully in one stock, as the standard deviation is lower for a mixed portfolio. The lesson aims to show students that variation measures provide important information beyond just averages.
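The portfolio effect described in this lesson can be sketched in a few lines of Python. The returns below are hypothetical and were chosen so that the two stocks tend to move in opposite directions; the lesson's actual data is not reproduced here.

```python
import statistics

# Hypothetical yearly returns (percent) for two stocks; an assumption.
returns_a = [10.0, -4.0, 12.0, -2.0, 9.0]
returns_b = [-3.0, 11.0, -1.0, 13.0, 5.0]

# Half the money in each stock: each year's return is the average of the two.
returns_mix = [0.5 * a + 0.5 * b for a, b in zip(returns_a, returns_b)]

# Sample standard deviation as the measure of risk.
sd_a = statistics.stdev(returns_a)
sd_b = statistics.stdev(returns_b)
sd_mix = statistics.stdev(returns_mix)

print(round(sd_a, 2), round(sd_b, 2), round(sd_mix, 2))
```

All three series have the same mean return here, so the average alone cannot distinguish them. The size of the risk reduction depends on how the two stocks move together; if they rose and fell in perfect step, mixing them would not lower the standard deviation.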
Lecture2 Applied Econometrics and Economic Modeling stone55
The document discusses various statistical measures used to summarize data, including the mean, median, mode, variance, and standard deviation. It provides examples calculating these measures in Excel using data on salaries of graduates and shoe sizes. It also discusses how measures of central tendency (mean, median, mode) may be misleading if the data is skewed, and how measures of variability (variance, standard deviation) are better indicators of the spread of non-symmetric data around the mean. Rules of thumb for how many data points fall within 1, 2, or 3 standard deviations of the mean are also examined for returns on the Dow Jones index.
lecture 1 applied econometrics and economic modeling stone55
This document discusses the key concepts in econometrics including:
1) Estimating economic relationships using statistical methods to understand the effects of things like advertising on sales or stock prices.
2) Testing economic hypotheses to determine if policies are effective or if demand is elastic.
3) Forecasting economic variables to project things like firm sales, energy demand, or government revenues.
Can CEO compensation be justified, at least statistically? Elias Sipunga
The document discusses a statistical analysis of factors that influence CEO compensation in large UK businesses. Univariate analyses found CEO salaries were highly skewed and not normally distributed. Bivariate tests showed CEOs who also serve as board chairperson earn significantly higher salaries on average (£481,286.51) than CEOs who do not chair the board (£281,149.70). Independent t-tests found this difference in mean logged salaries between the two groups was statistically significant.
Can CEO compensation be justified, at least statistically? Elias Sipunga
This document summarizes the results of a statistical analysis conducted to determine the factors that influence CEO compensation. The analysis found that CEOs who also serve as the chairperson of the board earn significantly higher salaries than CEOs who do not serve as chair. Regression analysis showed that excess returns and sales growth were the strongest predictors of CEO compensation, explaining over 80% of the variance. The number of directors on the board was also a significant predictor, with larger boards correlated with slightly lower CEO pay. Performance-based variables like excess returns and sales growth had the strongest correlation with CEO compensation.
Data Analysis for Graduate Studies Summary KelvinNMhina
This document provides guidance on analysing qualitative and quantitative data. For qualitative data, it discusses preparing the data, identifying concepts and themes, and ensuring quality analysis. Key strategies for qualitative analysis include open coding, classification, and conceptual frameworks. For quantitative data, the document outlines recording, describing, and managing the data using techniques such as frequency counts, cross-tabulation, t-tests, chi-squared tests, and measures of central tendency and correlation. Examples are provided for coding, entering, and presenting both types of data.
initial post What are the characteristics, uses, advantages, and di.docx JeniceStuckeyoo
initial post
What are the characteristics, uses, advantages, and disadvantages of each of the measures of location and measures of dispersion? Discuss them with examples
first reply
Measures of location and measures of dispersion are two different ways of describing quantitative variables. Measures of location are often known as averages. Measures of dispersion are often known as a variation or spread. Both measures are helpful with describing statistical information. (Lind, Marchal, & Wathen, 2015)
The different measures of location include: the arithmetic mean, the median, the mode, the weighted mean, and the geometric mean. All of these measures of location pinpoint the center of a distribution of data. An advantage of measures of location is that the averages show us the central value of the data. A disadvantage of only using measures of location is that we may not draw an accurate conclusion because an average does not tell the spread of the data. Some examples of using measures of location include: finding the average price of a concert ticket, finding the average age of homeowners in a community, finding the average shoe size of boys between the ages of 13-19, and finding the average amount of money people spend on food annually. (Lind, Marchal, & Wathen, 2015)
The different measures of dispersion include: the range, the variance, and the standard deviation. All of these measures of dispersion tell us about the spread of the data and help us compare the spread in two or more distributions. Advantages of using measures of dispersion are that they give us a better idea of the spread of the values from which an average was calculated, and that they are easy to calculate and understand. A disadvantage of the range in particular is that it is a broad measurement because it only reflects the maximum and minimum values of the data. For example, the salaries of dentists in the state of Georgia might range from $70,000-$120,000 (just a made up example – not necessarily accurate data). This information is great for someone to know the range of dentist salaries, but it lacks specific information about how dentists’ salaries are distributed within that range. (Lind, Marchal, & Wathen, 2015)
Lind, D. A., Marchal, W. G., & Wathen, S. A. (2015). Statistical techniques in business & economics. New York, NY: McGraw-Hill Education.
Second Reply
What are the characteristics, uses, advantages, and disadvantages of each of the measures of location and measures of dispersion? Discuss them with examples.
The measures of location in common use are the arithmetic mean, median, mode, weighted mean, and geometric mean. The arithmetic mean, median, and mode: the mean usually refers to the arithmetic mean or average. This is just the sum of the measurements divided by the number of measurements. We make a notational distinction between the mean of a population and the mean of a sample. The general rule is that Greek letters are used for population characteristics and Latin letters ar.
The document analyzes data from a credit case study to identify patterns relating to loan repayment. It finds that loan purposes for repairs and an income type of working have higher rates of unsuccessful payments. Housing in a co-op apartment is also associated with higher difficulty in repayment. The analysis recommends banks focus on contract types like student and pensioner, as well as housing types other than co-op apartments, to increase successful payments. It also suggests caution when providing loans for repairs while emphasizing focusing on housing with parents, houses/apartments, and municipal apartments.
This document provides an overview of key numerical measures used to describe data, including measures of central tendency (mean, median, mode) and measures of dispersion (range, variance, standard deviation). It defines each measure, provides examples of calculating them, and discusses their characteristics, uses, and advantages/disadvantages. The document also covers weighted means, geometric means, Chebyshev's theorem, and calculating measures for grouped data.
STUDENT EXAMINATION NUMBER Y1401956
MODULE NO: MAN00029M
MODULE TITLE: Quantitative Methods & Data Analysis
Module Tutor: Dr. Harry Venables
Essay Title: Final assessment
Word Count: 2688
Task 1
Before any manipulation of the data can be performed, the Data View and
Variable View in SPSS must be set up according to SPSS's rules so that the output
is computed properly.
Name | Label | Values | Type | New type | Rationale
Obs | Id number | None | Nominal | Scale | Variable items that are not measurable but are numeric like ID numbers and phone numbers (can also be Nominal).
Gender | Gender | 0=Female, 1=Male | Nominal | Nominal | Variable items are all numbers that represent categories and have no order to them, e.g. 1-Blue Car, 2-Cat, 0-Male, 11-Female, etc.
Age | Age (years) | None | Nominal | Scale | Variable items are all measurable numbers e.g. height in cm.
Status | Marital Status | 1=Single, 2=Married, 3=Divorced, 4=Widowed | Nominal | Nominal | Variable items are all numbers that represent categories and have no order to them, e.g. 1-Blue Car, 2-Cat, 0-Male, 11-Female, etc.
Occupation | Occupation | 1=Student, 2=Employed, 3=Self-employed, 4=Retired | Nominal | Nominal | Variable items are all numbers that represent categories and have no order to them, e.g. 1-Blue Car, 2-Cat, 0-Male, 11-Female, etc.
AvgMonthlySpending | Average Monthly Spend (GBP) | None | Nominal | Scale (custom currency) | Variable items are all measurable monetary values.
MonthlyVisits | Number of Monthly visits | None | Nominal | Scale | Variable items are all measurable numbers e.g. height in cm.
Distance | Distance Travelled (miles) | None | Nominal | Scale | Variable items are all measurable numbers e.g. height in cm.
Car | Vehicle Ownership | 0=No, 1=Yes | Nominal | Nominal | Variable items are all numbers that represent categories and have no order to them, e.g. 1-Blue Car, 2-Cat, 0-Male, 11-Female, etc.
Appreciation | Customer Appreciation | 1=Very Low, 2=Low, 3=Indifferent, 4=High, 5=Very High | Nominal | Ordinal | Variable items are numbers that represent some form of ranking or order, e.g. Likert scale values 1-5, 1-7.
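The variable definitions in Task 1 can also be written down outside SPSS. The Python sketch below is only an illustration: the value labels and measurement levels are transcribed from the table, while the helper names `decode`, `allows_ordering` and `allows_mean` are hypothetical and the rule that means are reserved for scale variables is a common convention, not something stated in the assessment.

```python
# Measurement level and value labels per variable, transcribed from Task 1.
VARIABLES = {
    "Obs": ("scale", None),
    "Gender": ("nominal", {0: "Female", 1: "Male"}),
    "Age": ("scale", None),
    "Status": ("nominal", {1: "Single", 2: "Married", 3: "Divorced", 4: "Widowed"}),
    "Occupation": ("nominal", {1: "Student", 2: "Employed", 3: "Self-employed", 4: "Retired"}),
    "AvgMonthlySpending": ("scale", None),
    "MonthlyVisits": ("scale", None),
    "Distance": ("scale", None),
    "Car": ("nominal", {0: "No", 1: "Yes"}),
    "Appreciation": ("ordinal", {1: "Very Low", 2: "Low", 3: "Indifferent", 4: "High", 5: "Very High"}),
}


def decode(variable, code):
    """Return the value label for a code, or the raw value if there are no labels."""
    _, labels = VARIABLES[variable]
    return code if labels is None else labels[code]


def allows_ordering(variable):
    """Ranking is meaningful for ordinal and scale variables only."""
    return VARIABLES[variable][0] in ("ordinal", "scale")


def allows_mean(variable):
    """Assumed convention: arithmetic means are reserved for scale variables."""
    return VARIABLES[variable][0] == "scale"


print(decode("Status", 2), allows_ordering("Appreciation"), allows_mean("Appreciation"))
```

Keeping the measurement level next to the value labels makes it easy to check, before running an analysis, whether a statistic such as a mean or a median is appropriate for a given variable.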
Task 2
A. The bar chart shows the target consumers of the FreshCo retail centre, analysed
by two consumer characteristics: status and occupation. Cross-tabulation
(Table 2.1) is used to analyse the two variables together and to produce an
appropriate bar chart.
Table 2.1
Table 2.2
Table 2.2 shows that there are 201 respondents and 2 modes.
From this chart a conclusion can be drawn that the majority of FreshCo’s
consumers are employed (116 out of 201 respondents) and married (89 out of 201
respondents). Thus the two modal categories are married and employed, because they
occur most frequently (Appendix 2) (Field, 2009: 21).
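The cross-tabulation and modes reported here come from SPSS output. As a rough sketch of the same idea, the Python snippet below counts combinations of status and occupation and picks the most frequent category of each variable. The six respondents are hypothetical (the real survey has 201), and `mode_of` is a helper name introduced only for this illustration.

```python
from collections import Counter

# Hypothetical respondents as (status, occupation) pairs; an assumption.
respondents = [
    ("Married", "Employed"),
    ("Married", "Employed"),
    ("Single", "Student"),
    ("Married", "Retired"),
    ("Divorced", "Employed"),
    ("Single", "Employed"),
]

# Each cell of the cross-tabulation is the count of one combination.
crosstab = Counter(respondents)

# Marginal counts for each variable on its own.
status_counts = Counter(status for status, _ in respondents)
occupation_counts = Counter(occupation for _, occupation in respondents)


def mode_of(counts):
    """Most frequent category in a Counter."""
    return counts.most_common(1)[0][0]


print(crosstab[("Married", "Employed")], mode_of(status_counts), mode_of(occupation_counts))
```

The cell counts are what a clustered bar chart displays, while the marginal counts give the mode of each variable separately.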
B. The target-consumer analysis uses the same characteristics as before, status
and occupation, and examines the differences between them with regard to car
ownership.
Divorced people who are either employed or self-employed and widowed people who
are retired are groups that don’t own a car. Divorced people, especially the
self-employed sub-group, form the largest group that doesn’t own a car. Employed and
married people, on the contrary, form the largest group that owns a car. Single students are the
second largest group and single employed people are the smallest group that owns a car.
C. Given that the majority of FreshCo’s customers are married, employed car
owners, potential issues such as an insufficient number of parking spaces could
arise. Convenient opening hours could also make a significant difference for
working individuals. Marital status can also indicate the presence of children and
a need for children’s facilities such as playgrounds and food courts on the site.
Task 3
A. Consumer spending in regard to consumer characteristics.
Extreme values (outliers) occur for male students and self-employed females.
Outliers are extreme values that deviate from the rest of the responses. In this
case three respondents gave outlying answers on average monthly spending. The
numbers next to the outliers indicate the row number of the respondent
(SAGE, 2015). In the self-employed female group one person has higher monthly
spending (GBP 522.59) than the rest. In the male student group two respondents
spend more than the rest of the group (GBP 219.84 and GBP 225.84).
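Box plots conventionally flag a case as an outlier when it lies more than 1.5 interquartile ranges beyond the edge of the box. A minimal sketch of that rule follows; the spending values are hypothetical, and the 'inclusive' quartile method is an assumption made for this example (SPSS computes its hinges slightly differently, so borderline cases may not match).

```python
import statistics

def tukey_outliers(values):
    """Return the values lying more than 1.5 * IQR outside the quartiles."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return [v for v in values if v < lower or v > upper]

# Hypothetical monthly spending for one group; NOT the survey data.
spending = [110.0, 120.0, 125.0, 130.0, 135.0, 140.0, 150.0, 225.0]
outliers = tukey_outliers(spending)
print(outliers)
```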
Medians are dispersed across occupations, whereas across gender the medians are
not significantly different (they overlap). Both employed and self-employed males
and females spend more money than the other groups (the difference is significant
because their confidence intervals don’t overlap). There is also some difference
between the employed and self-employed groups because the boxes of these groups
barely overlap (for employed and self-employed women there is less difference in
spending because the boxes overlap slightly). The lower median position shows lower
spending for the self-employed group than for the employed group, which has a
higher median position. The interquartile ranges differ slightly in length and
position, which indicates different dispersion of data between the two groups
(self-employed group is smaller) (Field, 2009:100-2).
There are no significant differences between the expenditure of students and
retired individuals because their boxes overlap. However, the spread of the student
interquartile range differs across gender. The male group is smaller and less
likely to spend more money than the female group (Ibid).
B. Average monthly expenditure according to level of appreciation.
The average expenditure for customers with ‘very high’ customer appreciation
differs significantly from the rest because the median and the box (incl. confidence
intervals) are far away from the rest and don’t overlap. The interquartile range of
the ‘very low’ and ‘low’ appreciation groups is very different from all the rest,
which indicates a wider dispersion of data (Field, 2009:101-2).
Some box plots show skewness and a lack of symmetry in the data, which needs to be
examined more closely through a statistical test.
The descriptive statistics show that the means as well as the medians of the ‘low’,
‘very low’, ‘indifferent’ and ‘high’ appreciation groups are not markedly different
from each other. However, the mean (197.8941) as well as the median (199.9549) of
the ‘very high’ appreciation group is considerably lower and differs from the rest.
The standard deviation also differs for ‘very low’ and ‘low’, which shows
numerically that values in these groups deviate more widely from the mean. The
interquartile range of both these groups also differs considerably from the rest.
We can also observe slight positive skewness for the ‘high’ and ‘indifferent’ groups
and slight negative skewness for the rest, which indicates slight asymmetry of the
data distribution (Field, 2009:19).
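To make the sign of the skewness statistic concrete, the sketch below computes the adjusted sample skewness (the bias-corrected form commonly reported by statistical packages; that SPSS uses exactly this form is an assumption here). The three data sets are hypothetical and serve only to show a symmetric, a right-tailed and a left-tailed case.

```python
import statistics

def sample_skewness(values):
    """Adjusted Fisher-Pearson sample skewness (requires at least 3 values)."""
    n = len(values)
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)  # sample standard deviation (n - 1 denominator)
    return n / ((n - 1) * (n - 2)) * sum(((v - mean) / sd) ** 3 for v in values)

# Hypothetical examples; NOT the survey data.
symmetric = [1, 2, 3, 4, 5]
right_tailed = [1, 2, 3, 4, 20]   # long tail of high values -> positive skew
left_tailed = [1, 17, 18, 19, 20]  # long tail of low values -> negative skew

for name, data in [("symmetric", symmetric),
                   ("right-tailed", right_tailed),
                   ("left-tailed", left_tailed)]:
    print(f"{name}: {sample_skewness(data):.3f}")
```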
Customers who tend to spend the least have the highest customer appreciation.
Customers who spend the most are indifferent or have low or very low appreciation.
Task 4: Distribution of customers’ monthly expenditure.
According to the histogram, the data for average monthly spending is not normally
distributed. We observe a flat distribution with a negative skew. The bars fall
outside the normal curve and show an obvious split in two. In the Normal Q-Q Plot
we see some deviation towards the tail. A Normal Q-Q Plot is a chart of the observed
values plotted against the values expected under a normal distribution. The data
values lie fairly far from the line and even cross it, which shows that the
distribution is not normal. Only the spending data around the 180.000 and 5300.00
spending values is normally distributed. Generally, the values don’t follow the
normal distribution. The detrended plot is another view of the first in which the
trend line is removed. It shows the departure from normality even more clearly.
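The construction of the Q-Q and detrended Q-Q plots described above can be sketched as follows. This is only an illustration under stated assumptions: the data are hypothetical, Blom's plotting position is assumed for the expected quantiles, and the deviations are left in the original units rather than standardised.

```python
from statistics import NormalDist, fmean, stdev

def qq_points(values):
    """Return (observed, expected, deviation) triples for a normal Q-Q plot."""
    data = sorted(values)
    n = len(data)
    mu, sigma = fmean(data), stdev(data)
    std_normal = NormalDist()
    points = []
    for rank, observed in enumerate(data, start=1):
        p = (rank - 0.375) / (n + 0.25)            # Blom's plotting position (assumed)
        expected = mu + sigma * std_normal.inv_cdf(p)
        # The detrended plot shows this deviation instead of the raw pair.
        points.append((observed, expected, observed - expected))
    return points

# Hypothetical spending values; NOT the survey data.
spending = [95.0, 120.0, 150.0, 180.0, 185.0, 190.0, 200.0, 210.0]
for observed, expected, deviation in qq_points(spending):
    print(f"{observed:8.2f} {expected:8.2f} {deviation:8.2f}")
```

For normally distributed data the deviations scatter randomly around zero; a systematic pattern in them is what the detrended plot makes visible.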
The box plot doesn’t show any outliers. The central section of the data is not
symmetrically distributed because the median is not centrally placed.
Distribution of distance travelled
The distance travelled data seems more normally distributed. However, if we look at
the Normal Q-Q Plot then we can see a slight deviation towards the end of the tail.
The Detrended Normal Q-Q plot gives a closer look, which reveals that the data is
not as normally distributed as it appears. The box plot has three outliers. The
median is almost centrally placed, so the central section of the data is almost
symmetrically distributed.
The normality of the distribution is hard to judge without carrying out a test of
normality.
From the table we can spot skewness, which indicates a non-normal distribution in
both cases, more so in the first case (-.900) than in the other (.612).
According to the Shapiro-Wilk test (which is more reliable), Sig. (p < 0.05) shows
that the data for both average monthly spending and distance travelled is not
normally distributed. The null hypothesis here is that the given data does not
differ from a normal distribution. The test rejects it because the p-value is less
than the level of significance: .000 and .001 are smaller than 5% (0.05).
Therefore, the null hypothesis is rejected and we conclude that the data is not
normally distributed, and that distance travelled deviates less from normality than
the monthly spending data.
Task 5: Significance of age-gender difference.
a. It is assumed that the data is normally distributed, which suggests a
parametric test in the form of a T-test. The T-test is used when there are “two
experimental conditions and different participants used in each condition” (Field,
2009:334).
The null hypothesis (H0) is that there is no significant difference in mean age
between the genders. The alternative hypothesis (H1) is that there is a significant
difference in mean age between the genders.
The p-value indicates the level of probability at which we accept or reject the
hypothesis (Ibid). The p-value has to be linked to the direction of the hypothesis
we are testing. The first step is the test for equality of variances: if p < 0.05
(5%), H0 is rejected and the variances differ significantly; if p > 0.05 (5%), H0
is not rejected. After the analysis of variances, the second step involves the
analysis of means. If the previous test doesn’t show significant differences and we
do not reject its H0, then we should look at the first row of the Independent
Samples Test.
B.
There were 126 female respondents and 75 male respondents. According to the group
statistics, males have a higher average age (41.53 years) than females (37.65).
To interpret the test and see whether the variances differ between the groups, we
should look at the p-value (Sig.) of Levene’s test. In this case p = .248 > 0.05
(5%), so we accept (or rather do not reject) the null hypothesis (H0). Thus, there
is no significant difference between the variances of the groups. Accordingly, we
look at the first row (Equal variances assumed) of the T-test for equality of
means. The second row is disregarded (Field, 2009:340). P = .000 < 5% (Sig.
2-tailed), so we reject the null hypothesis of equal means. This means that there
is a difference in the mean between the groups, so we have to look at the mean
difference row.
To conclude, there is a significant difference in the means but not in the
variances. The significance measure shows that there is a difference in average age
between the sexes. The mean difference is negative, which means that group 2
(males) has the higher mean age.
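The "Equal variances assumed" row corresponds to the pooled-variance t statistic. A minimal sketch of that calculation is given below; the ages are hypothetical, the function name is invented for this example, and the p-value is deliberately left out because it requires the t distribution, which is not in Python's standard library.

```python
import math
import statistics

def pooled_t(group1, group2):
    """Student's t statistic and degrees of freedom, equal variances assumed."""
    n1, n2 = len(group1), len(group2)
    m1, m2 = statistics.fmean(group1), statistics.fmean(group2)
    v1, v2 = statistics.variance(group1), statistics.variance(group2)
    # Pool the two sample variances, weighting each by its degrees of freedom.
    pooled_var = ((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2)
    standard_error = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (m1 - m2) / standard_error, n1 + n2 - 2

# Hypothetical ages; NOT the survey data. Group 1 = females, group 2 = males.
females = [30, 35, 38, 40, 42]
males = [39, 41, 43, 45, 47]

t_stat, df = pooled_t(females, males)
mean_difference = statistics.fmean(females) - statistics.fmean(males)
print(f"mean difference = {mean_difference:.2f}, t = {t_stat:.3f}, df = {df}")
```

Because the difference is computed as group 1 minus group 2, a negative value means that the second group has the higher mean; it says nothing about which group has more respondents.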
A normality test was also carried out to support the T-test and to give more detail
on average age across the sexes and the differences between these averages. The
test below supports the rationale for choosing a parametric test over a
non-parametric test, namely the normality of the distribution.
The normality table supports the assumption that the data is normally distributed
(and that the T-test is appropriate). H0 is that the data does not differ
significantly from a normal distribution. The p-values for both males and females
are greater than 5% (p = .511 > 0.05 and p = .135 > 0.05), so normality is not
rejected. The charts below also support the normality of the distribution, which
means that the T-test was used appropriately.
Task 6
The task investigates customer feedback connected to the customers’ level of
appreciation. It also compares the level of appreciation across genders. An
appropriate test of association is the Chi-square test (for two or more samples),
which is used to test whether one attribute variable depends on another, i.e. it
measures the relationship among attribute variables, usually nominal and ordinal
variables that can be grouped or ranked (Venables, 2015, w3 p3).
Firstly, because we have two unrelated samples we need to make a Crosstabs table
and state the null hypothesis (H0) and an alternative hypothesis (H1).
H0 - customer appreciation does not depend on gender (gender does not influence the
customer appreciation level).
H1 - customer appreciation depends on gender.
The count, or observed frequency, is the number of respondents in each combination
of variable groups. The expected count, or expected frequency, is calculated in the
table from the row and column totals. Expected frequencies in each cell have to be
higher than 5 to avoid misleading results, so there would be no issues in the count
(Bryman, Cramer, 2009:155).
In the table above the standardised residuals are within the +/- 3 range, which
shows the reliability of the test and its normality.
According to the table, however, it is hard to tell how the customer appreciation
level depends on gender because the number of female respondents (126 in total) is
higher than the number of male respondents (75 in total). So, the dependency is not
evident without the Chi-square test.
Looking at the Pearson Chi-square test, p = .505 > 0.05 (5%), so we do not reject
H0 and conclude that customer appreciation does not depend on gender. The two
variables are independent of one another.
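The expected counts and the Pearson Chi-square statistic can be reproduced by hand, as in the hedged sketch below. The observed counts are hypothetical (not the survey table), the function names are invented for this example, and the p-value shortcut only works for an even number of degrees of freedom, which happens to hold for a 2 x 5 table (df = 4).

```python
import math

def chi_square_independence(table):
    """Return (chi-square statistic, degrees of freedom, expected counts)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    # Expected count for a cell = row total * column total / grand total.
    expected = [[r * c / n for c in col_totals] for r in row_totals]
    chi2 = sum(
        (obs - exp) ** 2 / exp
        for obs_row, exp_row in zip(table, expected)
        for obs, exp in zip(obs_row, exp_row)
    )
    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, df, expected

def chi2_upper_tail_even_df(x, df):
    """P(X > x) for a chi-square variable; closed form valid only for even df."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this shortcut only covers positive, even df")
    half = x / 2
    return math.exp(-half) * sum(half ** k / math.factorial(k) for k in range(df // 2))

# Hypothetical counts: rows = gender (female, male), columns = appreciation 1..5.
observed = [
    [12, 20, 30, 18, 10],
    [8, 10, 15, 12, 15],
]
chi2, df, expected = chi_square_independence(observed)
p_value = chi2_upper_tail_even_df(chi2, df)
all_expected_above_5 = all(e > 5 for row in expected for e in row)
print(f"chi2 = {chi2:.3f}, df = {df}, p = {p_value:.3f}")
print("all expected counts above 5:", all_expected_above_5)
```

If the p-value exceeds 0.05, H0 (independence) is not rejected, which is the same decision rule applied to the SPSS output above.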
Task 7: Customer behaviour
A. Measuring variables against each other.
Correlation indicates the direction and strength of the relationship between
variables. It shows the interdependence of variables and identifies direct, null
and inverse relationships. Each point represents a respondent’s position in
relation to the two variables being measured (Bryman, Cramer, 2009:212).
In this matrix plot we can see that the majority of panels show scattered patterns
with random distribution and some weak form of correlation, except one case with an
obvious inverse, curvilinear (negative) relationship (Bryman, Cramer, 2009:215).
The diagonal with no data represents values plotted against themselves, which
indicates perfect correlation (where r = 1). If we look at the lower triangle
(which mirrors the upper triangle) we can see a potentially strong correlation
between the Distance Travelled and Number of Monthly Visits data because the points
are tightly clustered. The rest of the data has random patterns and distribution
without any direction, which indicates weak correlation or the lack of it. To
conclude, from the SPSS output we can interpret that as distance travelled
decreases, the number of monthly visits increases.
B. Before applying Pearson’s correlation test we should make sure that the
relationship is linear. According to the scatter matrix plot, the two variables
(Monthly Visits and Distance Travelled) have a curvilinear relationship (the shape
of the relationship is not straight and curves at some point), so it is
non-linear; thus, “it is not appropriate to apply a measure of linear correlation
like Pearson’s r” (Bryman, Cramer, 2009:214).
In order to use the Pearson test, the relationship should be linear and the two
selected variables should be normally distributed. Firstly, we need to transform
an independent variable onto a logarithmic scale to perform a valid Pearson
correlation test (Ibid) and test the assumption of normal distribution. Otherwise,
the outcome would not be meaningful and could be erroneous.
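The effect of a logarithmic transformation on a curvilinear relationship can be illustrated with a small sketch. The data below are hypothetical and deliberately constructed so that visits fall in a straight line against the logarithm of distance; real survey data would only approximate this. The helper `pearson_r` is written out for the example rather than taken from any package.

```python
import math
import statistics

def pearson_r(xs, ys):
    """Pearson's r: sample covariance divided by the product of the standard deviations."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)
    return cov / (statistics.stdev(xs) * statistics.stdev(ys))

# Hypothetical curvilinear data: visits drop quickly at first, then level off.
distance = [1.0, 2.0, 4.0, 8.0, 16.0, 32.0]
visits = [20.0 - 3.0 * math.log(d) for d in distance]

r_raw = pearson_r(distance, visits)                          # curved relationship
r_log = pearson_r([math.log(d) for d in distance], visits)   # linear after transform

print(f"r on raw distance: {r_raw:.3f}")
print(f"r on log(distance): {r_log:.3f}")
```

On the raw scale r understates the strength of the association because the points follow a curve; after the transformation the relationship is linear and r reflects it fully.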
Testing normality (appendix 2)
According to the Shapiro-Wilk test of normality, p < 0.05 shows that the data on
both distance travelled (Sig. = .001) and number of monthly visits (Sig. = .000)
is not normally distributed. This rejects the null hypothesis, which states that
the given data does not differ from a normal distribution.
In this case the two variables are neither linearly related nor normally
distributed. Even after the logarithmic transformation, the test might not
provide a meaningful outcome.
Performing linear correlation
In order to measure correlation we have to explore covariance, which indicates how
variables vary together. Pearson’s correlation coefficient (P) is a standardised
measure of covariance. If P = 1, there is perfect positive correlation between the
two variables x and y. If P > 0, there is a direct relationship between the
variables x and y. If P = 0, there is no linear relationship between the variables;
and if P < 0, there is an inverse relationship between the variables. We also use a
Pearson test because the data is continuous (Venables, 2015, l4, p2).
As we can observe in the table, there is an inverse relationship between the two
variables since P < 0. The relationship is also strong because P = -.871 (close
to -1) (Bryman, Cramer, 2009:217).
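For reference, the standard textbook definitions linking covariance to Pearson's coefficient are given below. The coefficient is written as r (called P in the text above); n is the number of paired observations, and the bars and s denote sample means and sample standard deviations. The notation is chosen for this note and is not taken from the SPSS output.

```latex
\operatorname{cov}(x, y) = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y}),
\qquad
r = \frac{\operatorname{cov}(x, y)}{s_x \, s_y},
\qquad
-1 \le r \le 1 .
```

Dividing by the two standard deviations removes the units of measurement, which is why r always lies between -1 and 1 while the covariance does not.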
Regressions
Regression analyses the cause-effect relationship between multiple variables,
taking into account the accuracy of measures and outliers (Ibid, 229).
The null hypothesis is that the regression is not significant. The alternative
hypothesis is that the regression is significant.
Large values of R square indicate that the regression model fits the data; small
values indicate poor explanation (Field, 2009:268). R2 = .556, which suggests that
the regression model fits the data and the regression line fits the scatter plot.
If we look at the ANOVA test, there is a relationship between the two variables
because the difference is significant (p < 5%). Thus, the results in the
coefficients table are valid and the model is appropriate. SSM is large (SSR is
smaller), which means that the model is able to explain the variable’s behaviour
better than its mean.
H0 here is that the constant does not play a significant role within the model.
The same applies to the number of visits. The p-value for both is less than 5%, so
the hypothesis is rejected and we accept the model’s prediction. Respectively, β
indicates that with an increase of 1 mile there is a decrease in the number of
visits (-.574 visits per 1 mile).
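How the slope, the constant and R square relate to the sums of squares can be sketched with ordinary least squares on a few hypothetical points. The data and the function name below are invented for illustration and do not reproduce the SPSS output; SSM and SSR are the model and residual sums of squares referred to above.

```python
import statistics

def simple_regression(xs, ys):
    """Least-squares slope, intercept and R square for one predictor."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = my - slope * mx
    ss_total = sum((y - my) ** 2 for y in ys)                      # SST
    ss_residual = sum((y - (intercept + slope * x)) ** 2
                      for x, y in zip(xs, ys))                     # SSR
    ss_model = ss_total - ss_residual                              # SSM
    return slope, intercept, ss_model / ss_total

# Hypothetical data; NOT the survey data.
distance_miles = [1, 2, 3, 4, 5]
monthly_visits = [10, 9, 7, 6, 3]

slope, intercept, r_squared = simple_regression(distance_miles, monthly_visits)
print(f"visits = {intercept:.2f} + ({slope:.2f}) * distance, R2 = {r_squared:.3f}")
```

A negative slope is read in the same way as the coefficient above: each additional mile is associated with that many fewer visits, and R square is the share of the total variation (SST) that the model accounts for (SSM).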
C. Multiple regression analysis
Small values of R square indicate that the regression model gives a poor
explanation of the data and is less likely to fit it. R2 = .011, which is small
and suggests that the regression line doesn’t fit the data or the scatter plot
(Ibid).
With the ANOVA test it is evident that the coefficients table is not reliable
because the p-value is greater than 5%, which indicates that the regression model
is not significant.
β cannot be taken into account because each of the p-values is higher than 5%, so
we do not reject the null hypothesis that age, distance and number of visits don’t
play a significant role within the model. Other independent variables should be
introduced in order to predict customers’ monthly expenditure.
Works Cited
Bryman A., Cramer D., 2009, Quantitative Data Analysis with SPSS 14, 15 & 16: A
Guide for Social Scientists.
Field A., 2009, Discovering Statistics Using SPSS, 3rd edition, SAGE Publications Ltd
SAGE Publications, 2015, Identifying and Addressing Outliers, Module 5.
Available at: http://www.sagepub.com/upm-data/52387_MOD_5.pdf
Accessed on 10/05/2015
Venables, 2015, Quantitative Methods and Data Analysis (i) (MAN00029M) P/G
Module, University of York