# Qm 0809

Published in: Education, Technology

### Qm 0809

1. 1. QUANTITATIVE ANALYSIS
2. 2. Introduction: COURSE OBJECTIVE: The objective of this paper is to acquaint students with the various statistical tools and techniques used in business decision making.
   • Unit-I: Construction of frequency distributions and their analysis in the form of measures of central tendency and variation; types of measures, their relative merits, limitations and characteristics; skewness: meaning and coefficient of skewness.
   • Unit-II: Correlation analysis: meaning and types of correlation, Karl Pearson’s coefficient of correlation and Spearman’s rank correlation. Regression analysis: meaning and two lines of regression; relationship between correlation and regression coefficients. Time series analysis: measurement of trend and seasonal variations; time series and forecasting.
3. 3. Introduction:
   • Unit-III: Probability: basic concepts and approaches; addition, multiplication and Bayes’ theorem. Probability distributions: meaning, types and applications; Binomial, Poisson and Normal distributions.
   • Unit-IV: Tests of significance; hypothesis testing; large samples, small samples: Chi-square test, analysis of variance.
4. 4. STATISTICS: THE SCIENCE OF COLLECTING, ORGANIZING, PRESENTING, ANALYZING AND INTERPRETING DATA TO ASSIST IN MAKING MORE EFFECTIVE DECISIONS
5. 5. Data Sources
   – Primary
   – Secondary
   – Advantages and limitations of both approaches
   – Methods of primary data collection
      • Survey
      • Interview
      • Experimentation
      • Observation
6. 6. Statistical Concepts
   • Population: the entire set of individuals or objects of interest
   • Sample: a portion or part of the population of interest
   • Why do we sample?
7. 7. Purpose of Sampling
   • Contacting the whole population would be time consuming
   • The cost would be prohibitive
   • Physical impossibility of checking all items in a population: for example, we cannot test all the water in the Ganga for pollution
   • Destructive nature of some tests, such as a stress test
   • Sample results are adequate
8. 8. Characteristics of a Good Sample
   • Representativeness
   • Adequate size
   • Replication
   • Precision of research study matched to sample precision
9. 9. Different Types of Sampling
   • Random or statistical sampling
   • Convenience sampling
   • Purposive sampling
   • Snowball sampling
   • Multistage sampling
10. 10. Different Types of Random Sampling • Simple Random Sample: A sample selected so that each item or person in the population has the same chance of being included
11. 11. Different Types of Random Sampling • Systematic Random Sample: A random starting point is selected and then every kth member of the population is selected
12. 12. Different Types of Random Sampling • Stratified Random Sample: A population is divided into subgroups, called strata, and a sample is randomly selected from each stratum
13. 13. Different Types of Random Sampling • Cluster Sample: A population is divided into clusters using naturally occurring geographic or other boundaries. Clusters are then randomly selected, and a sample is collected by randomly selecting from each chosen cluster. Suppose we divide Delhi into 6 regions (E, W, N, S, SE and others), randomly select 3 regions (N, E and SE) and take a sample of residents in each of those regions.
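The four random sampling designs described above can be sketched in a few lines of Python. This is a minimal illustration, not part of the original slides: the population of 100 numbered members, the two strata, the region labels and the sample sizes are all hypothetical.

```python
import random

def simple_random_sample(population, n, rng):
    """Every member has the same chance of being included."""
    return rng.sample(population, n)

def systematic_sample(population, k, rng):
    """Pick a random start among the first k members, then take every k-th one."""
    start = rng.randrange(k)
    return population[start::k]

def stratified_sample(strata, n_per_stratum, rng):
    """Draw a simple random sample from every stratum."""
    return {name: rng.sample(members, n_per_stratum)
            for name, members in strata.items()}

def cluster_sample(clusters, n_clusters, n_per_cluster, rng):
    """Randomly choose some clusters, then sample only within the chosen ones."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    return {name: rng.sample(clusters[name], n_per_cluster) for name in chosen}

rng = random.Random(42)              # fixed seed so the run is repeatable
population = list(range(1, 101))     # hypothetical population of 100 members

srs = simple_random_sample(population, 10, rng)
systematic = systematic_sample(population, 10, rng)

# Hypothetical strata: two halves of the population.
strata = {"low": population[:50], "high": population[50:]}
stratified = stratified_sample(strata, 5, rng)

# Hypothetical clusters: six regions with 20 residents each.
regions = {r: [f"{r}-{i}" for i in range(20)]
           for r in ["E", "W", "N", "S", "SE", "Other"]}
clustered = cluster_sample(regions, 3, 4, rng)
```

Note the difference between the last two designs: stratified sampling draws from every subgroup, while cluster sampling draws only from the randomly chosen subgroups.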
14. 14. Types of Variables: there are two basic types of variable
   • Qualitative: non-numeric (gender, religion, state of birth, color of skin)
   • Quantitative: numeric
15. 15. Types of Variables: Quantitative
   • Discrete: can assume only certain values, with gaps between the values (number of rooms in a house, number of children)
   • Continuous: can assume any value within a specified range (CGPA, rate of interest, weight of an individual)
16. 16. Measurement: Measurement means assigning numbers or other symbols to characteristics of objects according to certain pre-specified rules. We do not measure objects, but some characteristics of them. In marketing research, we do not measure consumers, but their perceptions, attitudes and preferences. Numbers permit statistical analysis and facilitate communication of measurement rules and results.
17. 17. Levels of Measurement: Data can be classified according to the level of measurement. The level of measurement dictates the type of calculation that can be done to summarise and present the data. There are 4 levels of measurement:
   • Nominal
   • Ordinal
   • Interval
   • Ratio
18. 18. Levels of Measurement: Nominal. Observations of a qualitative variable can only be classified and counted. There is no particular logical order to the labels. Examples: colour of a chocolate bar, gender of students.
19. 19. Levels of Measurement: Ordinal. (1) Data classifications are represented by sets of labels or names (high, medium, low) that have relative values. (2) Because of the relative values, the data classified can be ordered or ranked. (3) But we cannot state the magnitude of the difference between the labels. Example: ratings.
20. 20. Levels of Measurement: Interval. The next higher level: it possesses all the qualities of the ordinal level and, in addition, the difference between values is a constant size. Examples: temperature, dress size.
21. 21. Levels of Measurement: Ratio. Practically all quantitative data is recorded at the ratio level, which is the highest level. In addition to the properties of the interval level, the 0 point is meaningful and the ratio between two numbers is meaningful. Examples: money, weight. If A is earning \$20,000 and B is earning \$40,000, then B is earning twice as much as A.
22. 22. Hypothesis
   • A statement about a population parameter, subject to verification
   • Data are used to check the reasonableness of the hypothesis
   • In statistical analysis, we make a claim (hypothesis), collect data, and use the data to test the hypothesis
23. 23. Hypothesis: Hypotheses are derived from the research problem and research questions. A hypothesis should pass the following tests:
   • Relevant: is pertinent to the issue; provides new insights; if true, helps explain what’s going on
   • Specific: is detailed enough to provide value and direction, but not so general as to be a “universal true-ism”
   • Testable: can be fully investigated within the time and resources available; is not stated in the future tense, since it is not possible to test the future
   • Coverage: together, the set of hypotheses is “necessary & sufficient” to completely answer the issue question
24. 24. Forming Proper Hypothesis: Apply the RSTC test to the following issue and hypotheses
   • Issue: What criteria are most important to business travelers selecting a hotel?
   • Hypotheses:
      – Spacious rooms with upgraded features, broadband access and premier loyalty programs are the features most desired by business travelers.
      – Business travelers will demand better hotel service in the future.
      – Vacation travelers prefer all-inclusive resorts by oceans or mountains by a 2-to-1 margin.
      – To grow revenue and profit, Canyon Sky Hotels must get itself included on corporate and travel agent preferred hotel lists.
26. 26. Types of Hypothesis
   • Null hypothesis (H0): a statement about the value of a population parameter, developed for the purpose of testing numerical evidence.
   • Alternate hypothesis (H1): a statement that is accepted if the sample data provide sufficient evidence that the null hypothesis is false.
   • Example: The mean age of Indian commercial aircraft is 20 years. H0: Mean = 20; H1: Mean ≠ 20.
   • The equality sign always appears in the null hypothesis and never in the alternate hypothesis.
27. 27. Hypothesis testing: A procedure based on sample evidence and probability theory to determine whether the hypothesis is a reasonable statement. Five-step procedure for testing a hypothesis:
   • State the null and alternate hypotheses
   • Select a level of significance
   • Identify the test statistic
   • Formulate a decision rule
   • Take a sample and arrive at a decision: do not reject H0, or reject H0 and accept H1
29. 29. Hypothesis testing: Level of Significance: The probability of rejecting the null hypothesis when it is true. It is also called the level of risk. The researcher needs to select a level of significance for the test; generally it is 0.05 for consumer research, 0.01 for quality assurance and 0.10 for polling.
30. 30. Hypothesis: Type I error: rejecting the null hypothesis when it is true. Type II error: accepting the null hypothesis when it is false.
31. 31. Hypothesis: One-tailed and two-tailed tests of significance
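As an illustration of the five-step procedure, here is a minimal sketch of a two-tailed z-test for a population mean. It assumes a large sample and a known population standard deviation, which the slides do not specify. The sample figures (mean 21.5, standard deviation 5, n = 40) are hypothetical numbers chosen to fit the aircraft-age example, with H0: Mean = 20.

```python
from math import sqrt
from statistics import NormalDist

def two_tailed_z_test(sample_mean, mu0, sigma, n, alpha):
    """Return (z, critical value, p-value, reject H0?) for H0: mean = mu0."""
    # Step 3: the test statistic is the standardised sample mean.
    z = (sample_mean - mu0) / (sigma / sqrt(n))
    # Step 4: decision rule, reject H0 if |z| exceeds the critical value.
    critical = NormalDist().inv_cdf(1 - alpha / 2)
    # Two-tailed p-value: probability of a result at least this extreme under H0.
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    # Step 5: arrive at a decision.
    return z, critical, p_value, abs(z) > critical

# Steps 1 and 2: H0: Mean = 20, H1: Mean != 20, level of significance 0.05.
z, critical, p_value, reject = two_tailed_z_test(
    sample_mean=21.5, mu0=20, sigma=5, n=40, alpha=0.05
)
print(f"z = {z:.3f}, critical = {critical:.3f}, p = {p_value:.3f}, reject H0: {reject}")
```

With these hypothetical figures z is about 1.90, which is inside the 0.05 critical value of about 1.96, so H0 is not rejected. At a significance level of 0.10 the same data would lead to rejection, which shows why the level has to be chosen before the sample is taken.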
32. 32. Statistics: The study of statistics is divided into two categories: descriptive statistics and inferential statistics.
33. 33. Statistics: Descriptive statistics: methods of organising, summarizing and presenting data in an informative way. Inferential statistics: the methods used to estimate a property of a population on the basis of a sample.
34. 34. Describing Data: Qualitative. Frequency table: a grouping of qualitative data into mutually exclusive classes showing the number of observations in each class. The number of observations in each class is called the class frequency.
35. 35. Describing Data: Qualitative. Bar chart: a graph in which the classes are reported on the horizontal axis and the class frequencies on the vertical axis. The class frequencies are proportional to the heights of the bars.
36. 36. Describing Data: Qualitative. Pie chart: a chart that shows the proportion or percent that each class represents of the total number of frequencies.
37. 37. Describing Data: Quantitative. Frequency distribution: a grouping of data into mutually exclusive classes showing the number of observations in each class. Class interval: the difference between the lower limit of a class and the lower limit of the next class. Class midpoint: halfway between the lower limits of two consecutive classes.
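A frequency distribution as defined above can be built with a short helper. The sketch below is illustrative only: the data values, the starting lower limit of 10, the class interval of 10 and the function name `frequency_distribution` are all hypothetical.

```python
from collections import Counter

def frequency_distribution(values, lower, width, n_classes):
    """Group values into mutually exclusive classes [lower, lower + width), ...

    Returns one row per class: (lower limit, next lower limit, midpoint, frequency).
    """
    counts = Counter()
    for v in values:
        index = int((v - lower) // width)   # which class the value falls into
        if 0 <= index < n_classes:
            counts[index] += 1
    table = []
    for i in range(n_classes):
        lo = lower + i * width
        hi = lo + width                     # lower limit of the next class
        table.append((lo, hi, (lo + hi) / 2, counts[i]))
    return table

data = [12, 15, 21, 22, 27, 30, 33, 35, 38, 41, 44, 49]   # hypothetical observations
table = frequency_distribution(data, lower=10, width=10, n_classes=4)
for lo, hi, midpoint, freq in table:
    print(f"{lo} up to {hi}: midpoint {midpoint}, frequency {freq}")
```

Because each class runs from its lower limit up to, but not including, the next lower limit, every observation lands in exactly one class.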
38. 38. Descriptive Statistics: Measurements of Central Tendency. Mean: the sum of all the values divided by the number of values. Weighted mean: the sum of each value multiplied by its weight, divided by the sum of the weights. Weakness of the mean: it is affected by one or two very large or very small values; in that case, the mean might not be a representative average of the data.
39. 39. Descriptive Statistics: Measurements of Central Tendency. Median: the midpoint of the values after they have been ordered from the smallest to the largest, or from the largest to the smallest. Advantages: 1. It is not affected by extremely large or small values. 2. It can be computed for ordinal-level data or higher.
40. 40. Descriptive Statistics: Mode: the value of the observation that appears most frequently. Advantages: 1. It is not affected by extremely large or small values. 2. It can be computed even for nominal-level data. Disadvantage: for some data, there is no mode.
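The effect of an extreme value on the three measures of central tendency can be checked with Python's standard `statistics` module. The salary figures, prices and weights below are hypothetical, and `weighted_mean` is a helper written for this sketch.

```python
from statistics import mean, median, multimode

def weighted_mean(values, weights):
    """Sum of value * weight, divided by the sum of the weights."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)

salaries = [20, 22, 22, 25, 26]        # hypothetical salaries, in thousands
with_outlier = salaries + [200]        # same data plus one very large value

print(mean(salaries), median(salaries), multimode(salaries))
print(mean(with_outlier), median(with_outlier), multimode(with_outlier))

# Hypothetical weighted mean: 3 units sold at a price of 10 and 1 unit at 20.
average_price = weighted_mean([10, 20], [3, 1])
```

The single outlier moves the mean from 23 to 52.5, while the median moves only from 22 to 23.5 and the mode stays at 22. When no value repeats, `multimode` returns every value, which corresponds to the "no mode" case mentioned above.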
41. 41. Descriptive Statistics Measurements of Dispersion Dispersion is the tendency of the individual values in a distribution to spread away from the average
42. 42. Descriptive Statistics: Measurements of Dispersion. Why do we need to measure dispersion? The mean, median and mode describe only the centre of the data; they do not tell us about the spread of the data.
43. 43. Dispersion: Suppose you are crossing a river and the mean depth is 3 meters. There are two scenarios: (A) the depth of the river ranges from 2.75 to 3.25 meters; (B) the depth of the river ranges from 1 to 6 meters.
44. 44. Dispersion A small value for a measure of dispersion indicates that the data are clustered closely and the mean is considered representative of data. A large measure of dispersion indicates that the mean is not reliable
45. 45. 2nd use of Dispersion We can compare the spread in two data sets. Example: Factory output
46. 46. Measure of Dispersion: Range = Largest value - Smallest value. Coefficient of range = (H - L)/(H + L), where H is the largest value and L is the smallest. The problem is that the range is based on only two values, the largest and the smallest, and does not consider the other values.
47. 47. Measure of Dispersion Mean Deviation: The arithmetic mean of the absolute values of the deviation from the mean. Example:
48. 48. Measure of Dispersion: Standard Deviation and Variance. Variance is the most popular measure of dispersion. Variance: the arithmetic mean of the squared deviations from the mean. Standard deviation: the square root of the variance.
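The measures of dispersion defined above (range, coefficient of range, mean deviation, variance and standard deviation) can be compared on two data sets with the same mean. The factory output figures are hypothetical. The variance here divides by the number of values, matching the "arithmetic mean of the squared deviations" definition; a sample variance would divide by n - 1 instead.

```python
from statistics import mean, pstdev, pvariance

def value_range(values):
    """Largest value minus smallest value."""
    return max(values) - min(values)

def coefficient_of_range(values):
    """(H - L) / (H + L), with H the largest and L the smallest value."""
    h, l = max(values), min(values)
    return (h - l) / (h + l)

def mean_deviation(values):
    """Arithmetic mean of the absolute deviations from the mean."""
    m = mean(values)
    return sum(abs(v - m) for v in values) / len(values)

factory_a = [48, 49, 50, 51, 52]   # hypothetical daily output, mean 50
factory_b = [40, 45, 50, 55, 60]   # hypothetical daily output, mean 50

for name, output in [("A", factory_a), ("B", factory_b)]:
    print(name, value_range(output), coefficient_of_range(output),
          mean_deviation(output), pvariance(output), pstdev(output))
```

Both factories average 50 units, but every measure is larger for factory B, so its mean is a less reliable summary of a typical day.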
49. 49. Other measures of Dispersion: Here, we determine the location of values that divide a set of observations into equal parts. These are called quartiles, deciles and percentiles. Quartiles divide a set of observations into 4 equal parts, deciles into 10 and percentiles into 100. The first quartile is called Q1.
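Quartiles, deciles and percentiles can be obtained from `statistics.quantiles` in the Python standard library (Python 3.8 or later), which returns the cut points that divide the data into `n` equal parts. The data below are hypothetical.

```python
from statistics import quantiles

data = list(range(1, 12))              # hypothetical observations: 1, 2, ..., 11

quartiles = quantiles(data, n=4)       # 3 cut points: Q1, Q2 (the median), Q3
deciles = quantiles(data, n=10)        # 9 cut points
percentiles = quantiles(data, n=100)   # 99 cut points

q1, q2, q3 = quartiles
print(q1, q2, q3)
```

For these eleven values the default method gives Q1 = 3, Q2 = 6 and Q3 = 9. Textbooks and software use slightly different interpolation conventions, so quartiles computed by hand for small data sets may differ a little from these.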
50. 50. Skewness: Skewness describes the shape of the data. Four shapes are commonly observed: symmetric, positively skewed, negatively skewed and bimodal. A bimodal distribution has two or more peaks. Pearson’s coefficient of skewness: sk = 3(Mean - Median) / Standard deviation. It is 0 when mean = median, and it can vary from -3 to +3.
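Pearson's coefficient of skewness is simple to compute directly from its definition. In this sketch the three data sets are hypothetical, and the sample standard deviation (`statistics.stdev`) is an assumption, since the slide does not say which standard deviation is meant.

```python
from statistics import mean, median, stdev

def pearson_skewness(values):
    """sk = 3 * (mean - median) / standard deviation."""
    return 3 * (mean(values) - median(values)) / stdev(values)

symmetric = [1, 2, 3, 4, 5]        # mean equals median
right_tail = [1, 2, 2, 3, 12]      # a few large values pull the mean up
left_tail = [1, 10, 11, 11, 12]    # a few small values pull the mean down

for name, values in [("symmetric", symmetric),
                     ("positively skewed", right_tail),
                     ("negatively skewed", left_tail)]:
    print(name, round(pearson_skewness(values), 3))
```

The sign of the coefficient follows the direction of the longer tail: zero for the symmetric data, positive when the mean exceeds the median and negative when it falls below it.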
51. 51. Statistical Inference: Study of two variables (bivariate data analysis, the simplest case of multivariate data analysis). Relationship between two variables:
   – Is the relationship strong or weak?
   – Is it direct or inverse?
   – Can we develop an equation to express the relationship between the two variables?
52. 52. Typical Examples
   • Is there a relationship between the amount HUL spends per month on advertising and its sales in that month?
   • Is there a relationship between the number of hours students studied for an exam and the score earned?
   • The two most widely used analyses are correlation and regression
53. 53. Correlation
   • A group of techniques to measure the association between two variables
   • Plotting the data in a scatter diagram
   • Example: sales calls and sales made
54. 54. Dependent and Independent Variable: Dependent variable: the variable that is being predicted or estimated. It is scaled on the Y-axis. Independent variable: the variable that provides the basis for estimation. It is the predictor variable and is plotted on the X-axis.
55. 55. The coefficient of Correlation: Originated by Karl Pearson in 1900, the coefficient of correlation describes the strength of the relationship between two sets of interval-scaled or ratio-scaled variables. Designated r, it is called Pearson’s r and can assume any value from -1 to +1.
56. 56. The coefficient of Correlation • What does a correlation coefficient of +1 mean? • What does a correlation coefficient of -1 mean? • What does a correlation coefficient of 0 mean?
57. 57. How to Calculate the Correlation Coefficient
58. 58. How to Calculate the Coefficient of Determination: The proportion of the total variation in the dependent variable Y that is explained, or accounted for, by the variation in the independent variable X. It is computed by squaring the coefficient of correlation. It is a more precise measure than describing a correlation as strong or weak. For example: 57.6% of the variation in the number of copiers sold is explained, or accounted for, by the variation in the number of sales calls.
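The formula on the correlation slide did not survive in this transcript, so the sketch below uses the standard definition of Pearson's r (the sum of the products of the deviations, divided by the square root of the product of the summed squared deviations). The sales-call figures are hypothetical and are not the data behind the 57.6% quoted above.

```python
from math import sqrt
from statistics import mean

def pearson_r(x, y):
    """Pearson's coefficient of correlation between paired observations."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

calls = [10, 20, 30, 40, 50]      # hypothetical number of sales calls (X)
copiers = [20, 40, 50, 40, 50]    # hypothetical number of copiers sold (Y)

r = pearson_r(calls, copiers)
r_squared = r ** 2                # coefficient of determination
print(f"r = {r:.3f}, r^2 = {r_squared:.3f}")
```

For these numbers r is about 0.775 and r² is 0.6, so 60% of the variation in copiers sold would be accounted for by the variation in sales calls. A perfectly direct linear relationship gives r = +1 and a perfectly inverse one gives r = -1.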
59. 59. Correlation and cause
   • Spurious correlation
   • When we find two variables with a strong correlation, all we can conclude is that there is a relationship or association between them, not that a change in one causes a change in the other.
60. 60. Probability: Special Rule of Addition: P(A or B) = P(A) + P(B); P(A or B or C) = P(A) + P(B) + P(C). To apply this rule, the events must be mutually exclusive.
61. 61. Probability: General Rule of Addition: When events are not mutually exclusive, P(A or B) = P(A) + P(B) - P(A and B). Here P(A and B) is called the joint probability: a probability that measures the likelihood that two or more events will happen concurrently. The concept can be illustrated with a Venn diagram.
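Both addition rules can be verified by counting outcomes. The sketch below uses a standard 52-card deck as a hypothetical example (the slides do not use one) and exact fractions, so the results can be compared without rounding.

```python
from fractions import Fraction

ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
suits = ["hearts", "diamonds", "clubs", "spades"]
deck = [(rank, suit) for rank in ranks for suit in suits]   # 52 equally likely cards

def prob(event):
    """Probability of drawing a card from the given set of cards."""
    return Fraction(len(event), len(deck))

kings = {card for card in deck if card[0] == "K"}
queens = {card for card in deck if card[0] == "Q"}
hearts = {card for card in deck if card[1] == "hearts"}

# Special rule: a card cannot be both a king and a queen (mutually exclusive).
p_king_or_queen = prob(kings) + prob(queens)

# General rule: the king of hearts is in both events, so subtract the joint probability.
p_king_or_heart = prob(kings) + prob(hearts) - prob(kings & hearts)

print(p_king_or_queen, p_king_or_heart)
```

Counting the union directly gives the same answers: 8 of 52 cards are kings or queens, and 16 of 52 are kings or hearts.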
62. 62. Probability: Special Rule of Multiplication: This rule combines two events, giving the likelihood that both happen (example: a person is 21 years old and buys Pepsi). When two or more events are independent, P(A and B) = P(A) P(B); P(A and B and C) = P(A) P(B) P(C).
63. 63. Probability: Independence: The occurrence of one event has no effect on the probability of the occurrence of another event.
64. 64. Probability: General Rule of Multiplication: P(A and B) = P(A) P(B | A). For two events, A and B, the joint probability that both events will happen is found by multiplying the probability that event A will happen by the conditional probability of event B occurring given that A has occurred.
65. 65. Probability: Conditional Probability: The probability of a particular event occurring, given that another event has occurred.
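The two multiplication rules and the idea of conditional probability can be checked by enumerating outcomes. Both examples (two dice, and two cards drawn without replacement) are hypothetical illustrations, and `prob` is a helper written for this sketch.

```python
from fractions import Fraction
from itertools import permutations, product

def prob(event, space):
    """Share of equally likely outcomes in `space` for which `event` is true."""
    return Fraction(sum(1 for outcome in space if event(outcome)), len(space))

# Independent events: two dice. The first die does not affect the second.
dice = list(product(range(1, 7), repeat=2))
p_first6 = prob(lambda o: o[0] == 6, dice)
p_second6 = prob(lambda o: o[1] == 6, dice)
p_both6 = prob(lambda o: o[0] == 6 and o[1] == 6, dice)

# Dependent events: two cards drawn without replacement (rank 1 is the ace).
deck = [(rank, suit) for rank in range(1, 14) for suit in range(4)]
draws = list(permutations(deck, 2))
p_a = prob(lambda d: d[0][0] == 1, draws)                            # A: first card is an ace
p_a_and_b = prob(lambda d: d[0][0] == 1 and d[1][0] == 1, draws)     # A and B: both are aces
p_b_given_a = p_a_and_b / p_a                                        # conditional probability of B given A

print(p_both6, p_a, p_b_given_a, p_a_and_b)
```

For the dice, P(A and B) equals P(A) P(B) = 1/36. For the cards, the chance of a second ace given a first ace is 3/51, and multiplying it by P(A) = 4/52 recovers the joint probability 1/221, as the general rule states.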
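66. 66. Bayes’ Theorem: $P(A_1 \mid B) = \dfrac{P(A_1)\,P(B \mid A_1)}{P(A_1)\,P(B \mid A_1) + P(A_2)\,P(B \mid A_2)}$, where $P(B \mid A_i)$ is the conditional probability of B given $A_i$. Prior probability: the initial probability based on the present level of information. Posterior probability: a revised probability based on additional information.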
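Bayes' theorem for two mutually exclusive events A1 and A2 can be evaluated as follows. The prior probabilities and conditional probabilities in this sketch are hypothetical numbers, and `posterior` is a helper name chosen for the illustration.

```python
from fractions import Fraction

def posterior(priors, likelihoods):
    """Revise prior probabilities P(Ai) using the likelihoods P(B given Ai)."""
    joint = [p * l for p, l in zip(priors, likelihoods)]   # P(Ai) * P(B given Ai)
    total = sum(joint)                                     # denominator: P(B)
    return [j / total for j in joint]

# Hypothetical inputs: P(A1) = 0.05, P(A2) = 0.95,
# P(B given A1) = 0.90, P(B given A2) = 0.15.
priors = [Fraction(5, 100), Fraction(95, 100)]
likelihoods = [Fraction(90, 100), Fraction(15, 100)]

p_a1_given_b, p_a2_given_b = posterior(priors, likelihoods)
print(float(p_a1_given_b), float(p_a2_given_b))
```

With these inputs the prior probability of A1 is 0.05 and the posterior probability, once B is known to have occurred, is 0.24. The two posterior probabilities sum to 1.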