
# Statistics for the management

Statistics for Management. SMU assignment, Semester 1, by Rohit Mishra

Published in: Technology

### Statistics for the management

1. 1. Master of Business Administration Semester 1 STATISTICS FOR MANAGEMENT Set- 1

1. (a) ‘Statistics is the backbone of decision-making’. Comment. (b) ‘Statistics is as good as the user’. Comment.

Ans. (a) ‘Statistics is the backbone of decision-making’

Due to advanced communication networks, rapid changes in consumer behavior, the varied expectations of a variety of consumers and new market openings, modern managers have the difficult task of making quick and appropriate decisions. Therefore, they need to depend more upon quantitative techniques like mathematical models, statistics, operations research and econometrics.

Decision making is a key part of our day-to-day life. Even when we wish to purchase a television, we like to know the price, quality, durability, and maintainability of various brands and models before buying one. In this scenario we are collecting data and making an optimum decision. In other words, we are using Statistics.

Again, suppose a company wishes to introduce a new product. It has to collect data on market potential, consumer likings, availability of raw materials and the feasibility of producing the product. Hence, data collection is the backbone of any decision-making process.

Many organizations find themselves data-rich but poor in drawing information from it.
2. 2. Therefore, it is important to develop the ability to extract meaningful information from raw data to make better decisions. Statistics plays an important role in this respect.

Statistics is broadly divided into two main categories. The figure below illustrates the two categories, which are descriptive statistics and inferential statistics.

• Descriptive Statistics: Descriptive statistics is used to present a general description of data, summarized quantitatively. This is mostly useful in clinical research, when communicating the results of experiments.

• Inferential Statistics: Inferential statistics is used to make valid inferences from the data, which are helpful in effective decision making for managers or professionals. Statistical methods such as estimation, prediction and hypothesis testing belong to inferential statistics. Researchers draw conclusions from the collected data samples about the characteristics of the large population from which the samples are taken.

So, we can say ‘Statistics is the backbone of decision-making’.

(b) ‘Statistics is as good as the user’

Statistics is used for various purposes. It is used to simplify mass data and to make comparisons easier. It is also used to bring out trends and tendencies in the data as well as the hidden relations between variables. All this makes decision making much easier. Let us look at each function of Statistics in detail.
3. 3. 1. Statistics simplifies mass data

The use of statistical concepts helps in the simplification of complex data. Using statistical concepts, managers can make decisions more easily. Statistical methods reduce the complexity of the data and so help in understanding any huge mass of data.

2. Statistics makes comparison easier

Without statistical methods and concepts, collection and comparison of data cannot be done easily. Statistics helps us to compare data collected from different sources. Grand totals, measures of central tendency, measures of dispersion, graphs and diagrams, and the coefficient of correlation all provide ample scope for comparison.

3. Statistics brings out trends and tendencies in the data

After data is collected, it is easy to analyze the trends and tendencies in the data by using the various concepts of Statistics.

4. Statistics brings out the hidden relations between variables

Statistical analysis helps in drawing inferences from data and brings out the hidden relations between variables.

5. Decision making becomes easier

With the proper application of Statistics and statistical software packages to the collected data, managers can take effective decisions, which can increase the profits of a business.

Given all these functions, we can say ‘Statistics is as good as the user’.
4. 4. 2. Distinguish between the following with example. (a) Inclusive and Exclusive limits. (b) Continuous and discrete data. (c) Qualitative and Quantitative data (d) Class limits and class intervals.

Ans. (a) Inclusive and Exclusive limits.

Inclusive and exclusive limits are relevant from the point of view of data tabulation and class intervals.

An exclusive series is one in which the upper limit of a class is not included in that class, for example: 00-10, 10-20, 20-30, 30-40, 40-50. In the first class (00-10), we consider values from 00 up to 9.99 only, and 10 is counted in 10-20.

An inclusive series is one in which both limits are included in the class, for example: 00-09, 10-19, 20-29, 30-39, 40-49. Here both 00 and 09 come under the first class (00-09), and 10 comes under the next one.

(b) Continuous and discrete data.

All data that are the result of counting are called quantitative discrete data. These data take on only certain numerical values. If you count the number of phone calls you receive for each day of the week, you might get 0, 1, 2, 3, etc.

All data that are the result of measuring are quantitative continuous data assuming that we can
5. 5. measure accurately. Measuring angles in radians might result in the numbers π/6, π/3, π/2, π, 3π/4, etc. If you and your friends carry backpacks with books in them to school, the numbers of books in the backpacks are discrete data and the weights of the backpacks are continuous data.

(c) Qualitative and Quantitative data:

Data may come from a population or from a sample. Small letters like x or y generally are used to represent data values. Most data can be put into the following categories:

• Qualitative
• Quantitative

Qualitative data

Qualitative data are the result of categorizing or describing attributes of a population. Hair color, blood type, ethnic group, the car a person drives, and the street a person lives on are examples of qualitative data. Qualitative data are generally described by words or letters. For instance, hair color might be black, dark brown, light brown, blonde, gray, or red. Blood type might be AB+, O-, or B+. Qualitative data are not as widely used as quantitative data because many numerical techniques do not apply to qualitative data. For example, it does not make sense to find an average hair color or blood type.

Quantitative data

Quantitative data are always numbers and are usually the data of choice because there are many methods available for analyzing the data. Quantitative data are the result of counting or measuring attributes of a population. Amount of money, pulse rate, weight, number of people living in your town, and the number of students who take statistics are examples of quantitative data. Quantitative data may be either discrete or continuous.

All data that are the result of counting are called quantitative discrete data. These data take on only certain numerical values. If you count the number of phone calls you receive for each day of the week, you might get 0, 1, 2, 3, etc.

Example 2: Data Sample of Quantitative Continuous Data

The data are the weights of the backpacks with the books in them. You sample the same five students. The weights (in pounds) of their backpacks are 6.2, 7, 6.8, 9.1, 4.3. Notice that backpacks carrying three books can have different weights. Weights are quantitative continuous data because weights are measured.
6. 6. 3. In a management class of 100 students, three languages are offered as an additional subject, viz. Hindi, English and Kannada. There are 28 students taking Hindi, 26 taking Hindi and 16 taking English. There are 12 students taking both Hindi and English, 4 taking Hindi and English and 6 that are taking English and Kannada. In addition, we know that 2 students are taking all the three languages.

If a student is chosen randomly, what is the probability that he/she is not taking any of these three languages?

If a student is chosen randomly, what is the probability that he/she is taking exactly one language?

Ans. a) Our sample space is all the students in the school. There are 100 students, so the size of our sample space is 100.

Our event is that a student drawn at random is not taking any language classes. Call this event A.

P(A) = the number of ways A could happen / the size of the sample space = the number of students taking no language class / 100

So we must find the number of students who are not taking any language class.

Let H be the number of students taking Hindi, E be the number of students taking English, and K be the number of students taking Kannada.

We draw a Venn diagram.
7. 7. Let students taking Kannada as language be S(K) = 28

Let students taking Hindi as language be S(H) = 28

Let students taking English as language be S(E) = 28

Let students taking Kannada and English be S(K ∩ E) = 12

Let students taking Hindi and English be S(H ∩ E) = 4
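The counting behind the Venn diagram can be sketched with the inclusion-exclusion principle. This is only an illustration under stated assumptions: the language labels in the question are inconsistent, so the sketch assumes the three single-language counts are 28, 26 and 16, the three pairwise counts are 12, 4 and 6, and 2 students take all three. The function name `none_and_exactly_one` is hypothetical. Both results depend only on the sums of these counts, so they do not change with which language carries which number.

```python
def none_and_exactly_one(total, singles, pairs, triple):
    """Inclusion-exclusion for three overlapping groups.

    singles: sizes of the three groups (assumed 28, 26, 16)
    pairs:   sizes of the three pairwise overlaps (assumed 12, 4, 6)
    triple:  size of the three-way overlap (2 in the question)
    """
    # Students in at least one group.
    at_least_one = sum(singles) - sum(pairs) + triple
    none = total - at_least_one
    # Each "only this group" region is |A| - |A∩B| - |A∩C| + |A∩B∩C|;
    # summing the three regions gives the expression below.
    exactly_one = sum(singles) - 2 * sum(pairs) + 3 * triple
    return none, exactly_one


none, exactly_one = none_and_exactly_one(
    total=100, singles=(28, 26, 16), pairs=(12, 4, 6), triple=2
)
p_none = none / 100                # probability of taking no language
p_exactly_one = exactly_one / 100  # probability of taking exactly one
print(none, exactly_one, p_none, p_exactly_one)
```

Under these assumed counts, 50 students take no language and 32 take exactly one, giving probabilities of 0.5 and 0.32.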
8. 8. 4. List down various measures of central tendency and explain the difference between them?

Ans. Central tendency

This tutorial uses histograms to illustrate different measures of central tendency. A histogram is a type of graph in which the x-axis lists categories or values for a data set, and the y-axis shows a count of the number of cases falling into each category. For example, if there are 59 men and 48 women in your class, you could represent the information with this histogram:

The categories may be non-numeric, as in the histogram above, or may be numeric, as in the following histogram. The x-axis shows the ages for respondents to a survey and the y-axis reports the frequency or count for occurrences of each age.

From the histogram, can you determine what is the "typical" age of the participants in the survey? This question could be answered in several different ways, depending on what you really want to know.
9. 9. Do you want to determine:

• The average of the ages?
• The age which divides the cases into two equal-sized groups, the "highs" vs. the "lows"?
• The most common age?

Questions like these are concerned with determining the central tendency of a group of numbers or data. To answer our question, we want a single number which can somehow represent all of the ages of the people who participated in the survey.

Ways to Measure Central Tendency

The three most commonly used measures of central tendency are the following.

mean: The sum of the values divided by the number of values, often called the "average." Add all of the values together. Divide by the number of values to obtain the mean. Example: The mean of 7, 12, 24, 20, 19 is (7 + 12 + 24 + 20 + 19) / 5 = 16.4.

median: The value which divides the values into two equal halves, with half of the values being lower than the median and half higher than the median. Sort the values into ascending order. If you have an odd number of values, the median is the middle value. If you have an even number of values, the median is the arithmetic mean (see above) of the two middle values. Example: The median of the same five numbers (7, 12, 24, 20, 19) is 19.

mode: The most frequently occurring value (or values). Calculate the frequencies for all of the values in the data. The mode is the value (or values) with the highest frequency.
10. 10. Example: For individuals having the following ages: 18, 18, 19, 20, 20, 20, 21, and 23, the mode is 20.

Check your understanding of these concepts by calculating the mean, median, and mode of the following three sets of numbers.

Which Measure Should You Use?

This histogram shows the distribution of the number of siblings for survey respondents. The mode (i.e., most common number of siblings) is easy to find. Can you also determine the median simply by inspection? What about the mean?

You should see two copies of the histogram. The upper histogram allows you to drag the red vertical line to help locate the median. Numbers on either side of the red line show you how many values exist above and below the line.

The lower histogram allows you to move a triangle within the range of the distribution which acts like a fulcrum for a see-saw. The mean is located at the point where the histogram is balanced. Use these tools, the red vertical line and the fulcrum, to find the median and mean of the data.

Now write down which of these three measures of central tendency (mean, median, or mode) you think best describes the "typical" number of siblings of the respondents. Explain why you chose the one you did.

You can use the histogram activity to explore other variables from the 1993 General Social Survey. The available variables appear under the "Dataset" menu in the histogram window. Look at several of the variables, and use the tools to find the mean and median for each one.

Notice that not all measures of central tendency are appropriate for all kinds of variables. For example:

• For nominal data (such as sex or race), the mode is the only valid measure.
• For ordinal data (such as salary categories), only the mode and median can be used.

Now explain in your own words how the three measures of central tendency differ from one another. In the space below, briefly answer the following three questions:

1. Why is the mean not appropriate for some types of data?
2. When do you want to use the median rather than the mean?
3. When would the mode be most appropriate?
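The three worked examples from this answer can be checked with Python's standard `statistics` module. This is a small sketch rather than part of the original tutorial; the variable names are illustrative, and the data are the numbers quoted in the examples.

```python
from statistics import mean, median, mode

values = [7, 12, 24, 20, 19]             # data from the mean/median examples
ages = [18, 18, 19, 20, 20, 20, 21, 23]  # data from the mode example

values_mean = mean(values)      # sum of the values divided by their count
values_median = median(values)  # middle value after sorting
ages_mode = mode(ages)          # most frequently occurring value

print(values_mean, values_median, ages_mode)
```

The mean uses every value and so moves with extreme values, the median depends only on the ordering, and the mode depends only on the frequencies, which is why the mode is the only one of the three that also works for nominal data.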
11. 11. 5. Define population and sampling unit for selecting a random sample in each of the following cases. a) Hundred voters from a constituency b) Twenty stocks of National Stock Exchange c) Fifty account holders of State Bank of India d) Twenty employees of Tata Motors.

Ans. Statistical surveys or enquiries deal with studying various characteristics of units belonging to a group. The group consisting of all the units is called the universe or population. A sample is a finite subset of a population.

A sample is drawn from a population to estimate the characteristics of the population. Sampling is a tool which enables us to draw conclusions about the characteristics of the population.

Sampling is of two types, discrete and continuous. In discrete sampling the data are finite, which makes calculations easy. In continuous sampling the data are of infinite form, and intervals are indicated with < and > (greater than one value but less than another).

The number of items in a sample is its size. In practice, samples of size greater than 30 are called large samples, and smaller ones are called small samples.

A measure associated with the entire population is called a population parameter, or just a parameter.

Given a population, suppose we consider all possible samples of a certain size N that can be drawn from it. For each sample, suppose we compute a statistic such as the mean or standard deviation. These statistics vary from sample to sample. Grouping the different values of the statistic according to their frequencies gives a frequency distribution, which is called the sampling distribution. The standard deviation of a sampling distribution is called its standard error.
12. 12. Suppose we draw all possible samples of a certain size N from a population and find the mean $\bar{X}$ of each of these samples. The frequency distribution of these means is called the sampling distribution of means.

If the population is infinite, with $\sigma$ and $\mu$ its standard deviation and mean respectively, then the standard deviation of the sampling distribution of means, denoted by $\sigma_{\bar{X}}$, is given by $\sigma_{\bar{X}} = \sigma / \sqrt{N}$.

This is used to calculate the standard normal variate when the sample size is more than 30.
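As a quick numerical illustration of the standard error of the mean, the sketch below divides a population standard deviation by the square root of the sample size. The figures (a standard deviation of 12 and a sample size of 36) and the function name `standard_error` are made up for the example and do not come from the assignment.

```python
from math import sqrt


def standard_error(sigma, n):
    """Standard deviation of the sampling distribution of means: sigma / sqrt(n)."""
    if n <= 0:
        raise ValueError("sample size must be positive")
    return sigma / sqrt(n)


# Hypothetical figures: population standard deviation 12, sample size 36.
se = standard_error(sigma=12, n=36)
print(se)
```

Because the sample size enters through a square root, quadrupling the sample size only halves the standard error.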
13. 13. 6. What is a confidence interval and why is it useful? What is a confidence level?

Ans. Under a given hypothesis H, suppose the sampling distribution of a statistic S is a normal distribution with mean $\mu_S$ and standard deviation $\sigma_S$. Then $z = (S - \mu_S)/\sigma_S$ is the standard normal variate associated with S, so that for the distribution of z the mean is zero and the standard deviation is 1.

Accordingly, for z the Z% confidence interval is $(-z_c, z_c)$. This means that we can be Z% confident that, if the hypothesis H is true, then the value of z lies between $-z_c$ and $z_c$. This is equivalent to saying that when H is true there is a (100 - Z)% chance that the value of z lies outside the interval $(-z_c, z_c)$. If we reject a true hypothesis H on the grounds that the value of z lies outside the interval $(-z_c, z_c)$, we would be making a Type I error, and the probability of making this error is (100 - Z)%, the level of significance.

A confidence interval is useful because it gives a range of values for an estimate together with a stated degree of confidence, so that a decision based on it is unlikely to lead us the wrong way even when the estimate is not exact. The confidence level Z is the percentage attached to the interval $(-z_c, z_c)$: the higher the level, the wider the interval. Rejecting a hypothesis that is in fact true is called a Type I error, and failing to reject a hypothesis that is in fact false is called a Type II error. In the figure above, the shaded portion is the critical region.
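The critical value z_c for a chosen confidence level can be computed from the standard normal distribution. The sketch below uses `statistics.NormalDist` from the Python standard library (Python 3.8 or later); the helper name `critical_z` and the 95% level are chosen here for illustration.

```python
from statistics import NormalDist


def critical_z(confidence_percent):
    """Two-sided critical value z_c such that P(-z_c < z < z_c) equals the level."""
    if not 0 < confidence_percent < 100:
        raise ValueError("confidence level must be between 0 and 100")
    # Split the remaining (100 - Z)% equally between the two tails.
    tail = (1 - confidence_percent / 100) / 2
    return NormalDist().inv_cdf(1 - tail)


z_c = critical_z(95)
# Probability that a standard normal value falls inside (-z_c, z_c).
coverage = NormalDist().cdf(z_c) - NormalDist().cdf(-z_c)
print(round(z_c, 2), round(coverage, 2))
```

For a 95% level this gives a critical value of about 1.96, which leaves a 5% level of significance split across the two tails.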
14. 14. Assignment Set- 2

1. (a) What are the characteristics of a good measure of central tendency? (b) What are the uses of averages?

Ans. a) The characteristics of a good measure of central tendency are:

• Present mass data in a concise form: The mass data is condensed to make the data readable and to use it for further analysis.

• Facilitate comparison: It is difficult to compare two different sets of mass data, but we can compare them after computing the averages of the individual data sets. While comparing, the same measure of average should be used. It leads to incorrect conclusions when the mean salary of employees is compared with the median salary of the employees.

• Establish relationship between data sets: The average can be used to draw inferences about the unknown relationships between the data sets. Computing the averages of the data sets is helpful for estimating the average of the population.

• Provide basis for decision-making: In many fields, such as business, finance, insurance and other sectors, managers compute the averages and draw useful inferences or conclusions for taking effective decisions.

The following are the requisites of a measure of central tendency:

• It should be simple to calculate and easy to understand
• It should be based on all values
• It should not be affected by extreme values
• It should not be affected by sampling fluctuation
• It should be rigidly defined
• It should be capable of further algebraic treatment
15. 15. b) Appropriate Situations for the use of Various Averages

1. Arithmetic mean is used when:
a. In-depth study of the variable is needed
b. The variable is continuous and additive in nature
c. The data are on the interval or ratio scale
d. The distribution is symmetrical

2. Median is used when:
a. The variable is discrete
b. There exist abnormal values
c. The distribution is skewed
d. The extreme values are missing
e. The characteristics studied are qualitative
f. The data are on the ordinal scale

3. Mode is used when:
a. The variable is discrete
b. There exist abnormal values
c. The distribution is skewed
d. The extreme values are missing
e. The characteristics studied are qualitative

4. Geometric mean is used when:
a. The rate of growth, ratios and percentages are to be studied
b. The variable is of multiplicative nature

5. Harmonic mean is used when:
a. The study is related to speed or time
b. An average of rates which produce equal effects has to be found.
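The last two cases in the list above can be illustrated with the standard library. The figures are invented for the example: two yearly growth factors of 1.10 and 1.20 for the geometric mean, and two equal-distance trips driven at 40 and 60 km/h for the harmonic mean. `statistics.geometric_mean` needs Python 3.8 or later.

```python
from statistics import geometric_mean, harmonic_mean, mean

# Hypothetical growth factors: +10% in year one, +20% in year two.
growth_factors = [1.10, 1.20]
avg_growth = geometric_mean(growth_factors)  # average factor per year

# Hypothetical speeds over two trips of equal distance.
speeds = [40, 60]
avg_speed = harmonic_mean(speeds)  # true average speed over the whole route
naive_speed = mean(speeds)         # arithmetic mean overstates it

print(round(avg_growth, 4), avg_speed, naive_speed)
```

Applying the geometric-mean factor twice reproduces the total growth of 1.10 × 1.20 = 1.32, and the harmonic mean of 48 km/h matches total distance divided by total time, whereas the arithmetic mean of 50 km/h does not.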
17. 17. 8. Decide to either fail to reject the null hypothesis or reject it in favor of the alternative. The decision rule is to reject the null hypothesis if the observed value is in the critical region, and to accept or "fail to reject" the hypothesis otherwise.

It is important to note the philosophical difference between accepting the null hypothesis and simply failing to reject it. The "fail to reject" terminology highlights the fact that the null hypothesis is assumed to be true from the start of the test; if there is a lack of evidence against it, it simply continues to be assumed true. The phrase "accept the null hypothesis" may suggest it has been proved simply because it has not been disproved, a logical fallacy known as the argument from ignorance. Unless a test with particularly high power is used, the idea of "accepting" the null hypothesis may be dangerous. Nonetheless the terminology is prevalent throughout statistics, where its meaning is well understood.

Alternatively, if the testing procedure forces us to reject the null hypothesis (H-null), we can accept the alternative hypothesis (H-alt) and we conclude that the research hypothesis is supported by the data. This expresses the fact that our procedure is based on probabilistic considerations, in the sense that we accept that using another data set could lead us to a different conclusion.
18. 18. 3. The upper and the lower quartile incomes of a group of workers are Rs 8 and Rs 3 per day respectively. Calculate the quartile deviation and its coefficient.

Ans. Quartile Deviation: It is based on the lower quartile $Q_1$ and the upper quartile $Q_3$. The difference $Q_3 - Q_1$ is called the inter-quartile range. The difference $Q_3 - Q_1$ divided by 2 is called the semi-inter-quartile range or the quartile deviation. Thus

$$\text{Q.D.} = \frac{Q_3 - Q_1}{2}$$

In this question $Q_1 = 3$ and $Q_3 = 8$, so

$$\text{Q.D.} = \frac{8 - 3}{2} = 2.5$$

Here the quartile deviation is Rs 2.5 per day.

Coefficient of Quartile Deviation: A relative measure of dispersion based on the quartile deviation is called the coefficient of quartile deviation. It is defined as

$$\text{Coefficient of Q.D.} = \frac{Q_3 - Q_1}{Q_3 + Q_1} = \frac{8 - 3}{8 + 3} = \frac{5}{11} \approx 0.455$$

The coefficient of quartile deviation is 0.455. Being a ratio, it is a pure number and carries no units.
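The two formulas can be checked with a few lines of Python. This is a minimal sketch using the quartiles given in the question; the function names are illustrative.

```python
def quartile_deviation(q1, q3):
    """Semi-inter-quartile range: (Q3 - Q1) / 2, in the units of the data."""
    return (q3 - q1) / 2


def coefficient_of_quartile_deviation(q1, q3):
    """Relative dispersion: (Q3 - Q1) / (Q3 + Q1), a unit-free ratio."""
    return (q3 - q1) / (q3 + q1)


q1, q3 = 3, 8  # lower and upper quartile incomes in Rs per day
qd = quartile_deviation(q1, q3)
coeff = coefficient_of_quartile_deviation(q1, q3)
print(qd, round(coeff, 3))
```

The quartile deviation is expressed in rupees per day, while the coefficient is a plain ratio that can be compared across data sets measured in different units.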
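19. 19. 4. The cost of living index number on a certain date was 200. From the base period, the percentage increases in prices were: Rent 60, Clothing 250, Fuel and Light 150 and Miscellaneous 120. The weights for different groups were Food 60, Rent 16, Clothing 12, Fuel and Light 8 and Miscellaneous 4.

Ans. Arranging the data in tabular form for easy representation, with P the percentage increase in price and W the weight. The increase for Food is not given, so it is written as x.

| Item | P | W | WP |
|---|---|---|---|
| Food | x | 60 | 60x |
| Rent | 60 | 16 | 960 |
| Clothing | 250 | 12 | 3000 |
| Fuel & Light | 150 | 8 | 1200 |
| Miscellaneous | 120 | 4 | 480 |
| Total | | ∑W = 100 | ∑WP = 60x + 5640 |

An index of 200 against a base of 100 means an overall weighted increase of 100%, so

$$\frac{\sum WP}{\sum W} = \frac{60x + 5640}{100} = 100 \quad\Rightarrow\quad 60x = 4360 \quad\Rightarrow\quad x \approx 72.67$$

Hence the index of 200 is consistent with a rise of about 72.67% in the price of Food. The four listed groups alone contribute 5640 / 100 = 56.4 points of the 100-point increase.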
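The weighted-average relationship behind a cost of living index can be sketched in Python. The question as reproduced here does not say what is to be found, so this sketch assumes the task is to recover the percentage increase for Food that is consistent with an overall index of 200. The function name `missing_group_increase` is hypothetical.

```python
def missing_group_increase(index, weights, increases, missing):
    """Solve sum(W * P) / sum(W) = index - 100 for the one group whose P is unknown.

    weights:   weight of every group, including the missing one
    increases: percentage increase of every group except the missing one
    """
    overall_increase = index - 100  # base period index is 100
    total_weight = sum(weights.values())
    known = sum(weights[g] * increases[g] for g in increases)
    return (overall_increase * total_weight - known) / weights[missing]


weights = {"Food": 60, "Rent": 16, "Clothing": 12,
           "Fuel and Light": 8, "Miscellaneous": 4}
increases = {"Rent": 60, "Clothing": 250,
             "Fuel and Light": 150, "Miscellaneous": 120}

food_increase = missing_group_increase(200, weights, increases, "Food")
print(round(food_increase, 2))
```

Under this assumption the four known groups contribute a weighted sum of 5640, and Food must supply the remaining 4360, which corresponds to an increase of roughly 72.67%.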
20. 20. 5. Education seems to be a difficult field in which to use quality techniques. One possible outcome measure for colleges is the graduation rate (the percentage of the students matriculating who graduate on time). Would you recommend using P or R charts to examine graduation rates at a school? Would this be a good measure of quality?

Ans. In statistical quality control, the p-chart is a type of control chart used to monitor the proportion of nonconforming units in a sample, where the sample proportion nonconforming is defined as the ratio of the number of nonconforming units to the sample size, n. The p-chart only accommodates "pass"/"fail"-type inspection as determined by one or more go/no-go gauges or tests, effectively applying the specifications to the data before they are plotted on the chart. Other types of control charts display the magnitude of the quality characteristic under study, making troubleshooting possible directly from those charts.

Some practitioners have pointed out that the p-chart is sensitive to the underlying assumptions, using control limits derived from the binomial distribution rather than from the observed sample variance. Due to this sensitivity to the underlying assumptions, p-charts are often implemented incorrectly, with control limits that are either too wide or too narrow, leading to incorrect decisions regarding process stability. A p-chart is a form of the Individuals chart (also referred to as "XmR" or "ImR"), and these practitioners recommend the individuals chart as a more robust alternative for count-based data.

R Chart: Range charts are used when you can rationally collect measurements in groups (subgroups) of between two and ten observations. Each subgroup represents a "snapshot" of the process at a given point in time. The charts' x-axes are time based, so that the charts show a history of the process. For this reason, you must have data that is time-ordered; that is, entered in the sequence in which it was generated. If this is not the case, then trends or shifts in the process may not be detected, but instead attributed to random (common cause) variation.

For subgroup sizes greater than ten, use X-bar / Sigma charts, since the range statistic is a poor estimator of process sigma for large subgroups. In fact, the subgroup sigma is ALWAYS a better estimate of subgroup variation than the subgroup range. The popularity of the Range chart is only due to its ease of calculation, dating to its use before the advent of computers. For subgroup sizes equal to one, an Individual-X / Moving Range chart can be used, as well as EWMA or CuSum charts. X-bar charts are efficient at detecting relatively large shifts in the process average, typically shifts of ±1.5 sigma or larger. The larger the subgroup, the more sensitive the chart will be to shifts, provided a rational subgroup can be formed.

Hence a p-chart, not an R chart, is the one to use for graduation rates: each student either graduates on time or does not, so the graduation rate is a proportion of pass/fail outcomes, whereas an R chart needs measured values collected in subgroups.
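To make the p-chart concrete, the sketch below computes the centre line and three-sigma control limits from the binomial formula p̄ ± 3·sqrt(p̄(1 − p̄)/n) for a series of yearly cohorts. The cohort sizes and graduate counts are invented for illustration, and the function name `p_chart_limits` is hypothetical.

```python
from math import sqrt


def p_chart_limits(successes, sample_sizes):
    """Centre line and per-sample 3-sigma limits for a p-chart."""
    p_bar = sum(successes) / sum(sample_sizes)  # overall proportion
    limits = []
    for n in sample_sizes:
        half_width = 3 * sqrt(p_bar * (1 - p_bar) / n)
        # A proportion cannot fall below 0 or rise above 1.
        limits.append((max(0.0, p_bar - half_width),
                       min(1.0, p_bar + half_width)))
    return p_bar, limits


# Hypothetical data: students matriculating and graduating on time, per year.
cohort_sizes = [200, 210, 190, 205, 195]
graduates = [150, 160, 140, 158, 142]

p_bar, limits = p_chart_limits(graduates, cohort_sizes)
rates = [g / n for g, n in zip(graduates, cohort_sizes)]
in_control = [low <= r <= high for r, (low, high) in zip(rates, limits)]
print(p_bar, in_control)
```

With these made-up figures every yearly graduation rate falls inside its limits, so the chart would show only common-cause variation. The limits widen for smaller cohorts because the sample size appears in the denominator.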
21. 21. 6. (a) Why do we use a chi-square test? (b) Why do we use analysis of variance?

Ans. Chi-square test

The Chi-Square test is a non-parametric test. It is used to test the independence of attributes, goodness of fit and specified variance. The Chi-Square test does not require any assumptions regarding the shape of the population distribution from which the sample was drawn. It assumes that samples are drawn at random and that external forces, if any, act on them in equal magnitude. The Chi-Square distribution is a family of distributions: for every degree of freedom, there is one chi-square distribution. An important criterion for applying the Chi-Square test is that the sample size should be very large. None of the theoretical expected values calculated should be less than five. The important applications of the Chi-Square test are the test for independence of attributes, the test of goodness of fit and the test for specified variance.

The chi-square (χ²) test measures the alignment between two sets of frequency measures. These must be categorical counts and not percentages or ratio measures (for these, use another correlation test). Note that the frequency numbers should be significant and be at least above 5 (although an occasional lower figure may be possible, as long as it is not part of a pattern of low figures).

Goodness of fit: A common use is to assess whether a measured/observed set of measures follows an expected pattern. The expected frequency may be determined from prior knowledge (such as a previous year's exam results) or by calculation of an average from the given data. The null hypothesis, H0, is that the two sets of measures are not significantly different.

Independence: The chi-square test can be used in the reverse manner to goodness of fit. If the two sets of measures are compared, then just as you can show they align, you can also determine if they do not align. The null hypothesis here is that the two sets of measures are similar.

The main difference in goodness-of-fit vs. independence assessments is in the use of the Chi-Square table. For goodness of fit, attention is on the 0.05, 0.01 or 0.001 figures. For independence, it is on the 0.95 or 0.99 figures (this is why the table has two ends to it).
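A goodness-of-fit check can be sketched in plain Python by computing the statistic χ² = Σ (O − E)² / E and comparing it with a tabulated critical value. The observed and expected counts below are invented for illustration, and the function name is hypothetical. The critical value 5.991 is the standard table entry for 2 degrees of freedom at the 0.05 level.

```python
def chi_square_statistic(observed, expected):
    """Sum of (O - E)^2 / E over all categories."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    if any(e < 5 for e in expected):
        raise ValueError("every expected frequency should be at least 5")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical counts in three categories, 60 observations in total.
observed = [18, 22, 20]
expected = [20, 20, 20]

chi2 = chi_square_statistic(observed, expected)
degrees_of_freedom = len(observed) - 1
CRITICAL_VALUE_DF2_5PCT = 5.991  # chi-square table, 2 d.f., 0.05 level
reject_h0 = chi2 > CRITICAL_VALUE_DF2_5PCT
print(round(chi2, 3), degrees_of_freedom, reject_h0)
```

With these figures the statistic is far below the critical value, so the null hypothesis that the observed counts follow the expected pattern is not rejected. The check on expected frequencies mirrors the rule that none of them should be less than five.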