Basic concepts of statistics


  1. 1. BASIC CONCEPTS OF STATISTICS by : DR. T.K. JAIN AFTERSCHO ☺ OL centre for social entrepreneurship sivakamu veterinary hospital road bikaner 334001 rajasthan, india FOR – PGPSE / CSE PARTICIPANTS [email_address] mobile : 91+9414430763
  2. 2. My words..... My purpose here is to give a few questions on fundamentals of statistics. I welcome your suggestions. I also request you to help me in spreading social entrepreneurship across the globe – for which I need support of you people – not of any VIP. With your help, I can spread the ideas – for which we stand....
  3. 3. What were the root words of statistics? Latin = status, German = Statistik, Italian = statista, French = statistique
  4. 4. Who carried out the first census in the world? Pharaoh (over 1000 years before Christ)
  5. 5. What are the subjects where statistics has application? Every subject, including the following: business management, economics, commerce, industry etc.
  6. 6. What are 2 major sources of data? 1. primary data : collected for the first time during the research 2. secondary data : which are already available published data – they were collected for some other purpose, but they can be used for the present research
  7. 7. From where can we get secondary data? 1. industry reports 2. previous research 3. published data 4. annual reports 5. statistical departments 6. directories / reports / databases
  8. 8. From where can we get primary data ? 1. interview 2. survey 3. schedule / questionnaire 4. observation 5. experimentation
  9. 9. What do we do after collection of data? 1. scrutinise the data = remove data which are defective 2. arrange them 3. fix a classification of the data and tabulate them 4. prepare graphs / tables / charts from the data 5. analyse the data
  10. 10. Why do we classify data? After classification, data can easily be analysed and interpreted.
  11. 11. How can we classify data? 1. chronologically (date wise / year wise) 2. geographically (north v/s south zone) 3. qualitatively (by attribute or order, like first, second, third) 4. quantitatively (by numerical values, using tools for quantitative analysis)
  12. 12. Name a few international bodies that publish data (which we can use as secondary source of data)? IBRD, IMF, ADB, ILO, UNO, WTO, WHO etc.
  13. 13. What is the difference between primary and secondary data? Primary data are first-hand and original in nature, whereas secondary data are a compilation of existing or already published data. Collecting primary data takes considerable money, time and effort, whereas secondary data are relatively less costly. Primary data are collected with the present purpose in mind, so they are usually more suitable than secondary data.
  14. 14. What is the difference between census and sample survey? Under the census or complete enumeration method, data are collected for each and every unit of the population (or universe), which is the complete set of items of interest in a particular situation. In a sample survey, we pick only a few items and collect data from them, so reliability is comparatively lower.
  15. 15. What are the steps in presentation of data ? Classification of data (put data in classes) Tabulation of data (prepare table from data) Frequency distribution of data (identify frequencies) Diagrammatic presentations of data (prepare diagrams) Graphic representation of data (prepare graphs).
  16. 16. What do you understand from tabulation ? Tabulation is a systematic and logical arrangement of data in columns and rows in accordance with some salient features and characteristics.
  17. 17. What are the parts of a table? 1. Table Number 2. Title of the Table 3. Sub-title or Head Note 4. Captions and stub 5. Body 6. Footnotes 7. Source Note
  18. 18. What is a class limit? The lowest and highest values that can be included in a class interval are known as the class limits of that class. For example, in the class 40-50, 40 is the lower class limit and 50 is the upper class limit.
  19. 19. What is class interval? It is the difference between the upper limit and the lower limit of the same class. The lower limit of a class is usually represented by the symbol L1 and the upper limit by L2.
  20. 20. What is class frequency? The number of observations included in a particular class is known as the frequency of that class.
  21. 21. What is the inclusive method of classification? It refers to that classification where both the class limits are included in the class itself while determining the class intervals.
  22. 22. What are the 3 methods of data presentation? 1. textual presentation = present data in the form of text – write reports etc. 2. graphical presentation = prepare graphs, pie charts, bar charts, histograms etc. 3. tabular presentation = prepare tables of data for better analysis
  23. 23. Population? The set of all elements that are of interest to the researcher
  24. 24. Statistical inference: the process of using data obtained from a sample to make estimates or test hypotheses about the characteristics of the population
  25. 25. Qualitative data? Data that are labels or names used to identify categories of items
  26. 26. Quantitative data? Data that indicate how much or how many
  27. 27. Frequency distribution? A tabular summary of data showing the number (frequency) of items in each of several nonoverlapping classes
  28. 28. What is median? Measure of central location: the middle value when data are arranged in ascending order
  29. 29. What is mode ? Value with greatest frequency
  30. 30. What is a percentile? The p-th percentile is the value at or below which about p% of the observations lie. 50th percentile = median = 2nd quartile
  31. 31. Quartile? Quartiles divide the ordered data into four parts of 25% each, so we have 3 quartiles: 1st quartile = 25th percentile, 2nd quartile = 50th percentile (median), 3rd quartile = 75th percentile
  32. 32. Range? Measure of variability = largest value - smallest value
  33. 33. Interquartile range = 3rd quartile - 1st quartile
  34. 34. Variance? Average of the squared deviations of the data from the mean
  35. 35. Standard deviation: positive square root of variance
  36. 36. Coefficient of variation? (Standard deviation / mean) * 100
  37. 37. Z score = (Xi – mean) / standard deviation. It is a standardised value showing how many standard deviations Xi lies from the mean. For a normal distribution: within ± 1 standard deviation = 68.27% of data, within ± 2 standard deviations = 95.45%, within ± 3 standard deviations = 99.73%
  38. 38. Empirical rule? In a bell shaped (normal) distribution, approximately 68%, 95% and 99.7% of the data lie within 1, 2 and 3 standard deviations of the mean respectively
  39. 39. Outlier ? Unusually small or unusually large data
  40. 40. Box plot Graphical presentation of data
  41. 41. Covariance? Measure of linear association between two variables; it can be positive or negative
  42. 42. Correlation coefficient? Shows the strength of the linear relation between two variables; it ranges from -1 to +1
  43. 43. Weighted mean? Sum of (data value * weight) / sum of weights
  44. 44. Grouped data: data grouped in class intervals and summarised by a frequency distribution; individual values are not available
  45. 45. Probability Likelihood that an event will occur
  46. 46. Experiment A process that generates a well defined outcome
  47. 47. Sample space Set of all experimental outcomes
  48. 48. Sample point Any experimental outcome
  49. 49. Tree diagram A graphical representation helpful in identifying the sample points of an experiment involving multiple steps
  50. 50. What is permutation & combination? Permutation = it denotes order / Sequence but combination = it only denotes that some objects are together example : ABC can have only one combination taking all of them together. But permutations are many : - ABC,ACB,BCA,BAC,CBA,CAB
  51. 51. What is relative frequency method? Method of assigning probability on the basis of historical data
  52. 52. Subjective method of probability Method of assigning probability on the basis of judgement
  53. 53. Event A collection of sample points
  54. 54. Venn diagram Graphical representation showing sample space and operations involving events sample space = rectangle event = circle within sample space
  55. 55. What is the formula of permutation? nPr = n! / (n-r)! where P = permutation, n = total number of objects, r = how many objects you are taking at a time, ! = factorial (multiply by reducing numbers till it reaches 1). Example: 5P5 = 5! / (5-5)! where 5! = 5*4*3*2*1 = 120 and 0! = 1, thus answer = 120
  56. 56. How many different 4-letter arrangements can you make out of A,B,C,D,E? n = 5 (A,B,C,D,E), r = 4. Formula: nPr = n! / (n-r)! = 5! / (5-4)! = 120 answer
  57. 57. How many different 4 digit numbers can you make out of 1,2,3,4,0? n = 5 (1,2,3,4,0), r = 4, but 0 cannot come in the first digit. For the first digit we have 4 options (1,2,3,4); for the next digits we can use 0. Thus we have 4*4*3*2 = 96 options. OR use the formula nPr = n! / (n-r)! = 5! / (5-4)!, but this includes all those numbers which start with 0. So let us keep 0 fixed as the 1st digit and count those: now we have to pick 3 digits out of the remaining 4. contd.
  58. 58. contd..... Ignoring the restriction on 0, the permutations are: nPr = n! / (n-r)! = 5! / (5-4)! = 120. With zero fixed in the 1st position, we have these options: nPr with n = 4, r = 3: 4! / (4-3)! = 24. Deduct this 24 from 120: 120 - 24 = 96 answer. You can use either method (out of these 2); you get the same answer.
  59. 59. How many different 4 digit numbers can you make out of 1,2,3,4,0 which are divisible by 2? Start with the 96 of the last question and remove the odd numbers. Those ending with 1: 3 options for the 1st digit (not 0, not 1), then 3 and 2 options for the middle digits: 3*3*2 = 18. Similarly those ending with 3: 3*3*2 = 18. Thus 96 – (18+18) = 60 answer
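
The two digit-counting answers above (96 four digit numbers, 60 of them even) can be cross-checked by listing every case. This is only a brute-force sketch; the function name `four_digit_numbers` is made up for illustration and is not part of the slides.

```python
from itertools import permutations

def four_digit_numbers(digits="12340"):
    """List all 4 digit numbers built from distinct digits, with no leading zero."""
    numbers = []
    for combo in permutations(digits, 4):
        if combo[0] != "0":  # a leading 0 would give a 3 digit number
            numbers.append(int("".join(combo)))
    return numbers

all_numbers = four_digit_numbers()
even_numbers = [n for n in all_numbers if n % 2 == 0]

print(len(all_numbers))   # 96
print(len(even_numbers))  # 60
```

Enumeration is practical here only because there are just 120 ordered selections; for larger problems the counting formulas are the better tool.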
  60. 60. In how many ways can Raj invite any 3 of his 7 friends? This is a question of combination. Here order (sequence) is not important, his friends can come in any order. Thus this is a case of combination. Formula : N! / ((n-r)!*r!) you can calculate combination by dividing permutation by r! =7! / ((7-3)!*3!) =(7*6*5)/(3*2*1) = 35 answer
  61. 61. How many different words can you frame from FUTURE? Here we have two U; in total we have 6 letters. Formula: N! / L! where N = total number of letters, L = number of times a letter is repeated. Answer = 6! / 2! = 360 answer
  62. 62. How many different words can you frame from DALDA? Here we have two D and two A; in total we have 5 letters. Formula: N! / (L1! * L2!) where N = total number of letters and L1, L2 = number of times each repeated letter occurs. Answer = 5! / (2!*2!) = 30 answer
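
The repeated-letter formula used for FUTURE and DALDA can be written as a small helper. This is a minimal sketch; `distinct_arrangements` is a hypothetical name chosen here.

```python
from collections import Counter
from math import factorial

def distinct_arrangements(word):
    """Return N! divided by the factorial of each letter's repeat count."""
    total = factorial(len(word))
    for count in Counter(word).values():
        total //= factorial(count)  # letters that occur once divide by 1! = 1
    return total

print(distinct_arrangements("FUTURE"))  # 360
print(distinct_arrangements("DALDA"))   # 30
```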
  63. 63. In how many ways can 8 person sit around a round table ? For questions relating to round table , we have to use the following formula : (n-1)! So here answer = (8-1)! = 7! =5040 answer
  64. 64. How many 4 digit numbers can be formed out of 1,2,3,5,7,8,9 if no digit is repeated? Total number of digits = 7. Formula = nPr with n = 7, r = 4: 7P4 = 7! / 3! = 7*6*5*4 = 840
  65. 65. How many numbers greater than 2000 can be formed from 1,2,3,4,5? No repetition is allowed. 5 digit numbers = 5! = 120. 4 digit numbers: we can't take 1 in the beginning, so we have 4 options for the 1st digit, 4 for the 2nd digit, 3 for the 3rd digit and 2 for the 4th digit: 4*4*3*2 = 96. Total = 120 + 96 = 216 answer
  66. 66. There are 6 books on English, 3 on maths, 2 on GK. In how many ways can they be placed on a shelf, if books of 1 subject are together? We have 3 subjects, so the subject blocks can be ordered in 3! ways; books of the same subject can be interchanged among themselves. So answer: 3!*6!*3!*2! = 6*720*6*2 = 51840 answer
  67. 67. How many words can we make out of DRAUGHT if the vowels are never separated? Number of vowels = 2 (A, U), other letters = 5. We treat the vowels as 1 unit, so we have 6 units and 6! arrangements. The vowels can be interchanged, so 2!. So answer = 6!*2! = 1440 answer
  68. 68. In how many ways can 8 pearls be used to form a necklace? In questions of necklace, we use the following formula: ½ (N-1)!. A necklace can be flipped over, so the left-to-right and right-to-left orders are the same; that is why we multiply by ½. = ½ (8-1)! = 2520
  69. 69. In how many number of ways can 7 boys form a ring ? (7-1) ! = 6! = 720 answer
  70. 70. 50 different jewels can be set to form necklace in how many ways ? ½ ( n -1) ! = ½ (50 -1)! =1/2 (49)!
  71. 71. How many different numbers between 10 and 1000 can be formed from 0,2,3,4,8,9? First assume that repetition is not allowed. 2 digit numbers: for the first digit we have 5 options (not 0), for the 2nd digit also 5 options (including 0) = 25. 3 digit numbers: 5*5*4 = 100. Total = 125. If repetition is allowed: 2 digit: 5*6 = 30, 3 digit: 5*6*6 = 180, total = 210 answer
  72. 72. What is the number of permutations of 10 different things taking 4 at a time in which one thing never comes ? = 9 p 4 = (9*8*7*6) =3024
  73. 73. There are 5 speakers (A,B,C,D,E). In how many ways can we arrange their speeches so that A always speaks before B? A immediately before B (without gap): take A and B as one, 4! = 24. A before B with a gap: for example, keep A at the 1st place and B at the 3rd place; the other three can be arranged in 3! = 6 ways. There are 6 such pairs of places in total, so we have 6*6 = 36. Total possibilities = 24 + 36 = 60 answer (this is half of 5! = 120, as expected)
  74. 74. In how many ways can 5 persons sit at a round table so that the tallest person always sits next to the shortest person? Treat the tallest and the shortest person as 1 unit. We have 4 units, so (4-1)! = 6. The tallest and the shortest person can be interchanged = 2. Answer = 6*2 = 12
  75. 75. How many words can be formed from MOBILE so that the consonants always occupy the odd places? There are 3 odd and 3 even places, 3 consonants (M,B,L) and 3 vowels (O,I,E). We have 3! * 3! = 36 answer
  76. 76. In how many ways can we arrange 6 + and 4 – signs so that no two – signs are together? Write the + signs first: + + + + + +. There are 5 places between the + signs, one on the extreme left and one on the extreme right, so we have 7 positions for the – signs: 7C4 = 35. The 6 + signs fill their 6 places in only 6C6 = 1 way. Total = 35 answer
  77. 77. There are 10 buses between Bikaner and Jaipur. In how many ways can Gajendra go to Jaipur and come back without using the same bus in return journey? There are 10 options while going there are 9 options while returning (one bus used earlier will not be used) 10*9 = 90 answer
  78. 78. In how many ways can yamini distribute 8 sweets to 8 persons provided the largest sweet is served to Jigyasha? 1 sweet is fixed so we have 7! = 5040 answer
  79. 79. Yamini & Jigyasha go to a train and they find 6 vacant seats. In how many ways can they sit? Yamini has 6 options but Jigyasha has only 5 options left = 6*5 = 30 answer
  80. 80. How many words can you make from DOGMATIC? All 8 letters are different, so 8! = 40320 answer
  81. 81. Gajendra has 12 friends out of whom 8 are relatives. In how many ways can he invite 7 in such a way that 5 are relatives? 8C5 * 4C2 = 56*6 = 336 answer
  82. 82. There are 8 points on a plane. No 3 points are on a straight line. How many triangles can be made out of these? 8C3 = 56 answer
  83. 83. In how many ways can you form a committee of 3 persons out of 12 persons ? 12c3 =220 answer
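  84. 84. How many different factors are possible from 75600? The prime factorisation is: 2^4 * 3^3 * 5^2 * 7. Formula = (exponent + 1)(exponent + 1).... for all factors, and subtract 1 to exclude the factor 1. (4+1)(3+1)(2+1)(1+1) = 120 factors including 1 and 75600 itself; 120 - 1 = 119 answer (excluding 1)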
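
The factor count for 75600 can be reproduced by trial division. The sketch below finds the prime exponents and multiplies (exponent + 1) over all primes; both function names are hypothetical and the method is only suitable for small numbers.

```python
def prime_exponents(n):
    """Return {prime: exponent} for n using simple trial division."""
    exponents = {}
    p = 2
    while p * p <= n:
        while n % p == 0:
            exponents[p] = exponents.get(p, 0) + 1
            n //= p
        p += 1
    if n > 1:  # whatever remains is a prime factor
        exponents[n] = exponents.get(n, 0) + 1
    return exponents

def count_factors(n):
    """Count all divisors of n, including 1 and n itself."""
    total = 1
    for exponent in prime_exponents(n).values():
        total *= exponent + 1
    return total

print(prime_exponents(75600))    # {2: 4, 3: 3, 5: 2, 7: 1}
print(count_factors(75600))      # 120, including 1 and 75600
print(count_factors(75600) - 1)  # 119, excluding the factor 1
```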
  85. 85. A box contains 7 red, 5 white and 4 blue balls. How many selections can be made if we pick up 3 balls and all are red? It is a question of combination. Total possibilities = 7C3 = (7*6*5) / (3*2*1) = 35, thus there are 35 ways of getting 3 red balls
  86. 86. A box contains 7 red, 5 white and 4 blue balls. What is the probability that we pick up 3 balls and all are red? Ways of picking 3 red balls = 7C3 = (7*6*5) / (3*2*1) = 35. Ways of picking any 3 balls = 16C3 = (16*15*14) / (3*2*1) = 560. Probability = 35/560 = 1/16 of getting red in all the three selections
  87. 87. What is the probability of getting 3 heads when I toss a coin 5 times? This is a case of binomial probability (where there are only 2 outcomes possible, we can use this theory). Here we can use this formula: nCr * (p)^r * (q)^(n-r) with n = 5, p = ½, q = (1-p) = ½, r = 3: 5C3 * (1/2)^3 * (1/2)^2 = 10/32 = 5/16 answer
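
The binomial formula nCr * p^r * q^(n-r) from the coin question can be evaluated exactly with fractions, which avoids arithmetic slips. A minimal sketch, with `binomial_probability` as an illustrative name:

```python
from fractions import Fraction
from math import comb

def binomial_probability(n, r, p):
    """Probability of exactly r successes in n independent trials."""
    p = Fraction(p)
    q = 1 - p
    return comb(n, r) * p**r * q**(n - r)

prob = binomial_probability(5, 3, Fraction(1, 2))
print(prob)         # 5/16
print(float(prob))  # 0.3125
```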
  88. 88. In how many ways can Gajendra invite some or all of his 5 friends to a party hosted by him? (at least 1) Formula for combinations of 1 to all = 2^n – 1 = 2^5 - 1 = 32 - 1 = 31 answer
  89. 89. How many words can be formed by using all the letters of the word DRAUGHT so that a. vowels always come together & b. vowels are never together? a. There are 2 vowels. We treat them as 1 unit. Solution: 6!*2! = 1440 answer. b. Total possibilities = 7! = 5040; number of cases when the vowels are not together = 5040 - 1440 = 3600 answer
  90. 90. In how many ways can a cricket eleven be chosen out of a batch of 15 players? 15C11 = 15! / ((15-11)!*11!) = 15! / (4!*11!) = (15*14*13*12) / (4*3*2*1) = 1365 answer
  91. 91. In how many ways can a committee of 5 members, consisting of 3 men and 2 ladies, be selected from 6 men and 5 ladies? 6C3 * 5C2 = [(6*5*4)/(3*2*1)] * [(5*4)/(2*1)] = 20*10 = 200 answer
  92. 92. How many 4-letter words, with or without meaning, can be formed out of the letters of the word 'LOGARITHMS' if repetition of letters is not allowed? 10P4 = 10*9*8*7 = 5040 answer
  93. 93. In how many ways can the letters of the word 'LEADER' be arranged? We have two E, so divide 6P6 by 2!: 6!/2! = 720/2 = 360 answer
  94. 94. In how many ways can the letters of the word 'MATHEMATICS' be arranged so that the vowels always come together? Let us treat all 4 vowels (A,E,A,I) as 1 unit. Total letters are 11, so we take 11 – 4 + 1 = 8 units; among them M and T each occur twice, giving 8! / (2!*2!) = 10080. The vowels can be arranged among themselves in 4!/2! = 12 ways (A occurs twice). Answer = 8! / (2!*2!) * 4!/2! = 120960 answer
  95. 95. In how many different ways can the letter of the word 'DETAIL' be arranged in such a way that the vowels occupy only the odd positions We have 3 odd and 3 even positions =3! *3! =36 answer
  96. 96. How many 3 digit numbers can be formed from the digits 2,3,5,6,7 and 9 which are divisible by 5 and none of the digits is repeated? Last digit must be 5 now we have 5 options for 1 st and 4 options for 2 nd digit =5*4 = 20 answer
  97. 97. In how many ways can 21 books on English and 19 books on Hindi be placed in a row on a shelf so that no two books on Hindi are together? Arrange the 21 English books first: 21!. They create 22 places for the Hindi books: 22P19. Answer = 22P19 * 21!
  98. 98. Out of 7 consonants and 4 vowels, how many words of 3 consonants and 2 vowels can be formed? Selection of 5 letters = 7C3 * 4C2 = 35*6 = 210. The 5 letters can be arranged in 5! = 120 ways. Total options: 210*120 = 25200 answer
  99. 99. What is effective rate of interest? In compound interest questions, when interest is compounded more than once a year, the effective rate is higher than the nominal (stated) rate. For example: if the rate is 20% compounded quarterly (4 times in a year), 1 grows to (1+20/400)^4 = 1.2155, so the effective interest rate here is 21.55% answer
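
The effective-rate calculation above can be sketched as a one-line function. Rates are passed as decimals (0.20 for 20%); the name `effective_rate` is an assumption made for this example.

```python
def effective_rate(nominal_rate, periods_per_year):
    """Effective annual rate for a nominal rate compounded several times a year."""
    return (1 + nominal_rate / periods_per_year) ** periods_per_year - 1

print(f"{effective_rate(0.20, 4):.2%}")  # 21.55%
```

With one compounding period per year the function returns the nominal rate itself, which is a quick sanity check.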
  100. 100. What is present value? When you are trying to find the present worth of some money which is due after some time, it is called present value. Due to factors like inflation, risk and uncertainty, the present value is less than the amount due (for a positive discount rate). Suppose you have to get 1100 after 1 year; at a discount rate of 10% its present value is 1000 (you can see here that there is a discount of 100). Money due – discount for time factor = present value
  101. 101. What is future value ? Future value takes up interest and therefore it is more than the sum invested. If I invest 1000 today, with an interest rate of 10%, it will become 1100 after 1 year.
  102. 102. Formula for present value? Amount / (1+rate) ^ number of years. Suppose 1221 is due after 3 years and the rate of interest is 10%; the present value is: 1221 / (1+10/100)^3 = 1221 / 1.331 = 917.36 answer (approx.)
  103. 103. What is the formula for future value ? Amount *(1+rate) ^ number of years suppose 1000 is invested for 3 years and rate of interest is 10% annually compounding, future value is : 1000 * (1+10/100)^3 =1331 answer
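
The present value and future value formulas are mirror images of each other, as this small sketch shows. Rates are decimals and compounding is assumed to be annual; the function names are illustrative only.

```python
def future_value(amount, rate, years):
    """Value of `amount` after `years` of annual compounding at `rate`."""
    return amount * (1 + rate) ** years

def present_value(amount, rate, years):
    """Today's worth of `amount` due after `years`, discounted at `rate`."""
    return amount / (1 + rate) ** years

print(round(future_value(1000, 0.10, 3), 2))   # 1331.0
print(round(present_value(1221, 0.10, 3), 2))  # 917.36
```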
  104. 104. How to calculate EMI? You may use the formula for present value of annuity. Here you need a factor: factor = ((1+rate)^n - 1) / (rate * (1+rate)^n), where n = total number of instalments and rate = rate per instalment period = annual rate % / (number of instalments in a year * 100). EMI = amount to pay / factor of annuity (calculated from the above formula)
  105. 105. What will be the EMI for Rs. 5 lakh, rate of interest = 10%, payable in 20 annual instalments? Factor = ((1+rate)^n - 1) / (rate * (1+rate)^n) = ((1+10/100)^20 - 1) / (10/100 * (1+10/100)^20) = 5.7275 / 0.67275 = 8.5136. EMI = 500000 / 8.5136 = 58730 ANSWER (approx.)
  106. 106. What will be the EMI for Rs. 5 lakh, rate of interest = 10%, payable in monthly instalments over 20 years? Factor = ((1+rate)^n - 1) / (rate * (1+rate)^n) = ((1+10/1200)^240 - 1) / (10/1200 * (1+10/1200)^240) = 6.328 / 0.06107 = 103.62. EMI = 500000 / 103.62 = 4825 ANSWER (approx.)
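
Because the annuity factor is sensitive to rounding, it helps to compute the two EMI examples at full precision. The sketch below follows the factor formula from the slides; `annuity_factor` and `emi` are hypothetical names, and the annual rate is passed as a decimal.

```python
def annuity_factor(rate_per_period, n_periods):
    """Present value of 1 paid at the end of each period for n periods."""
    growth = (1 + rate_per_period) ** n_periods
    return (growth - 1) / (rate_per_period * growth)

def emi(principal, annual_rate, years, payments_per_year):
    """Equal instalment that repays `principal` over the given term."""
    rate = annual_rate / payments_per_year
    n = years * payments_per_year
    return principal / annuity_factor(rate, n)

print(round(emi(500000, 0.10, 20, 1), 2))   # annual instalments, about 58730
print(round(emi(500000, 0.10, 20, 12), 2))  # monthly instalments, about 4825
```

Rounding the numerator and denominator of the factor to two decimals before dividing can shift the annual instalment by a few hundred rupees, so it is safer to round only the final result.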
  107. 107. What is sinking fund? If you deposit a sum of money every year (or month) so that you have a large amount after some time, this is a sinking fund. You create a sinking fund to purchase new machinery / a building etc. It is just the reverse of the EMI (where you were looking at the present value of annuities), because here you are taking the future value of annuities.
  108. 108. How to calculate sinking fund contribution? For calculation of the sinking fund contribution, we have to use the following factor: factor = ((1+rate)^n - 1) / rate, where n = total number of instalments and rate = annual rate % / (number of instalments in a year * 100). Contribution = amount required / factor.
  109. 109. Jigyasa has to collect 1 million after 5 years to start a new factory. How much should she save every month? Rate = 12%. Factor = ((1+rate)^n - 1) / rate = ((1+12/1200)^60 - 1) / (12/1200) = 0.816697 / 0.01, so dividing factor = 81.6697. Monthly savings = 1000000 / 81.6697 = 12244.45 per month answer (approx.)
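
The sinking fund contribution uses the future value of an annuity instead of the present value. A minimal sketch under the same assumptions as the slides (equal deposits at the end of each period, rate given per year as a decimal); the function name is made up for illustration.

```python
def sinking_fund_payment(target, annual_rate, years, payments_per_year):
    """Equal periodic deposit that grows to `target` by the end of the term."""
    rate = annual_rate / payments_per_year
    n = years * payments_per_year
    factor = ((1 + rate) ** n - 1) / rate  # future value of 1 per period
    return target / factor

print(round(sinking_fund_payment(1_000_000, 0.12, 5, 12), 2))  # about 12244.45
```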
  110. 110. What is a sample ? Instead of contacting every person, we may contact only a few persons, this is called sample. Suppose we go to check the quality of wheat to purchase. Instead of checking all the bags, we pick up one bag randomly and pick out a few grains, this is also a sample.
  111. 111. What are the methods of sampling? 1. random sampling = purely by chance – just like a lottery 2. judgement sampling – here we are using some basis for judgement – the basis of judgement is related to our purpose of research. 3. quota sampling – taking some number of persons from each group 4. cluster sampling – here we divide the population in clusters (based on their geography / demography / location / etc.) and then pick up a few clusters (groups) of people and study them all
  112. 112. contd... Stratified random sampling: here we divide the population into different strata (stratum = a part of the population divided on some logical criterion), then we randomly take a few % of persons from each stratum. Convenience sampling = taking the sample on the basis of your convenience
  113. 113. What is confidence level? It is the confidence associated with an interval estimate. If we are using a confidence level of 95%, it means that if we repeated the sampling many times, about 95% of the interval estimates so constructed would contain the population parameter (e.g. the mean).
  114. 114. What is the difference between population parameters and sample statistics? Population = actual population – but it is usually not possible to collect all the information about the population due to our own resource constraints: we don't have the time or resources to collect data about the whole population. Therefore we go for a sample. When we use a sample, we are using sample statistics. We try to estimate population parameters from sample statistics.
  115. 115. What is population parameter? If you go for census study (you contact each element in the population and take their data), you can calculate population parameter. There are different parameters which are of use like : mean, mode, median, standard deviation, etc. But we actually take sample so we estimate population parameters from sample statistics.
  116. 116. What is sample statistics? Sample characteristics like mean, mode, median, standard deviation etc. Which are used to estimate population parameter
  117. 117. What is sampling error? The difference in the value identified by sample and the population parameter is called sampling error. For example, population mean is 20 but sample mean is 18, so sampling error = 2
  118. 118. What is quantitative data and qualitative data? quantitative data = data which tell about how much and how many; qualitative data = data on a nominal scale – just names / labels etc.
  119. 119. What are the various types of scales of data? 1. nominal scale = only names are there – like ram, shyam 2. ordinal scale – they give order or ranks 3. interval scale – the gaps between values are equal and identifiable, but there is no true zero 4. ratio scale – they have a true zero, so ratios can also be calculated; they are the best for numerical analysis
  120. 120. What are the various methods to present data ? Scatter chart / diagrams bar chart Histogram Ogive Dot plot etc.
  121. 121. What is statistical inference ? When we try to estimate or test hypothesis using sample data, it is called statistical inference (here we use sample data, not the population parameters).
  122. 122. What is a variable? It is a characteristic of interest relating to some element. It can take different values. Variables are denoted by X, Y, Z etc. Examples of variables are: for people = their education; for cars = their fuel efficiency etc.
  123. 123. What is cross sectional data ? Data collected at the same point of time from different segments
  124. 124. What is cross tabulation? There are two variables, their data are presented in one table – one variable as X axis and other variable as Y axis for example : Age and Height or Marks and Attendance
  125. 125. Can we take up the same element again in sampling? Yes, it is possible (by chance). There are two types of sampling: 1. sampling with replacement 2. sampling without replacement. In sampling with replacement, it is possible that by chance we may pick up the same element again (we should avoid this).
  126. 126. What is normal distribution? There are many types of probability distributions; the normal distribution is used most widely. It is bell shaped and has mean = mode = median. In a normal distribution most of the data are near the mean and extreme data are very few.
  127. 127. How do you calculate mode? Mode is that element which has the highest frequency. If the data are continuous (grouped), you may use the following formula: Mode = L1 + (D1 / (D1+D2)) * class interval, where L1 = lower limit of the modal class, D1 = highest frequency – frequency in preceding class, D2 = highest frequency – frequency in succeeding class
  128. 128. Example of mode : 2,3,5,6,7,8,9,11,13,13,14,14,14,15,17,21,22,34,43 out of these mode is 14 (because its frequency is 3)
  129. 129. Example of mode?
       Class       Frequency
       10 to 20    4
       20 to 30    8
       30 to 40    12
       40 to 50    4
       Apply the formula: modal class = 30 to 40, so mode = 30 + ((12-8) / ((12-8)+(12-4))) * class interval = 30 + (4/12) * 10 = 30 + 3.33 = 33.33 answer
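
The grouped-data mode formula, Mode = L1 + (D1 / (D1+D2)) * class interval, can be sketched as below. It assumes equal class widths and a single highest frequency; when two classes tie, this sketch simply uses the first one, which may not be what you want. The name `grouped_mode` is hypothetical.

```python
def grouped_mode(lower_limits, frequencies, width):
    """Mode of grouped data with equal class widths and one modal class."""
    i = frequencies.index(max(frequencies))  # position of the modal class
    previous = frequencies[i - 1] if i > 0 else 0
    following = frequencies[i + 1] if i < len(frequencies) - 1 else 0
    d1 = frequencies[i] - previous
    d2 = frequencies[i] - following
    return lower_limits[i] + d1 / (d1 + d2) * width

mode = grouped_mode([10, 20, 30, 40], [4, 8, 12, 4], 10)
print(round(mode, 2))  # 33.33
```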
  130. 130. What is median? Median = exact mid point of the data when arranged in order. Position of the median = (n+1)/2 for individual values (n/2 is used for grouped data). Example: 1,3,5,7,9: there are 5 values, so n = 5 and (5+1)/2 = 3, so the 3rd value is the median. Median = 5 answer
  131. 131. Formula for median (grouped data)? Median = L1 + ((M-C) / F) * class interval, where L1 = lower limit of the median class, F = frequency of the median class, M = n/2 (position of the median), C = cumulative frequency of the previous class
  132. 132. Example of median?
       Class       Frequency   C.F
       10 to 20    4           4
       20 to 30    8           12
       30 to 40    12          24
       40 to 50    4           28
       Median = L1 + ((M-C) / F) * class interval. M = 28/2 = 14, so the median class is 30 to 40: 30 + ((14-12)/12) * 10 = 30 + 1.67 = 31.67 answer
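
The grouped median formula, L1 + ((M-C) / F) * class interval, can be sketched by walking through the cumulative frequencies until the class containing position n/2 is reached. Equal class widths are assumed and `grouped_median` is an illustrative name.

```python
def grouped_median(lower_limits, frequencies, width):
    """Median of grouped data using L1 + ((M - C) / F) * width."""
    half = sum(frequencies) / 2  # M = n/2
    cumulative = 0               # C = cumulative frequency before the class
    for lower, frequency in zip(lower_limits, frequencies):
        if cumulative + frequency >= half:
            return lower + (half - cumulative) / frequency * width
        cumulative += frequency
    raise ValueError("frequencies must not be empty or all zero")

median = grouped_median([10, 20, 30, 40], [4, 8, 12, 4], 10)
print(round(median, 2))  # 31.67
```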
  133. 133. What is cumulative frequency? When you add up frequencies class by class, the running totals are called cumulative frequencies. In the previous example, the frequency of 10 to 20 is 4 and of 20 to 30 is 8, but the cumulative frequency of 20 to 30 is shown as 12 (the 4 of 10 to 20 is added to it).
  134. 134. What is relative frequency ? Formula = frequency of a class / number of items
  135. 135. Find mean, mode and median on following data?
       Class       freq.   C.F   x*f
       10 to 20    5       5     75
       20 to 30    12      17    300
       30 to 40    12      29    420
       40 to 50    5       34    225
       total       34            1020
       (x = mid point of each class: 15, 25, 35, 45) mean = 1020/34 = 30
  136. 136. solution... n/2 = 34/2 = 17, so the median class is 20 to 30. Median = 20 + ((17-5)/12) * 10 = 30. The mode cannot be calculated from the modal class because there are two equal highest frequencies, so we use the following formula: Mode = 3 median – 2 mean = 3*30 – 2*30 = 30 answer
  137. 137. Calculate rank correlation using the following data?
       X    Y
       2    11
       4    8
       6    3
       8    1
  138. 138. Solution: calculate their ranks (rank 1 = largest value), d = Rx - Ry, so D^2 = (Rx - Ry)^2
       X    Y    Rx   Ry   D^2
       2    11   4    1    9
       4    8    3    2    1
       6    3    2    3    1
       8    1    1    4    9
       total of D^2 = 20
  139. 139. What is formula of quartile deviation ? (q3 – q1)/ 2
  140. 140. What is formula of coefficient of quartile deviation ? (q3-q1) / (q3+q1)
  141. 141. What is formula of coefficient of mean deviation ? Mean deviation / Median or mean deviation / mean
  142. 142. Calculate combined standard deviation. Means: A = 8, B = 3; std. deviations: A = 2, B = 1; n1 of A = 20, n2 of B = 30. Formula = sqrt ((n1*s1^2 + n2*s2^2 + n1*d1^2 + n2*d2^2) / (n1+n2)), where d1 = mean of A – combined mean and d2 = mean of B – combined mean. Combined mean = (20*8 + 30*3)/50 = (160+90)/50 = 5, so d1 = 3, d2 = -2. sqrt ((20*4 + 30*1 + 20*9 + 30*4) / (20+30)) = sqrt (410/50) = sqrt (8.2) = 2.86 answer
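
The combined standard deviation is easiest to get right when the squares are written out explicitly, since the standard combined-variance formula uses s1^2, s2^2, d1^2 and d2^2 rather than the unsquared values. A minimal sketch, with `combined_std` as a hypothetical name:

```python
from math import sqrt

def combined_std(n1, mean1, sd1, n2, mean2, sd2):
    """Return (combined mean, combined standard deviation) of two groups."""
    combined_mean = (n1 * mean1 + n2 * mean2) / (n1 + n2)
    d1 = mean1 - combined_mean
    d2 = mean2 - combined_mean
    variance = (n1 * (sd1**2 + d1**2) + n2 * (sd2**2 + d2**2)) / (n1 + n2)
    return combined_mean, sqrt(variance)

mean, sd = combined_std(20, 8, 2, 30, 3, 1)
print(mean)          # 5.0
print(round(sd, 2))  # 2.86
```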
  143. 143. FORMULA OF RANK CORRELATION = 1- (6 ∑ D^2) / (N^3 -N) = 1 – (6*20)/(64 -4) =1 - 120/60 =1-2 =-1 Thus two series have perfectly negative correlation
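
The rank correlation formula 1 - (6 ∑ D^2) / (N^3 - N) can be checked against the X and Y series given earlier (2,4,6,8 and 11,8,3,1). This sketch ranks the largest value as 1 and assumes there are no tied values; the function names are illustrative.

```python
def ranks_descending(values):
    """Rank values so that the largest gets rank 1 (no ties assumed)."""
    ordered = sorted(values, reverse=True)
    return [ordered.index(v) + 1 for v in values]

def rank_correlation(x, y):
    """Spearman rank correlation: 1 - 6 * sum(D^2) / (N^3 - N)."""
    n = len(x)
    rx, ry = ranks_descending(x), ranks_descending(y)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return 1 - 6 * sum_d2 / (n**3 - n)

print(rank_correlation([2, 4, 6, 8], [11, 8, 3, 1]))  # -1.0
```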
  144. 144. What is sample space? A set of all experimental outcomes is called sample space
  145. 145. What is experiment? In research, when we deliberately change (manipulate) some variables and observe the effect, that is called an experiment.
  146. 146. What is experimental group? There are generally two types of groups – one on which you undertake experiment (experimental group) and one on which you dont do any experiment, just do observation.(control group) Example – if you have two plants, on one plant you pour fertilisers and on the other you dont put any fertilizer, then the former is experimental group and 2 nd is control group.
  147. 147. What is standard deviation? Deviation = difference. Here we find the difference of each value from the mean, and these differences are used to build the standard deviation. Formula = square root of (sum of squares of difference of each element from mean / number of elements)
  148. 148. Example of standard deviation:
       X    dx^2
       2    4
       3    1
       5    1
       6    4
       average = 16/4 = 4, dx = x - average (for the first value, 2-4 = -2), average of dx^2 = variance = 10/4 = 2.5, standard deviation = square root of variance = sqrt(2.5) = 1.58 answer
  149. 149. Steps in calculation of standard deviation? 1. calculate the average. For this, total all the values of X and then divide by n (in our example, we have divided 16/4, where 16 is the total of all values and 4 is the number of elements). 2. find dx (difference of x from mean) 3. square the dx to get dx^2 4. find the average of dx^2; this is called variance. 5. find the square root of variance. This is called standard deviation.
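
The five steps above translate directly into code. This sketch divides by n, as the slides do (the population formula, not the n-1 sample formula); `population_std` is an illustrative name.

```python
from math import sqrt

def population_std(values):
    """Return (mean, variance, standard deviation), dividing by n."""
    n = len(values)
    mean = sum(values) / n                       # step 1: average
    deviations = [x - mean for x in values]      # step 2: dx
    squared = [d ** 2 for d in deviations]       # step 3: dx^2
    variance = sum(squared) / n                  # step 4: average of dx^2
    return mean, variance, sqrt(variance)        # step 5: square root

mean, variance, sd = population_std([2, 3, 5, 6])
print(mean, variance, round(sd, 2))  # 4.0 2.5 1.58
```

The standard library's `statistics.pstdev` gives the same population standard deviation and can be used as a cross-check.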
  150. 150. What is covariance? If there are two data series, let us say X and Y, and we want to find their relation, we need covariance. Co = together, variance = variation. Formula of covariance = total of dx*dy / number of elements
  151. 151. Example of covariance:
       X    Y    dx   dy   dx*dy
       2    6    -2   2    -4
       3    5    -1   1    -1
       5    3    1    -1   -1
       6    2    2    -2   -4
       average of X = 16/4 = 4, average of Y = 16/4 = 4, dx = difference of each element of X from the average of X, dy = difference of each element of Y from the average of Y, total of dx*dy = -10, covariance = -10/4 = -2.5 answer
  152. 152. What is correlation and regression Correlation just tells you that there is a relation between two variable. It doesnt tell you which is the dependent and which is independent variable. If you want to predict / forecast, you have to use regression. In regression, we have two variables – one dependent and one independent. Regression tells you about relation of these two variables. Based on regression, you can predict / forecast.
  153. 153. How to calculate correlation? There are many methods to calculate correlation, but Karl Pearson's method is the most popular. Formula of correlation = covariance / (standard deviation of X * standard deviation of Y). Suppose the covariance of X and Y is -4, the standard deviation of X is 2 and the standard deviation of Y is also 2; then correlation = -4 / (2*2) = -1
  154. 154. What is the maximum and minimum value in correlation? Maximum correlation = 1 (perfectly positive relation), minimum correlation = -1 (perfectly negative relation – when one rises, the other falls), no relation = 0
  155. 155. Example of correlation?
       X    Y    dx   dy   dx*dy
       2    6    -2   2    -4
       3    5    -1   1    -1
       5    3    1    -1   -1
       6    2    2    -2   -4
       average of X = 16/4 = 4, average of Y = 16/4 = 4, total of dx*dy = -10, total of dx^2 = 10, standard deviation of X = sqrt(2.5) and standard deviation of Y = sqrt(2.5), covariance = -10/4 = -2.5, correlation = -2.5 / (sqrt(2.5) * sqrt(2.5)) = -1 answer
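
The covariance and correlation example can be reproduced with two short functions. As in the slides, both divide by the number of elements; the function names are chosen here for illustration.

```python
from math import sqrt

def covariance(x, y):
    """Average of dx * dy, dividing by the number of elements."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    return sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n

def correlation(x, y):
    """Karl Pearson's correlation = covariance / (sd of X * sd of Y)."""
    sd_x = sqrt(covariance(x, x))  # the variance is the covariance of x with itself
    sd_y = sqrt(covariance(y, y))
    return covariance(x, y) / (sd_x * sd_y)

x, y = [2, 3, 5, 6], [6, 5, 3, 2]
print(covariance(x, y))             # -2.5
print(round(correlation(x, y), 6))  # -1.0
```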
  156. 156. What is regression ? The basic model of linear regression (one dependent and one independent variable) is as under : y = a+ bx+e a = intercept b=slope e=error since error is random and moves in either direction, so we generally write as y=a+bx
  157. 157. What is regression? It is a simple tool to predict data. Regression assumes that there are at least two data sets, one is dependent on another. Example : if you say that demand is based on price, then we can have regression between price and demand. Price will be independent variable (called X), and demand will be dependent variable (called Y)
  158. 158. What is slope and intercept ? Simplest form of regression is linear regression (a straight line between dependent and independent variable). Here we need two things : slope and intercept. Slope is denoted by B and intercept is denoted by A. Formula of regression is : Y = A +BX 1. A is the point (value) of Y when X = 0 2. B denotes the rate of change in Y in response to change in X.
  159. 159. How to calculate slope? In the formula y=a+bx, we use b to denote slope. It denotes the change in y with reference to a change in x. Slope can be calculated with the following formula : b = covariance / (variance of x). Once we calculate b, we can easily calculate a by putting the averages of x and y in the formula y=a+bx. Thus we get both a and b, and then we can calculate yhat (ŷ) = a+bx (because a and b are known, with the help of x we can predict y)
  160. 160. Example of regression : rows of (X, Y, dx, dy, dx*dy) : (2, 6, -2, 2, -4), (3, 5, -1, 1, -1), (5, 3, 1, -1, -1), (6, 2, 2, -2, -4). Average of X = 16/4 = 4, average of Y = 16/4 = 4, variance of x = 2.5, covariance = -10/4 = -2.5. b = covariance / variance of x = -2.5 / 2.5 = -1. Now put it in the formula y=a+bx to get a : take y=4 (average of Y), x=4 (average of X), b=-1, so 4 = a + (-1)(4), or a = 8. Thus a = 8, b = -1 and the line is y = 8 - x, so we can now predict y
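A minimal Python sketch of the same calculation, assuming the slope is covariance / variance of x and the intercept comes from putting the two averages into y = a + bx. The helper name fit_line is hypothetical and only for illustration.

```python
def fit_line(xs, ys):
    """Return (a, b) for the line y = a + b*x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n
    var_x = sum((x - mean_x) ** 2 for x in xs) / n
    b = cov / var_x            # slope
    a = mean_y - b * mean_x    # intercept: the line passes through the averages
    return a, b

a, b = fit_line([2, 3, 5, 6], [6, 5, 3, 2])
print(a, b)           # 8.0 -1.0
print(a + b * 7)      # predicted y when x = 7
```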
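  161. 161. What is coefficient of determination ? It is the percent of the variation in y that can be explained by the regression equation, that is, the explained variation divided by the total variation. It equals the square of r (r denotes correlation), so it is also called r squared. Explained variation is based on the difference between estimated y and average of y; total variation is based on the difference between actual y and average of y.
  162. 162. Example of coefficient of determination Suppose estimated Y = 4, actual Y = 3, average of Y = 7. Total deviation = 3-7 = -4, explained deviation = 4-7 = -3, unexplained deviation (error) = 3-4 = -1. For this single observation the ratio of explained to total deviation is -3/-4*100 = 75%. For a full data set the deviations are squared and summed over all observations before dividing : r squared = total of (estimated y - average y)^2 / total of (actual y - average y)^2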
  163. 163. What is coefficient of variation ? = Standard deviation / mean * 100 suppose standard deviation = 2 suppose mean = 4 =2/4 *100 = 50% coefficient of variation = 50%
  164. 164. What is skewness ? When the data are not symmetrically distributed, they are skewed : they lean either towards the left or towards the right side. If the data are not skewed, the curve is bell shaped. If it is skewed, it looks like a slope or a tilted see-saw. Formula (Karl Pearson's coefficient) = (mean - mode) / standard deviation
  165. 165. What is bar chart ? It is a chart which uses thick lines (bars) to denote frequencies of X variables (on X axis). The length of each bar should be equal to the frequency. It is similar to a histogram (but there we use connected rectangles)
  166. 166. What is ogive ? It is a chart. It indicates data on cumulative basis. Here you first calculate cumulative frequency and then find its %. Data may be expressed using a single line. You can display the total at any given time. The relative slopes from point to point will indicate greater or lesser increases. Ogive can be from left to right or from right to left
  167. 167. Example of ogive (here data are absolute in cumulative frequency – not in %)
  168. 168. What is class interval ? It denotes the width of class for example : 10 to 20 here class interval is 20-10 =10 class interval is calculated by following formula : (highest – least)/ number of classes desired
  169. 169. What is the difference between continuous and discrete data? Continuous data can take any value, like 10.00073. It is written like : 10 to 20 (so here any value between 10 and 20 can come). But discrete data can take only certain numerical values like 3,4,5,6 etc.
  170. 170. Which of the following are linear equations? a) y = 4x − 5 b) 2x − 3y + 8 = 0 c) y = x² − 2x + 1 d) 3x + 1 = 0 e) y = 6x + x^3 f) y = 2 answer : out of these all those equation which result in straight line make a linear equation. C and E dont make any straight line. Rest all are linear equations.
  171. 171. Which of these ordered pairs solves the equation y = 5x − 6 ? A (1, −2) b (1, −1) c (2, 3) d (2, 4) answer : b & d
  172. 172. There are two lines : 2x+3y+5=0 and 4x-5y+2 = 0, find the point of their intersection? Multiply the first equation by 2 to get 4x+6y+10=0 and then subtract the second equation; you will get 11y+8=0, or y = -8/11. Putting this value in the first equation gives 2x = -5 + 24/11 = -31/11, so x = -31/22. Point of intersection = (-31/22, -8/11) answer
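The intersection can be checked with exact fractions. The sketch below uses Cramer's rule for two lines written as a*x + b*y + c = 0; the function name intersect is our own, not from the slides.

```python
from fractions import Fraction

def intersect(a1, b1, c1, a2, b2, c2):
    """Intersection of a1*x + b1*y + c1 = 0 and a2*x + b2*y + c2 = 0."""
    det = a1 * b2 - a2 * b1
    if det == 0:
        raise ValueError("lines are parallel or coincident")
    x = Fraction(b1 * c2 - b2 * c1, det)
    y = Fraction(a2 * c1 - a1 * c2, det)
    return x, y

x, y = intersect(2, 3, 5, 4, -5, 2)
print(x, y)                    # -31/22 -8/11
# substituting back into both equations should give 0
print(2 * x + 3 * y + 5, 4 * x - 5 * y + 2)
```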
  173. 173. Are these points are collinear ? A = 2,3 B = 4,1 C= -2,7 the points are collinear, if they are on one line. They are on one line if they satisfy the following formula : Xa(Yb-Yc) +Xb(Yc-Ya)+Xc(Ya-Yb) =0 =2(1-7)+4(7-3)-2(3-1) = -12+16-4 =0 so these points are collinear
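The collinearity formula is easy to turn into a small check. A sketch in Python, with a function name (are_collinear) chosen only for illustration; it works exactly for integer coordinates.

```python
def are_collinear(a, b, c):
    """True when Xa(Yb-Yc) + Xb(Yc-Ya) + Xc(Ya-Yb) equals 0."""
    (xa, ya), (xb, yb), (xc, yc) = a, b, c
    return xa * (yb - yc) + xb * (yc - ya) + xc * (ya - yb) == 0

print(are_collinear((2, 3), (4, 1), (-2, 7)))   # True
print(are_collinear((0, 0), (1, 1), (2, 3)))    # False
```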
  174. 174. Find the equation of the line which is parallel to 4x+7y+5=0, and passes through 5, -4. In case of parallel lines, the slope remains same thus only constant changes. Here constant is 5. 4(5)+7(-4)+k=0 k=8 4x+7y+8=0 answer
  175. 175. Are these points colinear? Make an equation from them? (3,1), (5,-5),(-1,13) 3(-5-13) +5(13-1)+-1(1--5) =-54+60-6 =0 these points are colinear Y-Y1/Y2-Y1 =X-X1/X2-X1 Y-1/-6 = X-3/2 2Y-2=-6x+18 Y+3X=10 answer
  176. 176. Find the equation of the line parallel to the line joining (7,5) and (2,9) and passing through (3,4) ? Y-Y1/Y2-Y1 =X-X1/X2-X1 Y-5/9-5 = X-7/2-7 -5Y+25=4X-28 =4X+5Y -53 =0 for parallel, constant = k 4(3) +5(4)+k=0 k = -32 so equation = 4x+5y-32=0 answer
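The two-point form and the parallel-line step can be sketched as follows. The helpers line_through and parallel_through are hypothetical names; both return the coefficients (a, b, c) of a*x + b*y + c = 0.

```python
def line_through(p1, p2):
    """Coefficients (a, b, c) of the line a*x + b*y + c = 0 through p1 and p2."""
    (x1, y1), (x2, y2) = p1, p2
    a = y2 - y1
    b = x1 - x2
    c = -(a * x1 + b * y1)
    return a, b, c

def parallel_through(a, b, point):
    """Keep the slope (a, b) and choose the constant so the line passes through point."""
    x, y = point
    return a, b, -(a * x + b * y)

a, b, c = line_through((7, 5), (2, 9))
print(a, b, c)                            # 4 5 -53
print(parallel_through(a, b, (3, 4)))     # (4, 5, -32)
print(parallel_through(4, 7, (5, -4)))    # (4, 7, 8)
```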
  177. 177. What is a variable ? It can take different values. Generally variable is denoted by X,Y,Z, and constant is denoted by a,b,c variable can be of two types : 1. discrete – it takes only integer values example: number of houses 2.continuous – it can take any values example : height of a person
  178. 178. What is a function? It shows relation between two variable – one is dependent and one independent dependent variable is dependent on independent variable example : price = f(demand) here we want to show that price is dependent on demand, so price is a function of demand. Dependent variable = price, independent variable = demand
  179. 179. What are the various types of functions ? 1. linear function, example : Y = a+bx. Here there is a straight line on a graph paper, and there is a direct linear relation between the two variables 2. polynomial function : the independent variable appears with higher powers, example : Y = a+bx+cx^2 .... (an equation with several independent variables, Y = a+bx1+cx2, is a multiple linear function) 3. absolute value function : no impact of negative values
  180. 180. What are the measures of central tendency ? Mean = arithmetic average (sum / number) Mode = the value which has the highest frequency Median = the exact mid point of the data. For example : 2,3,8,11,11 here Median = 8, mean = 7, Mode = 11
  181. 181. Formula of mean ? Add all the values and divide by number in the previous example : add all the values of 2,3,8,11,11 = 35 there are 5 values so divide 35 by 5 = 7 mean is denoted by Xbar
  182. 182. What is relation between mean, mode and median? Mode = 3 median - 2 mean (an empirical relation that holds approximately for moderately skewed data). In our example it gives 3*8 - 2*7 = 10, but we found the mode to be 11. With only five values the relation is only a rough approximation, so the two results differ.
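A quick check with Python's standard statistics module, shown only as an illustration. The variable empirical_mode is our own name for the value given by the relation mode = 3 median - 2 mean.

```python
import statistics

data = [2, 3, 8, 11, 11]
mean = statistics.mean(data)
median = statistics.median(data)
mode = statistics.mode(data)

# empirical relation: mode is roughly 3*median - 2*mean
empirical_mode = 3 * median - 2 * mean

print(mean, median, mode)   # 7 8 11
print(empirical_mode)       # 10
```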
  183. 183. Calculate 1 st quartile from the following data ? X Freq. C.F 10 TO 20 4 4 20 TO 30 6 10 30 TO 40 8 18 40 TO 50 7 25 50 TO 60 5 30
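  184. 184. SOLUTION FORMULA = L1 + (q1 - c) / f * (class interval), where L1 = lower limit of the quartile class, c = cumulative frequency of the previous class, f = frequency of the quartile class. q1 = n/4 = 30/4 = 7.5. 7.5 falls in 20 to 30. Q1 = 20 + (7.5 - 4) / 6 * (10) = 20 + ((3.5/6) * 10) = 20 + 5.8 = 25.8 ANSWER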
  185. 185. Calculate 31 st percentile from the following data ? X Freq. C.F 10 TO 20 4 4 20 TO 30 6 10 30 TO 40 8 18 40 TO 50 7 25 50 TO 60 5 30
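  186. 186. Solution FORMULA = L1 + (31p - c) / f * (class interval), where L1 = lower limit of the percentile class, c = cumulative frequency of the previous class, f = frequency of the percentile class. 31p = n/100 * 31 = 30/100 * 31 = 9.3. 9.3 falls in 20 to 30. P31 = 20 + (9.3 - 4) / 6 * (10) = 20 + ((5.3/6) * 10) = 20 + 8.8 = 28.8 ANSWER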
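The quartile and percentile solutions follow the same interpolation rule, so one small function can cover both. This is a sketch under the assumption that classes are given as (lower limit, upper limit, frequency) in ascending order; the name grouped_quantile is hypothetical.

```python
def grouped_quantile(classes, fraction):
    """Interpolated quantile for grouped data.

    classes: list of (lower, upper, frequency) in ascending order.
    fraction: 0.25 for the first quartile, 0.31 for the 31st percentile, etc.
    """
    n = sum(freq for _, _, freq in classes)
    target = n * fraction
    cumulative = 0  # cumulative frequency before the current class
    for lower, upper, freq in classes:
        if cumulative + freq >= target:
            return lower + (target - cumulative) / freq * (upper - lower)
        cumulative += freq
    raise ValueError("fraction must be between 0 and 1")

classes = [(10, 20, 4), (20, 30, 6), (30, 40, 8), (40, 50, 7), (50, 60, 5)]
q1 = grouped_quantile(classes, 0.25)
p31 = grouped_quantile(classes, 0.31)
print(round(q1, 1), round(p31, 1))   # 25.8 28.8
```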
  187. 187. Can mean, mode and median be equal? Yes - in normal distribution, mean, mode and median are all equal. In normal distribution, we have 3 characteristics : 1. data are symmetrical 2. data are more in central values and less as we move apart 3. mean=mode=median most of statistical formula require normal distribution.
  188. 188. How to calculate median in bigger data : Formula : position of median = (N+1)/2, N = number of data. For example : 1,2,3,4,5,6,6,7. Here we have 8 values, so (8+1)/2 = 4.5, so we take the mid value between the 4th and 5th values (4 and 5), which is 4.5 answer
  189. 189. What types of data series are there ? There are many types of data series : individual data discrete series continuous series in continuous data series, there is no value which is not possible. (for example : 0 to 10, 10 to 20, 20 to 30) ....
  190. 190. What are the measures of dispersion? Dispersion = how the data is looking in comparison to mean. If data is wide apart from mean, there is high dispersion. If the data is just close to mean, there is very less dispersion. If data has more dispersion, there is less uniformity in the data. We have many tools to measure dispersion like range, variance etc.
  191. 191. Example of high and less dispersion of data : Low dispersion : 6,6,7,7,7,7,8,8,9, high dispersion : 1,4, 8, 19,20,50,60,80,100 you can see, the first data set has far more consistency and dispersion is less. Tools to measure dispersion are : range, standard deviation, variance, mean deviation etc. Range = highest – least value. In the first case range = 9-6 = 3, in 2 nd case range = 100-1 = 99
  192. 192. What is standard deviation? Here we find the difference between each value and mean. Then we square the difference and find the average. This is called variance. Square root of variance is called standard deviation. This gives us an estimate of dispersion of data.
  193. 193. Example of standard deviation? X has 5 values : 1,2,3,4,5. Its total is 15, so average = 15/5 = 3. Now we take the difference of each value from the average : (1-3) = -2, (2-3) = -1, (3-3) = 0 ... so we get : -2,-1,0,1,2. Now square them = 4,1,0,1,4, total = 10. Now find the mean = 10/5 = 2 (this is variance). Square root of 2 = 1.4 is the standard deviation.
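The same result can be obtained from Python's standard statistics module. Note that pvariance and pstdev divide by n (population formulas), which matches the method used in the slides; this snippet is only a cross-check.

```python
import statistics

values = [1, 2, 3, 4, 5]
variance = statistics.pvariance(values)   # average of squared differences from the mean
std_dev = statistics.pstdev(values)       # square root of the variance

print(variance)            # 2
print(round(std_dev, 2))   # 1.41
```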
  194. 194. Example of intercept and slope ? If X change by 10% but Y changes by 20%, so slope = 20/10 = 2 if it is written that X,Y points are : (0,2), (2,4),(4,6),(6,8) ... here you can see that there is a linear relation between X and Y. (first digit is X and second digit is Y). Intercept is 2, because when X is 0, Y is 2.
  195. 195. Find slope in the following example? X Y 2 11 4 8 6 3 8 1
  196. 196. Solution Slope (b ) = covariance / variance of X so first we shall calculate covariance
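  197. 197. Solution (average of X = 20/4 = 5, average of Y = 23/4 = 5.75; dx and dy are differences from these averages)
       X    Y    dx    dy       dxdy     dx^2
       2   11    -3     5.25   -15.75     9
       4    8    -1     2.25    -2.25     1
       6    3     1    -2.75    -2.75     1
       8    1     3    -4.75   -14.25     9
       total of dxdy = -35, total of dx^2 = 20, covariance = -35/4 = -8.75, variance of x = 20/4 = 5, slope (b) = -8.75/5 = -1.75 answer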
  198. 198. What are the types of data ? 1. primary (which you collect yourself) 2. secondary (which is already collected for some other purpose, but you can also use it).
  199. 199. What are the various types of statistical analysis? 1. descriptive statistics : here you collect data and present it (for example data on market share) 2. inductive statistics : here you undertake statistical inferences and estimate for future 3. statistical decision theory : here you have to take decision about a situation based on statistics
  200. 200. What are the basic tools for statistical analysis ? 1. be clear about problem 2. formulate hypothesis 3. set significance level (how much accuracy do you want) 4. set sampling frame, research design & collect data 5. analyse data and draw inferences
  201. 201. What is hypothesis ? It is what you want to test. We frame at least 2 hypotheses : one is the null hypothesis and one is the alternate hypothesis. Based on literature review & our own experiences, we form some understanding of the subject. We frame the null hypothesis as the opposite of this idea, and the alternate hypothesis as the idea itself. We test the null hypothesis.
  202. 202. What is type I and type II error ? If we reject a null hypothesis, which is actually true, we are having type I error if we accept a null hypothesis which is actually false – we are having type II error. We have to set standards for both these errors. If you become liberal for type I error, then type II error will increase and vice versa.
  203. 203. What is alpha (α) ? The probability of making a type I error is called alpha
  204. 204. How do we test alpha ? We calculate the P value. If the P value is less than alpha, we reject the null hypothesis. If the P value is more than alpha, then we cannot reject the null hypothesis
  205. 205. What is p value ? It is the probability, calculated from the actual data, of getting a result at least as extreme as the one observed if the null hypothesis is true. It is calculated to be compared with alpha. Alpha is determined in advance, but the P value comes from the actual observation.
  206. 206. How does statistics & econometrics help you in business decisions? You can test your decisions using data. You can also build models. There are various types of model : 1. physical, 2. geographic 3. schematic 4. analog 5. mathematical / statistical / econometrics based statistics and econometrics can help you in building the last types of models (5 th type)
  207. 207. What types of statistical analysis are possible ? 1. univariate (there is a single set of data) 2. bivariate (there are two sets of data) 3. multivariate (there are many sets of data)
  208. 208. What are univariate tools? Mean, mode, median, time series analysis, moving average analysis etc.
  209. 209. What are bivariate tools ? Correlation, regression, etc.
  210. 210. What are multivariate tools ? There are many like : conjoint analysis, multivariate regression etc. Here we have many variables : example : demand is dependent on
  211. 211. An unbiased die is tossed. Find the probability of getting a multiple of 3? The possible outcomes are : 1 to 6. There are only 2 multiples of 3 : 3,6, so probability is (number of favourable outcomes) / (total number of possibilities) = 2/6 = 1/3 answer
  212. 212. In a simultaneous throw of a pair of dice,find the probability of getting a total more than 7? We can have 36 possibilities (6*6) however, we need only those cases where the total is 8 or more. These are : (6,2),(6,3),(6,4),(6,5),(6,6),(5,3),(5,4),(5,5),(5,6),(4,4),(4,5),(4,6),(3,5),(3,6),(2,6) =15 answer = 15/36 = 5/12 answer
  213. 213. A bag contains 6 white and 4 black balls .Two balls are drawn at random .Find the probability that they are of the same colour? Both are white : 6/10*5/9 = 30/90, both are black : 4/10*3/9 = 12/90, add them : 42/90 or 7/15. Or, using combinations : 6c2/10c2 + 4c2/10c2 = 15/45 + 6/45 = 21/45 = 7/15 answer
  214. 214. Two dice are thrown together.What is the probability that the sum of the number on the two faces is divisible by 4 or 6? The possibilities are : (1,3)(1,5) (2,2) (2,4),(2,6),(3,1),(3,3),(3,5),(4,2),(4,4),(5,1),(5,3),(6,2),(6,6) thus we are able to get 14 out of 36. so answer = 7/18 answer
  215. 215. Two cards are drawn at random from a pack of 52 cards What is the probability that either both are black or both are queens? Both are black = 26/52 * 25/51=25/102 both are queens : 4/52 * 3/51=3/663 both are black queens : 2/52*1/51 = 1/1326 now add them : (25/102 + 3/663 – 1/1326) =(325+6-1)/1326 =330/1326 or .25 answer
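  216. 216. Two dice are tossed. What is the probability that the total score is a prime number? The possible totals run from 2 to 12, and the prime numbers among them are : 2,3,5,7,11 (1 is not a prime number). Favourable outcomes are : (1,1),(1,2),(1,4),(1,6),(2,1),(2,3),(2,5),(3,2),(3,4),(4,1),(4,3),(5,2),(5,6),(6,1),(6,5) = 15, so probability = 15/36 = 5/12 answer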
  217. 217. Two dice are thrown simultaneously .what is the probability of getting two numbers whose product is even? If any one of the two numbers is an even number, the product will be even number. Thus we should pick up all those cases when both the numbers are odd numbers : (1,1),(1,3),(1,5),(3,1),(3,3),(3,5),(5,1),(5,3) (5,5) thus there are only 9 such cases. Remove them from 36, we get : 27 cases answer : 27/36 answer
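The two-dice questions can all be checked by listing the 36 outcomes and counting the favourable ones. A minimal sketch; the helper name probability and the variable names are our own.

```python
from fractions import Fraction
from itertools import product

# all 36 ordered outcomes of two dice
OUTCOMES = list(product(range(1, 7), repeat=2))

def probability(event):
    """Favourable outcomes divided by total outcomes, as an exact fraction."""
    favourable = sum(1 for a, b in OUTCOMES if event(a, b))
    return Fraction(favourable, len(OUTCOMES))

p_more_than_7 = probability(lambda a, b: a + b > 7)
p_div_4_or_6 = probability(lambda a, b: (a + b) % 4 == 0 or (a + b) % 6 == 0)
p_prime = probability(lambda a, b: a + b in {2, 3, 5, 7, 11})
p_even_product = probability(lambda a, b: (a * b) % 2 == 0)

print(p_more_than_7, p_div_4_or_6, p_prime, p_even_product)   # 5/12 7/18 5/12 3/4
```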
  218. 218. In a lottery ,there are 10 prizes and 25 blanks.A lottery is drawn at random. what is the probability of getting a prize ? 10/(10+25) =10/35 or 2/7 answer
  219. 219. In a class ,30 % of the students offered English,20 % offered Hindi and 10 %offered Both.If a student is selected at random, what is the probability that he has offered English or Hindi? 30+20-10 = 40% or .4 answer
  220. 220. Two cards are drawn from a pack of 52 cards .What is the probability that either both are Red or both are Kings? Both are red : 26/52 * 25/51 = 25/102, both are kings : 4/52 * 3/51 = 1/221, both are red kings : 2/52 * 1/51 = 1/1326. Now add the first two and subtract the third : (325 + 6 - 1)/1326 = 330/1326 = 55/221 answer
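The two-card questions (both black or both queens; both red or both kings) can be verified by listing all 1326 possible pairs. This is an illustrative sketch; the deck representation and the helper pair_probability are assumptions of ours.

```python
from fractions import Fraction
from itertools import combinations

RANKS = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
SUITS = ["spades", "clubs", "hearts", "diamonds"]
DECK = [(rank, suit) for rank in RANKS for suit in SUITS]
BLACK = {"spades", "clubs"}
RED = {"hearts", "diamonds"}

def pair_probability(event):
    """Probability of an event over all unordered pairs of cards."""
    pairs = list(combinations(DECK, 2))
    hits = sum(1 for c1, c2 in pairs if event(c1, c2))
    return Fraction(hits, len(pairs))

black_or_queens = pair_probability(
    lambda c1, c2: (c1[1] in BLACK and c2[1] in BLACK) or (c1[0] == "Q" and c2[0] == "Q")
)
red_or_kings = pair_probability(
    lambda c1, c2: (c1[1] in RED and c2[1] in RED) or (c1[0] == "K" and c2[0] == "K")
)
print(black_or_queens, red_or_kings)   # 55/221 55/221
```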
  221. 221. one card is drawn at random from a pack of 52 cards.What is the probability that the card drawn is a face card? Face cards are : Jack, queen, king total = 12 12/52 answer
  222. 222. A man and his wife appear in an interview for two vacancies in the same post.The probability of husband's selection is 1/7 and the probabililty of wife's selection is 1/5.What is the probabililty that only one of them is selected? Husband + not wife =1/7 * 4/5 = 4/35 wife + not husband =1/5 * 6/7 = 6/35 add = 10/35 answer
  223. 223. From a pack of 52 cards,one card is drawn at random.What is the probability that the card is a 10 or a spade? 4/52 + 13/52 – 1/52 =16/52 answer
  224. 224. A bag contains 4 white balls ,5 red and 6 blue balls .Three balls are drawn at random from the bag.What is the probability that all of them are red ? 5/15*4/14*3/13 or 5c3/15c3 = 10/455 = 2/91 answer
  225. 225. A box contains 10 black and 10 white balls.What is the probability of drawing two balls of the same colour? Both are black : 10/20 * 9/19 = 9/38, both are white : 10/20 * 9/19 = 9/38, add them : 18/38 = 9/19. Or black : 10c2 / 20c2 + white : 10c2 / 20c2 = 45/190 + 45/190 = 90/190 = 9/19 answer
  226. 226. A box contains 20 electric bulbs, out of which 4 are defective. Two bulbs are chosen at random from this box. What is the probability that at least one of these is defective ? In such questions (at least one type), it is better to reverse the question, solve it and deduct the answer from 1. So here we shall first calculate the probability of getting no defective bulb : 16/20 * 15/19 = 12/19. At least one is defective = 1 - 12/19 = 7/19 answer
  227. 227. Two cards are drawn together from apack of 52 cards.What is the probability that one is a spade and one is a heart ? First is spade and 2 nd heart : 13/52 * 13/51 = 13/204 First is heart and 2 nd spade : 13/52 * 13/51 = 13/204 add them : 13/102 answer
  228. 228. The probability that a card drawn from a pack of 52 cards will be a diamond or a king? 13/52 + 4/52 – 1/52 =16/52
  229. 229. What is hypothesis ? What you think or what you want to check out or what you want to study is called hypothesis. We prepare two types of hypothesis : 1 null hypothesis (just opposite of what we think or what we are testing out) 2. alternate hypothesis (what we want to check out). We study and check null hypothesis only.
  230. 230. What is systematic sampling ? If you pick up the first unit by random sampling and thereafter pick up each unit at a fixed interval, it is called systematic sampling. Suppose you pick up the first unit randomly and it is 12; now you take up every 4th element, so you take 12, 16, 20, 24, 28 and so on. This type of sampling saves time and retains the virtues of random sampling also.
  231. 231. What are continuous and discrete distributions ? Continuous distributions are : 1. normal 2. exponential Discrete distributions are : 1. pascal 2.poisson 3. binomial
  232. 232. What is theoretical distribution ? If you pick up a sample it will not give you exactly the same value as population. If you pick up a large number of samples out of population, and plot the values of these samples, you will get what we call as theoretical distribution. If the sample size is large, the theoretical distribution will approximate the real population.
  233. 233. What is normal distribution ? It is also called the Gaussian distribution. It is defined by two parameters mean ("average" m) and standard deviation (σ). A theoretical frequency distribution for a set of variable data, usually represented by a bell-shaped curve symmetrical about the mean. MEAN=MEDIAN=MODE & data symmetrical bell shaped
  234. 234. What is Z, if mean=100 and standard deviation (σ) = 6 ? Find P(X<106). Step 1: For the given value X=106, formula of Z = (value - mean) / standard deviation, so Z = (106-100)/6 = 1. Step 2: Find the area for Z = 1 in the Z table : 0.3413. Step 3: Here the X value is greater than the mean. The bell shaped curve has half (.5) of its area on each side of the mean; .3413 lies on the right side between the mean and X, and the whole left side (.5) is also included, so P(X<106) = 0.5 + 0.3413 = 0.8413
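Instead of a Z table, the area under the normal curve can be computed with the error function from Python's math module. A small sketch; the function name normal_cdf is our own.

```python
import math

def normal_cdf(x, mean, sd):
    """P(X < x) for a normal distribution with the given mean and standard deviation."""
    z = (x - mean) / sd
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

p = normal_cdf(106, 100, 6)
print(round(p, 4))   # 0.8413
```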
  235. 235. What is probability density function? PDF of a continuous random variable is a function which can be integrated to obtain the probability that the random variable takes a value in a given interval. PDF is used to find the point of Normal Distribution curve. Continuous Probability Density Function of the Normal Distribution is called the Gaussian Function.
  236. 236. Formula of PDF ? f(x) = (1 / (σ * sqrt(2π))) * e^(-(x-m)^2 / (2σ^2)) where m = mean and σ = standard deviation
  237. 237. What is type I error ? When you reject the null hypothesis when it is actually correct, it is called type I error. The probability of making it is called alpha
  238. 238. What is type II error ? When you accept null hypothesis when it is actually incorrect, it is called type II error
  239. 239. What is type III error ? Rejecting a null hypothesis for wrong reason is called type III error it is rarely used.
  240. 240. What is binomial distribution ? The Binomial Distribution is one of the discrete probability distribution. It is used when there are exactly two mutually exclusive outcomes of a trial. These outcomes are appropriately labeled Success and Failure. The Binomial Distribution is used to obtain the probability of observing r successes in n trials, with the probability of success on a single trial denoted by p.
  241. 241. Example of binomial distribution Take up the case of coins. What is the probability of getting 3 heads on 4 trials (coin has only two outcomes = head, tail) (what is probability of observing 3 successes in 4 trials, with the probability of success on a single trial denoted by p = .5) formula : P(X = r) = nCr p^r (1-p)^(n-r). C = combination
  242. 242. Solution .... N = 4, r = 3 p = .5 formula of combination = N! / ((N-r)! * r!) C = 4! / ((4-3)! * 3!) =24 / (1*6) = 4 4! = 4*3*2*1 = 24 Ncr = 4, p^r = (.5)^3 = .125 (1-p)^(n-r). = (1-.5)^(4-3) = .5 solution = 4*.125*.5 =.25 answer
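The binomial formula nCr * p^r * (1-p)^(n-r) can be written directly in Python using math.comb for the combination. The function name binomial_pmf is our own choice for this sketch.

```python
import math

def binomial_pmf(n, r, p):
    """Probability of exactly r successes in n trials with success probability p."""
    return math.comb(n, r) * p ** r * (1 - p) ** (n - r)

print(math.comb(4, 3))            # 4
print(binomial_pmf(4, 3, 0.5))    # 0.25
```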
  243. 243. What is poisson distribution? It is one of the discrete probability distribution. This distribution is used for calculating the possibilities for an event with the given average rate of value(λ). A poisson random variable(x) refers to the number of success in a poisson experiment.
  244. 244. Formula of poisson distribution f(x) = ((e^-λ) * (λ^x)) / x! where λ is the average rate, x is the poisson random variable (the number of occurrences) and e is the base of natural logarithm (e = 2.718)
  245. 245. In an office 2 customers arrived today (take it as the average). Calculate the probability that exactly 3 customers arrive tomorrow. Here λ (lambda) (mean arrival) = 2, and x (value to calculate) = 3. Step 1: Find e^-λ, where λ=2 and e=2.71828 : e^-λ = (2.718)^-2 = 0.135. Step 2: Find λ^x, where λ=2 and x=3 : λ^x = 2^3 = 8. Step 3: Find f(x) : f(x) = (e^-λ * λ^x) / x!, so f(3) = (0.135)(8) / 3! = 0.18.
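The three steps of the poisson calculation fit in one short function. A sketch in Python; poisson_pmf is a hypothetical name used only here.

```python
import math

def poisson_pmf(x, lam):
    """Probability of exactly x occurrences when the average rate is lam."""
    return math.exp(-lam) * lam ** x / math.factorial(x)

p = poisson_pmf(3, 2)
print(round(p, 2))   # 0.18
```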
  246. 246. How to carry out hypothesis testing ? First of all fix the level of significance (alpha). If you keep alpha of 5% it means that you work at a confidence level of 95%. That means there would be a 5% chance of error (rejecting a true null hypothesis) which you are willing to tolerate. When we calculate Z (in normal distribution), we see whether it falls inside the acceptance region or not, and what the p value is. If the p value is less than our acceptable error (alpha), we reject the null hypothesis.
  247. 247. What is alpha in statistics ? It is the error that you are willing to tolerate. Alpha is the probability of type I error
  248. 248. What is the variance of binomial distribution ? N * p * q n = number of units p = probability q = (1-p)
  249. 249. Calculate coefficient of concurrent deviation ? (a type of correlation) X Y 4 8 5 4 6 2 8 1
  250. 250. Solution Formula = + / - sqrt ( + / - ((2c - m) / m) ) c = number of positive signs in dxdy (concurrent deviations) m = total number of pairs of deviations
  251. 251. Solution (dx and dy show the direction of change from the previous value)
       X   Y   dx   dy   dxdy
       4   8
       5   4   +    -    -
       6   2   +    -    -
       8   1   +    -    -
       here m = 3, c = 0 (c is the number of + signs in dxdy) = -sqrt(-((2*0 - 3) / 3)) = -sqrt(1) = -1 so there is correlation of -1. answer
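The sign-counting procedure can be sketched in Python. This is only an illustration under the assumption that a pair counts as concurrent when dx and dy have the same sign; the function name concurrent_deviation is ours.

```python
import math

def concurrent_deviation(xs, ys):
    """Coefficient of concurrent deviations: +/- sqrt(+/- (2c - m) / m)."""
    def sign(v):
        return (v > 0) - (v < 0)

    dx = [sign(b - a) for a, b in zip(xs, xs[1:])]
    dy = [sign(b - a) for a, b in zip(ys, ys[1:])]
    m = len(dx)                                          # number of pairs of deviations
    c = sum(1 for sx, sy in zip(dx, dy) if sx * sy > 0)  # same direction of change
    ratio = (2 * c - m) / m
    # the sign of the ratio is applied inside and outside the square root
    return math.copysign(math.sqrt(abs(ratio)), ratio)

r = concurrent_deviation([4, 5, 6, 8], [8, 4, 2, 1])
print(r)   # -1.0
```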
  252. 252. What is finite population multiplier ? When we are taking samples from a finite population without replacement, the properties of normal distribution get distorted, because the probability of the 2nd item depends on the 1st item and so on. Therefore we have to use the finite population multiplier with all our formulas : sqrt((N-n) / (N-1)) N = population size; n = size of sample
  253. 253. What is standard error of mean ? Error = fluctuations. The standard deviation of the sample means is also called standard error of mean. Its formula = standard deviation / sqrt(n), n = number of items in sample
  254. 254. Formula for standard error of mean ? Standard deviation / sqrt(n), n = size of sample. If the population is finite, multiply by the finite population multiplier
  255. 255. Why do we do sampling ? When we are collecting any data – there are two options – 1. contact each unit and collect data from this – called census 2. pick up only a few and on the basis of their response try to infer the response of the entire population – called sampling sampling saves time, resources, but there is little bit possibility of error – which can be minimised by systematic research process.
  256. 256. What are the main types of sampling ? Probability or non probability sampling. Probability sampling = where each element has an equal probability of selection. Non-probability sampling = where, due to some reason or other, there is unequal probability of selection of items. Example : in a fair contest or lottery every one has an equal chance; these are examples of probability sampling. Non-probability sampling : reservation / selection of your own friends / nepotism
  257. 257. What are probability sampling methods ? 1. simple random sampling (just like lottery) 2. systematic sampling (select first item randomly thereafter pick up each item on fixed gap) 3. stratified sampling (divide population in different strata / group / classification and then pick up randomly some persons from each strata) 4. cluster sampling (here pick up one or a few cluster out of a large number of clusters)
  258. 258. When should we use which types of sampling ? It depends on our purpose, population, type of universe and the situation. Suppose, you are able to get clusters, which have elements representing the entire population, you may go for cluster sampling. If you want to really use a good method, have random sampling in that method.
  259. 259. What are the methods of nonprobability sampling ? Here we undertake sampling on the basis of some criteria / convenience : 1. convenience sampling (example : you pick up people from your friends / relations ) 2. quota sampling (example our reservation system) 3. judgemental sampling (example : select sample on some criteria)
  260. 260. What is multi-stage sampling ? When we undertake sampling in different stages, it is called multistage sampling. Example : suppose you want to study rural development in the world - First you pick up nation to study, then you pick up state, then you pick up district, then you pick up village and finally the sample
  261. 261. What is cluster sampling ? Suppose that population is so divided that there are many clusters and each cluster is a mini representation of the entire population, then we can go for cluster sampling.
  262. 262. What is sampling distribution ? Sampling distribution is the distribution of all the means of the various samples that are possible from a population. Example : suppose our population is 1,2,3,4,5,6 and we are picking up samples of 3 units out of this. Population mean = 3.5, sample means could be : (1,2,3) = 2, (2,3,4) = 3, (3,4,5) = 4 and so on. So we can have sample means like 2,3,4,5, etc. If we plot these sample means, we get a distribution centred on the population mean (here 3.5). The larger the sample, the more accurate the estimation will be.
  263. 263. What is sampling distribution of mean? The distribution (on graph paper) of the sample means is called sampling distribution of mean
  264. 264. What is central limit theory and Z ? Normal distribution is when we have mean=median=mode and all these are in the centre of data (data is bell shaped). The central limit theorem says that the means of large samples are approximately normally distributed, whatever the shape of the population. Z = (sample mean - population mean) / (standard deviation / sqrt(n)). Based on Z we can calculate the probability of the sample mean taking some value on the graph.
  265. 265. Example of central limit theory Population mean = 100, variance = 36 (so standard deviation = 6), sample size = 25. What is the Z when the sample has a mean of 90 ? Z = (90 - 100) / (6 / 5) = -10 / 1.2 = -8.33. Thus Z = -8.33, and from this we can make inferences. Since the sample size is less than 30, we should use t distribution instead of z here
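A short sketch of the Z calculation for a sample mean, assuming the standard error is the standard deviation divided by sqrt(n). The function name z_for_sample_mean is hypothetical.

```python
import math

def z_for_sample_mean(sample_mean, population_mean, variance, n):
    """Z = (sample mean - population mean) / (standard deviation / sqrt(n))."""
    standard_error = math.sqrt(variance) / math.sqrt(n)
    return (sample_mean - population_mean) / standard_error

z = z_for_sample_mean(90, 100, 36, 25)
print(round(z, 2))   # -8.33
```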
  266. 266. What is the difference between Z and T ? Z denotes the limits as per the normal distribution. t serves the same purpose as Z, but when the sample size is less than 30 (and the population standard deviation is not known) we use t instead of z. As the sample size increases, t approaches z, so if the sample size is more than 30 we can use z instead of t
  267. 267. What is the z for proportion ? When we are taking population data in % or in proportion, we use the following formula : standard error = sqrt((p * q) / n) and z = (sample proportion - p) / standard error, where p = the proportion stated in the hypothesis, q = 1-p, n = sample size
  268. 268. What is one tailed or two tailed test ? When we are comparing both the sides (increase or decrease) it is two tailed test when we are comparing only one side, it is one tailed test.
  269. 269. What is the procedure in hypothesis testing ? 1. frame two hypothesis : null and alternate 2. fix level of significance 3. define critical region 4. compare actual values with desired values 5 conclude
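  270. 270. Govt data say that 65% of Indian students rent out their bikes. In a sample of 200 only 80 claimed to have rented out their bikes. Prepare null hypothesis and test it. Here we compare the data with 65%, so Ho (null hypothesis) : p = .65, H1 : p < .65. Sample proportion = 80/200 = .4. Testing : z = (.4 - .65) / sqrt((.65 * .35) / 200) = -.25 / .0337 = -7.41. This is far below the one tailed critical value of -1.645 at the 5% level, so we reject the null hypothesis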
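The proportion test can be sketched in Python, assuming the standard error is sqrt(p*q/n) with p taken from the null hypothesis. The function name z_for_proportion is our own.

```python
import math

def z_for_proportion(successes, n, p0):
    """Z statistic for a sample proportion against a hypothesised proportion p0."""
    sample_proportion = successes / n
    standard_error = math.sqrt(p0 * (1 - p0) / n)
    return (sample_proportion - p0) / standard_error

z = z_for_proportion(80, 200, 0.65)
print(round(z, 2))   # -7.41
```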
  271. 271. What is chi square ? It compares actual (observed) and expected values. Observed table (rows a, b; columns x, y) : a : 11, 12, total 23; b : 9, 8, total 17; total : 20, 20, 40. In order to calculate an expected value we use the following formula : (row total * column total) / grand total. For the first value of 11, we have : (23*20) / 40 = 11.5
  272. 272. Table of expected values (rows a, b; columns x, y) : a : 11.5, 11.5, total 23; b : 8.5, 8.5, total 17; total : 20, 20, 40
  273. 273. Calculate difference between observed and expected values 1st = (11 - 11.5) = -.5, 2nd = (12 - 11.5) = .5, 3rd = (9 - 8.5) = .5, 4th = (8 - 8.5) = -.5
  274. 274. Find square of the difference 1st = .25, 2nd = .25, 3rd = .25, 4th = .25
  275. 275. Divide this value by expected value 1st = .25 / 11.5 = .02, 2nd = .25 / 11.5 = .02, 3rd = .25 / 8.5 = .029, 4th = .25 / 8.5 = .029. Total of these values = .1 (approximately); this is the chi-square value. At 5% significance level (1 degree of freedom), the standard chi square value is 3.84, which is more than our value, so we cannot reject the null hypothesis and we conclude that both the groups are similar.
  276. 276. Question : 600 rich and 400 poor students take a test. Use chi square test to find whether their marks are significantly different or not ? H L R 460 140 600 P 240 160 400 TOT. 700 300 1000
  277. 277. Start = frame hypothesis and significance level Null hypothesis = both groups are similar alternate hypothesis = both groups are not similar significance level : α = 5% (there is a 5% chance that a correct null hypothesis is rejected)
  278. 278. CHI SQUARE - STEP 1: find the expected value for each cell. The formula is: (row total * column total) / grand total. Expected values are as under: row R: H = 420, L = 180, total 600; row P: H = 280, L = 120, total 400; column totals: H = 700, L = 300, grand total 1000
  279. 279. Step 2: find the difference between observed and expected values and square it. Formula = (o - e)^2. Row R: H = 1600, L = 1600; row P: H = 1600, L = 1600
  280. 280. Step 3: divide (o - e)^2 by the expected value. Formula: (o - e)^2 / expected value. Row R: H = 1600/420, L = 1600/180; row P: H = 1600/280, L = 1600/120
  281. 281. Step 4: add them all. This is the chi-square value: 3.81 + 8.9 + 5.71 + 13.33 = 31.75
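As a cross-check on this total, a 2x2 table also has a well-known shortcut: N * (ad - bc)^2 divided by the product of the two row totals and the two column totals. This is a different route from the cell-by-cell method used in the slides, shown here only as a sketch for verification; the function name `chi_square_2x2` is hypothetical.

```python
def chi_square_2x2(a, b, c, d):
    """Shortcut for a 2x2 table [[a, b], [c, d]] (no continuity correction)."""
    n = a + b + c + d
    numerator = n * (a * d - b * c) ** 2
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return numerator / denominator

value = chi_square_2x2(460, 140, 240, 160)
print(round(value, 2))  # 31.75
```

Both routes give the same figure, which suggests the hand calculation is free of arithmetic slips.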
  282. 282. Step 5: compare this value with the standard value. The standard value can be calculated by a formula, or you can look it up in a chi-square table. The table has two dimensions: one shows the degrees of freedom and the other the level of significance. Degrees of freedom = (rows - 1) * (columns - 1) = (2 - 1) * (2 - 1) = 1. At 1 degree of freedom and α = 5%, the chi-square table value is 3.84, so compare the calculated value with 3.84.
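For the special case of 1 degree of freedom, the standard value can be computed with Python's standard library, because a chi-square variable with 1 degree of freedom is the square of a standard normal variable. The sketch below relies on that fact and does not cover other degrees of freedom; `chi_square_critical_df1` is a hypothetical name.

```python
from statistics import NormalDist

def chi_square_critical_df1(alpha):
    """Critical chi-square value for exactly 1 degree of freedom."""
    z = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided standard normal cutoff
    return z ** 2

rows, columns = 2, 2
degrees_of_freedom = (rows - 1) * (columns - 1)
critical = chi_square_critical_df1(0.05)
print(degrees_of_freedom, round(critical, 2))  # 1 3.84
```

For larger tables a printed chi-square table or a statistics package would be needed instead.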
  283. 283. Step 6: derive the conclusion. If the calculated value of chi square is more than the table value, reject the null hypothesis. If it is less than the table value, do not reject (accept) the null hypothesis. In this case, our calculated value of chi square is 31.75, which is higher than the table value of 3.84, so we reject the null hypothesis.
  284. 284. Conclusion: the null hypothesis is rejected and there seems to be a significant difference between the two groups.
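  285. 285. Graphical presentation [Figure: chi-square axis starting at 0; the acceptance zone lies below the table value 3.84 and the rejection zone above it; the calculated value 31.75 falls in the rejection zone]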
  286. 286. 10% of the tools produced turn out to be defective. What is the probability that out of 10 tools chosen randomly, exactly 2 are defective? Here we can use the binomial distribution, or the Poisson distribution as an approximation, to solve this problem. Let us solve using the binomial distribution. Formula: nCr * p^r * q^(n-r), where n = 10, r = 2, p = 10% or .1, q = (1 - p) = (1 - .1) = .9, and C = combination formula
  287. 287. Solution beginning: 10C2 * (.1)^2 * (.9)^(10-2)
  288. 288. Step 1: solve the combination. Formula = n! / ((n - r)! * r!). 10C2 = 10! / ((10 - 2)! * 2!), where 10! = 10*9*8*7*6*5*4*3*2*1 and 2! = 2*1. The 8! cancels, leaving (10 * 9) / (2 * 1) = 45
  289. 289. Step 2: solve the remaining portion: (.1)^2 * (.9)^(10-2) = .01 * .43 = .0043
  290. 290. Step 3: multiply both: 45 * .0043 = .19. It means that there is a 19% chance that exactly 2 tools are defective.
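The three binomial steps can be reproduced in Python with the standard library. This is a small sketch; `binomial_probability` is a hypothetical helper name, and `math.comb` requires Python 3.8 or later.

```python
import math

def binomial_probability(n, r, p):
    """P(exactly r successes in n independent trials), success probability p."""
    return math.comb(n, r) * p ** r * (1 - p) ** (n - r)

prob = binomial_probability(10, 2, 0.1)
print(math.comb(10, 2))   # 45
print(round(prob, 4))     # 0.1937
```

Without the intermediate rounding the answer is about 19.4%, in line with the 19% found by hand.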
  291. 291. Solve this question using the Poisson distribution. Formula: (e^(-m) * m^x) / x!, where e = 2.71828; m (also written lambda) = mean number of defectives = n * p = 10 * .1 = 1; x = number of defectives asked for, here x = 2 (because we want the probability that exactly 2 are defective)
  292. 292. Step 1: apply the formula, first part: e^(-m) = 2.71828^(-1) = .37
  293. 293. Step 2: solve the second part of the formula: m^x = 1^2 = 1
  294. 294. Step 3: solve the third part of the formula: x! = 2! = 2
  295. 295. Step 4: combine all these calculations: (.37 * 1) / 2 = .18 (more precisely .184). So the Poisson distribution gives about an 18% probability that exactly 2 tools are defective, close to the 19% from the binomial distribution. Answer
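The Poisson steps can be checked the same way. This is a minimal sketch in which the mean is taken as n * p = 1; the function name `poisson_probability` is hypothetical.

```python
import math

def poisson_probability(x, mean):
    """P(exactly x events) when the average number of events is `mean`."""
    return math.exp(-mean) * mean ** x / math.factorial(x)

mean = 10 * 0.1                      # n * p = 1 defective tool expected
prob = poisson_probability(2, mean)
print(round(prob, 4))                # 0.1839
```

The unrounded Poisson figure of about 0.184 sits slightly below the exact binomial figure of about 0.194, as expected when the Poisson distribution stands in for a binomial with a small n.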
  296. 296. What is the probable error of the coefficient of correlation for r = .6 and N = 64? Also set the limits. PE = .6745 * (1 - r^2) / sqrt(N) = .6745 * (1 - .36) / 8 = .054. Limits: .6 + .054 = .654 and .6 - .054 = .546. Answer
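The probable error and its limits can be computed as in the following sketch, which uses the formula from the slide; `probable_error` is a hypothetical function name.

```python
import math

def probable_error(r, n):
    """Probable error of a correlation coefficient r based on n pairs."""
    return 0.6745 * (1 - r ** 2) / math.sqrt(n)

r, n = 0.6, 64
pe = probable_error(r, n)
print(round(pe, 3))                           # 0.054
print(round(r - pe, 3), round(r + pe, 3))     # 0.546 0.654
```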
  308. 308. Be quicker, faster, and more accurate
  313. 313. THANKS.... GIVE YOUR SUGGESTIONS AND JOIN AFTERSCHOOOL NETWORK / START AFTERSCHOOOL SOCIAL ENTREPRENEURSHIP NETWORK IN YOUR CITY / CONDUCT WORKSHOP ON SOCIAL ENTREPRENEURSHIP IN YOUR COLLEGE / SCHOOL / CITY [email_address] JOIN OUR NETWORK TO PROMOTE SOCIAL ENTREPRENEURSHIP
