- 1. CHAPTER 1 Descriptive Statistics Objectives: 1. To study the basic introductory concept of statistics, including the branches of statistics, the basic terms of statistics, and types of variables. 2. To be able to use graphical and numerical methods to describe a data set. 3. To be able to find mean, median, mode and standard deviation for grouped data and ungrouped data. www.itarosley@blogspot.com 1
- 2. Descriptive statistics covers two kinds of data: ungrouped data and grouped data.
- 3. Ungrouped data. Measures of central tendency: mean, median, mode. Measures of dispersion: variance, standard deviation.
- 4. Grouped data. Measures of central tendency: mean, median, mode. Measures of dispersion: variance, standard deviation.
- 5. Definition of basic terms
  - a) Population consists of all items or elements of interest for a particular decision or investigation. E.g.: all married staff over the age of 25 in UTHM.
  - b) Sample is a certain number of elements that have been chosen from a population. A sample is a subset of the population. E.g.: a list of married staff over the age of 25 in the Registrar's Office would be a sample from the population of all married staff over the age of 25 in UTHM.
  - c) Random sample is a sample drawn in such a way that each element of the population has a chance of being selected.
  - d) Simple random sample implies that any particular sample of a specified sample size has the same chance of being selected as any other sample.
- 6. Definition of basic terms (continued)
  - e) Element / member is a specific subject or individual about which the information is collected.
  - f) Variable is a characteristic of the individual within the sample or population.
  - g) Observation / measurement is the value of a variable for an element.
  - h) Data set is a collection of values of one or more variables.
  - i) Ungrouped data set contains information on each member of a sample or population.
  - j) Grouped data set is a collection of data which are grouped in classes.
  - k) Raw data is data recorded in the sequence in which they are collected and before they are processed or ranked.
- 7. Definition of basic terms (continued)
  - l) Population parameter is a descriptive measure computed from population data.
  - m) Sample statistic is a descriptive measure computed from sample data.
  - n) Outliers / extreme values are values that are very small or very large relative to the majority of the values in a data set.
- 8. Example 1. The following table gives the number of sales of A4 paper in 8 shops in Melaka. The shops are the elements (members), the number of A4 paper is the variable, and each value in the second column is an observation (measurement).

  | Shop | Number of A4 paper (in reams) |
  |---|---|
  | 1 | 2000 |
  | 2 | 2500 |
  | 3 | 3000 |
  | 4 | 5000 |
  | 5 | 7000 |
  | 6 | 5000 |
  | 7 | 4000 |
  | 8 | 5500 |
- 9. Measures of central tendency are statistical measures which describe the position of a distribution. They are also called statistics of location, and they complement statistics of dispersion, which provide information about the variability or spread of the observations. In the univariate context, the mean, median and mode are the most commonly used measures of central tendency.
- 10. Measures of central tendency
  - Mean: the average of the data values.
  - Median: the middle value in a ranked list. The data must be arranged in increasing or decreasing order. It is defined for both ungrouped data and grouped data.
  - Mode: the value that occurs most frequently.
- 11. [Figure: Sample vs. Population]
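- 13. Median for Ungrouped Data. With the data ranked and $x_k$ denoting the $k$-th ranked value, the median $M$ is

  $$M = \begin{cases} x_{(n+1)/2}, & \text{when } n \text{ is odd} \\[4pt] \dfrac{x_{n/2} + x_{(n/2)+1}}{2}, & \text{when } n \text{ is even} \end{cases}$$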
- 14. Mode for Ungrouped Data. Obtain the frequency of each value in the data set.
  - If no value occurs more than once, then the data set has no mode.
  - Otherwise, any value that occurs with the greatest frequency is a mode of the data set.
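The two-case rule for the mode can be sketched in a few lines of Python. This is a minimal illustration, not part of the original slides: the function name `modes` and the small data lists are hypothetical, and returning an empty list is just one way to represent "no mode".

```python
from collections import Counter


def modes(values):
    """Return every mode of `values`, or an empty list if no value repeats."""
    counts = Counter(values)          # frequency of each distinct value
    if not counts:
        return []                     # empty data set: nothing to report
    highest = max(counts.values())
    if highest == 1:
        return []                     # no value occurs more than once: no mode
    # any value with the greatest frequency is a mode
    return sorted(v for v, c in counts.items() if c == highest)


# Hypothetical data sets for illustration
print(modes([3, 5, 5, 7, 9, 9]))   # two values share the greatest frequency
print(modes([1, 2, 3]))            # no repeats
```

A data set can therefore have zero, one, or several modes, which is why the function returns a list rather than a single number.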
- 15. Exercise
  1. Find the mean for the price of pens (in RM) below: 2.00, 2.50, 3.00, 3.50, 2.50
  2. A sample of six students in UTHM is selected and their heights are measured, resulting in the following data: 150.2 cm, 1.592 m, 149.4 cm, 152.7 cm, 1.533 m, 1.510 m. Find the sample mean.
  3. Calculate the mean for the following data:
     a) 14, 11, -10, 8, 8, -16
     b) 23, 14, 6, -7, -2, 9, 16
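Exercise 2 mixes centimetres and metres, so the values need a common unit before they are averaged. The sketch below shows one way to do that in Python; the helper name `sample_mean` and the choice of centimetres as the common unit are assumptions made for illustration.

```python
def sample_mean(values):
    """Sample mean: the sum of the values divided by how many there are."""
    return sum(values) / len(values)


# Heights from Exercise 2, split by the unit in which they were recorded
heights_cm = [150.2, 149.4, 152.7]
heights_m = [1.592, 1.533, 1.510]

# Convert metres to centimetres so that all six values share one unit
all_heights_cm = heights_cm + [h * 100 for h in heights_m]

mean_height_cm = sample_mean(all_heights_cm)
print(f"Sample mean height: {mean_height_cm:.2f} cm")
```

Averaging the raw numbers without converting would mix values near 150 with values near 1.5 and give a meaningless result.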
- 16. Example
  1. Find the median of the following examination scores: 80, 56, 34, 67, 55, 91, 82, 47, 75, 31, 90
  2. The following data represent the number of home runs hit by all teams in the Indian League in 2004: 157, 133, 189, 215, 208, 139, 152, 167, 202, 197, 124, 239, 191, 169. Find the median of this data set.
  3. The data below represent the length (in seconds) of a random sample of songs released in the 90's: 198, 255, 287, 207, 176, 224, 215, 208, 241. Find the median of the data given.
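The median rule for ungrouped data (rank the data, then take the middle term when n is odd or the average of the two middle terms when n is even) can be sketched as follows. The function name `median_ungrouped` is hypothetical; the data are the examination scores and home-run counts from the example.

```python
def median_ungrouped(values):
    """Median of raw data: rank the values, then pick the middle position(s)."""
    ranked = sorted(values)
    n = len(ranked)
    if n % 2 == 1:
        # odd n: the ((n + 1) / 2)-th ranked value (1-based), index shifted by 1
        return ranked[(n + 1) // 2 - 1]
    # even n: average of the (n / 2)-th and (n / 2 + 1)-th ranked values
    return (ranked[n // 2 - 1] + ranked[n // 2]) / 2


scores = [80, 56, 34, 67, 55, 91, 82, 47, 75, 31, 90]            # n = 11 (odd)
home_runs = [157, 133, 189, 215, 208, 139, 152,
             167, 202, 197, 124, 239, 191, 169]                  # n = 14 (even)

print(median_ungrouped(scores))
print(median_ungrouped(home_runs))
```

Python lists are 0-based, so the 1-based positions in the formula are each reduced by one when indexing.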
- 19. Sample variance, $s^2$, for a sample of $n$ data values:
- 20. The variance of the $n$ observations is $s^2 = \dfrac{\sum (y_i - \bar{y})^2}{n - 1} = \dfrac{(y_1 - \bar{y})^2 + \dots + (y_n - \bar{y})^2}{n - 1}$. The standard deviation $s$ is the square root of the variance, $s = \sqrt{s^2}$.
- 21. Computing the Variance. Formula for a population: $\sigma^2 = \dfrac{\sum (x - \mu)^2}{N}$. Formula for a sample: $s^2 = \dfrac{\sum (x - \bar{x})^2}{n - 1}$.
- 22. Example
  1. Find the sample variance for the given data: 6.1, 5.7, 5.8, 6.0, 5.8, 6.3
  2. Find the variance and standard deviation of the following data: 5, 2, 1, 7, 6, 9
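The sample variance and standard deviation formulas can be applied directly to the second data set in the example. The following is a minimal sketch; the function name `sample_variance` is hypothetical, and the data are treated as a sample (divisor n − 1), which is an assumption because the slide does not say whether the second data set is a sample or a population.

```python
import math


def sample_variance(values):
    """Sum of squared deviations from the mean, divided by n - 1."""
    n = len(values)
    mean = sum(values) / n
    return sum((y - mean) ** 2 for y in values) / (n - 1)


data = [5, 2, 1, 7, 6, 9]       # second data set from the example
s2 = sample_variance(data)      # sample variance
s = math.sqrt(s2)               # standard deviation is the square root

print(f"variance = {s2:.4f}, standard deviation = {s:.4f}")
```

For a population, the same sum of squared deviations would be divided by N instead of n − 1.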
- 23. Compute the sample variance and standard deviation of the heights of the starting players on Team I.
- 24. Organizing Data
  - Variable: a characteristic that varies from one person or thing to another.
  - Quantitative variable: a numerically valued variable.
  - Qualitative variable: a non-numerically valued variable.
  - Discrete variable: a quantitative variable whose possible values can be listed.
  - Continuous variable: a quantitative variable whose possible values form some interval of numbers.
- 25. Organizing Data
  - Grouped frequency distribution: obtained by giving classes or intervals together with the number of data values in each class.
  - Cumulative frequency: the frequency of a class that includes all values in a data set that fall below the upper boundary of that class.
  - Class midpoint or mark: the number halfway between the lower and upper class limits of a class, $\dfrac{\text{lower limit} + \text{upper limit}}{2}$.
  - Class width: upper boundary − lower boundary.
- 26. Organizing Data. Example: Given the data below, construct the frequency distribution table with class limits 42–45, 46–49, 50–53 and so on.
- 27. Organizing Data. Construct a frequency distribution table and find the class midpoint and class width for the ages of the employees in a company.

  | Age | No. of employees |
  |---|---|
  | 20–29 | 30 |
  | 30–39 | 35 |
  | 40–49 | 20 |
  | 50–59 | 10 |
  | 60–69 | 5 |
- 28. The Ministry of Health Malaysia publishes data on weights and heights by age and sex in Vital and Health Statistics. The weights shown in Table 6a, given to the nearest tenth of a pound, were obtained from a sample of 18–24-year-old males. Construct a grouped data table for these weights. Use a class width of 20 and a first cutpoint of 120.

  Table 6a: Weights of 37 males, aged 18–24 years

  129.2, 155.2, 167.3, 191.1, 161.7, 278.8, 146.4, 149.9, 185.3, 170.0, 161.0, 150.7, 170.1, 175.6, 209.1, 158.6, 218.1, 151.3, 178.7, 187.0, 165.8, 188.7, 175.4, 182.5, 187.5, 165.0, 173.7, 214.6, 132.1, 182.0, 142.8, 145.6, 172.5, 178.2, 136.7, 158.5, 173.6
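Grouping by cutpoints can be sketched in Python as below. This is an illustration under stated assumptions, not the slide author's method: each class is taken to include its lower cutpoint and exclude its upper cutpoint (120 to under 140, 140 to under 160, and so on), and the function name `group_by_cutpoints` is hypothetical.

```python
weights = [
    129.2, 155.2, 167.3, 191.1, 161.7, 278.8, 146.4, 149.9, 185.3, 170.0,
    161.0, 150.7, 170.1, 175.6, 209.1, 158.6, 218.1, 151.3, 178.7, 187.0,
    165.8, 188.7, 175.4, 182.5, 187.5, 165.0, 173.7, 214.6, 132.1, 182.0,
    142.8, 145.6, 172.5, 178.2, 136.7, 158.5, 173.6,
]


def group_by_cutpoints(values, first_cutpoint, width):
    """Count values per class [lower, lower + width), keeping empty classes.

    Assumes every value is at least `first_cutpoint`.
    """
    counts = {}
    for v in values:
        index = int((v - first_cutpoint) // width)   # which class v falls in
        lower = first_cutpoint + index * width
        counts[lower] = counts.get(lower, 0) + 1
    table = []
    lower = first_cutpoint
    while lower <= max(counts):                      # include empty classes too
        table.append((lower, lower + width, counts.get(lower, 0)))
        lower += width
    return table


table = group_by_cutpoints(weights, first_cutpoint=120, width=20)
for lower, upper, frequency in table:
    print(f"{lower} to under {upper}: {frequency}")
```

Keeping the empty classes in the table makes the gap between the bulk of the data and the single very large weight visible, which is useful when looking for outliers.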
- 29. Grouped Data. Sample Mean. The sample mean of grouped data is: $\bar{x} = \dfrac{\sum_{i=1}^{n} f_i x_i}{\sum_{i=1}^{n} f_i}$
- 30. Grouped Data. The following data show the number of mistakes that Redza made when he typed 100 pages. Find the mean.

  | No. of mistakes | No. of pages |
  |---|---|
  | 0 | 60 |
  | 1 | 21 |
  | 2 | 10 |
  | 3 | 5 |
  | 4 | 3 |
  | 5 | 1 |
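The grouped mean weights each value by its frequency. A minimal Python sketch using the typing-mistakes table follows; the function name `grouped_mean` is hypothetical.

```python
def grouped_mean(x, f):
    """Grouped mean: sum of f_i * x_i divided by the sum of f_i."""
    return sum(fi * xi for fi, xi in zip(f, x)) / sum(f)


mistakes = [0, 1, 2, 3, 4, 5]       # x_i: number of mistakes on a page
pages = [60, 21, 10, 5, 3, 1]       # f_i: number of pages with that many mistakes

mean_mistakes = grouped_mean(mistakes, pages)
print(f"Mean mistakes per page: {mean_mistakes:.2f}")
```

When the classes are intervals rather than single values, the same function applies with the class midpoints passed as `x`.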
- 31. Grouped Data. Find the mean for the data below, which refer to the number of bicycles owned by 27 families at Taman Permata.

  | No. of bicycles | No. of families |
  |---|---|
  | 0 | 2 |
  | 1 | 6 |
  | 2 | 13 |
  | 3 | 4 |
  | 4 | 2 |
- 32. Mean. The mean is the average of the data values. Ungrouped: the sample mean for raw data. Let $x_1, x_2, \dots, x_n$ be a sample of size $n$. Grouped: the sample mean for grouped data. Suppose we have a sample of size $n$ grouped into $m$ groups or cells.
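- 33. Mean
  - Mean of sample data: a) ungrouped data, $\bar{x} = \dfrac{\sum x_i}{n}$; b) grouped data, $\bar{x} = \dfrac{\sum f_i x_i}{\sum f_i}$.
  - Mean of population data: a) ungrouped data, $\mu = \dfrac{\sum x_i}{N}$; b) grouped data, $\mu = \dfrac{\sum f_i x_i}{\sum f_i}$.
  - For grouped data, $x_i$ = class midpoint (mark) = (lower limit + upper limit) / 2, and $f_i$ = frequency of $x_i$.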
- 34. Median, M. The median is the middle value in a ranked list, so the data must first be arranged in increasing or decreasing order. There are two types of median: the median for ungrouped data and the median for grouped data.
  - a) Ungrouped data: when $n$ is odd, the median is the value of the $\left(\frac{n+1}{2}\right)$th term in the ranked list; when $n$ is even, the median is the average of the values of the two middle terms.
  - b) Grouped data: $M = L_M + \left(\dfrac{\frac{n}{2} - F}{f_m}\right) C$, where $L_M$ = lower boundary of the median class, $C$ = class size (width), $F$ = cumulative frequency of the classes below the median class, $f_m$ = frequency of the median class, and $n$ = number of data values.
- 35. Median. The median for grouped data is: $M = L_M + \left(\dfrac{\frac{n}{2} - F}{f_m}\right) C$
- 36. A study of sulphur oxide production over 80 days produced the distribution in the following table. Find the median.

  | Sulphur oxide (tonne) | Frequency |
  |---|---|
  | 5.0–8.9 | 3 |
  | 9.0–12.9 | 10 |
  | 13.0–16.9 | 14 |
  | 17.0–20.9 | 25 |
  | 21.0–24.9 | 17 |
  | 25.0–28.9 | 9 |
  | 29.0–32.9 | 2 |
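The grouped median formula can be traced step by step on the sulphur oxide table. The sketch below makes two assumptions that the slides do not state explicitly: class boundaries lie halfway between adjacent class limits (so 17.0–20.9 has boundaries 16.95 and 20.95), and the median class is the first class whose cumulative frequency reaches n/2. The function name `grouped_median` is hypothetical.

```python
def grouped_median(limits, freqs):
    """Median for grouped data: L_M + ((n/2 - F) / f_m) * C.

    `limits` is a list of (lower limit, upper limit) pairs for the classes.
    """
    gap = limits[1][0] - limits[0][1]      # distance between adjacent class limits
    n = sum(freqs)
    cumulative = 0                         # F: cumulative frequency below the class
    for (lower, upper), f_m in zip(limits, freqs):
        if cumulative + f_m >= n / 2:      # first class reaching n/2 is the median class
            lower_boundary = lower - gap / 2
            width = (upper + gap / 2) - lower_boundary
            return lower_boundary + (n / 2 - cumulative) / f_m * width
        cumulative += f_m


sulphur_limits = [(5.0, 8.9), (9.0, 12.9), (13.0, 16.9), (17.0, 20.9),
                  (21.0, 24.9), (25.0, 28.9), (29.0, 32.9)]
sulphur_freqs = [3, 10, 14, 25, 17, 9, 2]

median_sulphur = grouped_median(sulphur_limits, sulphur_freqs)
print(f"Median: {median_sulphur:.2f} tonnes")
```

Here n = 80, so the median class is 17.0–20.9 with F = 27, f_m = 25 and C = 4, and the formula interpolates 13/25 of the way through that class.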
- 37. Find the median for the data below, which show the number of visits to the library made by all 100 international students in one year.

  | Number of visits | No. of students |
  |---|---|
  | 0–4 | 17 |
  | 5–9 | 41 |
  | 10–14 | 22 |
  | 15–19 | 11 |
  | 20–24 | 8 |
  | 25–29 | 1 |
- 38. Mode is the value that occurs most frequently (the highest frequency in a data set). For grouped data, the mode is $M_o = L_M + \left(\dfrac{d_b}{d_b + d_a}\right) C$. Note: data with two modes is called bimodal, and data with more than two modes is called multimodal.
- 39. Find the mode of the following data.

  | Class | Frequency |
  |---|---|
  | 11–15 | 6 |
  | 16–20 | 10 |
  | 21–25 | 18 |
  | 26–30 | 24 |
  | 31–35 | 16 |
  | 36–40 | 12 |
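The grouped mode formula can be applied to this table as sketched below. The slides do not define the symbols, so the sketch assumes the usual reading: L_M is the lower boundary of the modal class, d_b is the modal frequency minus the frequency of the class before it, d_a is the modal frequency minus the frequency of the class after it, and C is the class width. Boundaries are again assumed to lie halfway between adjacent class limits, and the function name `grouped_mode` is hypothetical.

```python
def grouped_mode(limits, freqs):
    """Mode for grouped data: L_M + (d_b / (d_b + d_a)) * C.

    If several classes tie for the highest frequency, only the first is used.
    """
    gap = limits[1][0] - limits[0][1]          # distance between adjacent limits
    k = freqs.index(max(freqs))                # position of the modal class
    before = freqs[k - 1] if k > 0 else 0      # treat a missing neighbour as 0
    after = freqs[k + 1] if k < len(freqs) - 1 else 0
    d_b = freqs[k] - before
    d_a = freqs[k] - after
    lower, upper = limits[k]
    lower_boundary = lower - gap / 2
    width = (upper + gap / 2) - lower_boundary
    return lower_boundary + d_b / (d_b + d_a) * width


class_limits = [(11, 15), (16, 20), (21, 25), (26, 30), (31, 35), (36, 40)]
frequencies = [6, 10, 18, 24, 16, 12]

mode_value = grouped_mode(class_limits, frequencies)
print(f"Mode: {mode_value:.2f}")
```

For this table the modal class is 26–30, with d_b = 24 − 18 = 6, d_a = 24 − 16 = 8, lower boundary 25.5 and width 5.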
- 40. A Global Warming Awareness Exhibition was held by a state government. The following table records the number of visitors who visited the exhibition and the number of days having those numbers of visitors. Find the mode of the number of visitors.

  | Number of visitors | Number of days |
  |---|---|
  | 0–99 | 10 |
  | 100–199 | 23 |
  | 200–299 | 167 |
  | 300–399 | 224 |
  | 400–499 | 211 |
  | 500–599 | 107 |
- 41. Find the mean, median and mode for the following data:

  | Age | Number of people |
  |---|---|
  | 17–21 | 2 |
  | 22–26 | 3 |
  | 27–31 | 5 |
  | 32–36 | 6 |
  | 37–41 | 8 |
  | 42–46 | 7 |
  | 47–51 | 2 |
  | 52–56 | 3 |
- 42. Sample Variance for Grouped Data. The formula for the sample variance for grouped data is: $s^2 = \dfrac{1}{\sum f - 1}\left[\sum f_i x_i^2 - \dfrac{\left(\sum f_i x_i\right)^2}{\sum f}\right]$, where $f$ is the class frequency and $x$ is the class midpoint.
- 43. Find the variance and standard deviation.

  | Class | Frequency |
  |---|---|
  | 2 | 6 |
  | 3 | 10 |
  | 4 | 15 |
  | 5 | 8 |
  | 6 | 3 |
  | 7 | 10 |
- 44. Find the variance and standard deviation.

  | Class | $f_i$ |
  |---|---|
  | 3.0–3.4 | 4 |
  | 3.4–3.8 | 8 |
  | 3.8–4.2 | 11 |
  | 4.2–4.6 | 9 |
  | 4.6–5.0 | 6 |
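The grouped sample variance formula can be sketched on the table with classes 3.0–3.4 to 4.6–5.0. This is an illustration, not the slide author's worked solution: the function name `grouped_sample_variance` is hypothetical, and the data are treated as a sample (divisor Σf − 1).

```python
import math


def grouped_sample_variance(midpoints, freqs):
    """(sum f x^2 - (sum f x)^2 / sum f) / (sum f - 1), with x the class midpoints."""
    n = sum(freqs)
    sum_fx = sum(f * x for f, x in zip(freqs, midpoints))
    sum_fx2 = sum(f * x * x for f, x in zip(freqs, midpoints))
    return (sum_fx2 - sum_fx ** 2 / n) / (n - 1)


classes = [(3.0, 3.4), (3.4, 3.8), (3.8, 4.2), (4.2, 4.6), (4.6, 5.0)]
freqs = [4, 8, 11, 9, 6]

# Class midpoint = (lower limit + upper limit) / 2
midpoints = [(low + high) / 2 for low, high in classes]

variance = grouped_sample_variance(midpoints, freqs)
std_dev = math.sqrt(variance)
print(f"variance = {variance:.4f}, standard deviation = {std_dev:.4f}")
```

The shortcut form avoids computing each deviation from the mean, but it gives the same value as summing f(x − x̄)² and dividing by Σf − 1.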
- 45. Population variance, $\sigma^2$. The formula for the population variance is: $\sigma^2 = \dfrac{\sum_{i=1}^{N} (x_i - \mu)^2}{N} = \dfrac{N \sum_{i=1}^{N} x_i^2 - \left(\sum_{i=1}^{N} x_i\right)^2}{N^2}$
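The definitional form and the computational form of the population variance are equal. A short derivation, using only $\mu = \frac{1}{N}\sum_{i=1}^{N} x_i$, shows why:

```latex
\begin{aligned}
\sum_{i=1}^{N} (x_i - \mu)^2
  &= \sum_{i=1}^{N} x_i^2 \;-\; 2\mu \sum_{i=1}^{N} x_i \;+\; N\mu^2 \\
  &= \sum_{i=1}^{N} x_i^2 \;-\; \frac{2}{N}\Bigl(\sum_{i=1}^{N} x_i\Bigr)^2
     \;+\; \frac{1}{N}\Bigl(\sum_{i=1}^{N} x_i\Bigr)^2 \\
  &= \sum_{i=1}^{N} x_i^2 \;-\; \frac{1}{N}\Bigl(\sum_{i=1}^{N} x_i\Bigr)^2 ,
\\[6pt]
\sigma^2 = \frac{1}{N}\sum_{i=1}^{N} (x_i - \mu)^2
  &= \frac{N \sum_{i=1}^{N} x_i^2 - \bigl(\sum_{i=1}^{N} x_i\bigr)^2}{N^2} .
\end{aligned}
```

The computational form needs only the sum of the values and the sum of their squares, so it can be evaluated in a single pass over the data.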
- 46. Given the data below:

  23.3, 12.4, 58.1, 38.2, 14.0, 58.2, 75.4, 23.9, 23.9, 18.3, 22.0, 37.1, 31.4, 8.5, 1.0, 15.5, 6.9, 5.2, 28.7, 26.3, 13.9, 25.9, 26.8, 26.9, 16.8, 37.7, 10.6, 21.9, 31.6, 30.1, 42.4, 16.5, 21.1, 32.9, 8.8, 10.6, 28.6, 40.7, 12.9, 13.8

  - a) Construct the frequency distribution table with class boundaries -0.5 – 9.5, 9.5 – 19.5, 19.5 – 29.5, and so on.
  - b) Find i) the mean, ii) the median, iii) the mode, iv) the standard deviation.
- 47. Find the mean, median, mode and standard deviation.

  | Class limit | f |
  |---|---|
  | 20–29 | 30 |
  | 30–39 | 35 |
  | 40–49 | 20 |
  | 50–59 | 10 |
  | 60–69 | 5 |