# Statistics

Published in: Education

### Statistics

1. Slide 1 (2013/05/22)

#### Statistics: content

- X-Kit Textbook, Chapter 9
- Precalculus Textbook, Appendix B: Concepts in Statistics, Par. B.2

#### The goal

Look at ways of summarising a large amount of sample data in just one or two key numbers. Two important aspects of a set of data:

- The LOCATION
- The SPREAD

#### Measures of central tendency (location)

- Arithmetic mean (average)
- Mode (the highest point/frequency)
- Median (the middle observation)

#### Number of fraudulent cheques received at a bank each week for 30 weeks

| Week | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
|---|---|---|---|---|---|---|---|---|---|---|
| Cheques | 5 | 3 | 8 | 3 | 3 | 1 | 10 | 4 | 6 | 8 |
| **Week** | 11 | 12 | 13 | 14 | 15 | 16 | 17 | 18 | 19 | 20 |
| Cheques | 3 | 5 | 4 | 7 | 6 | 6 | 9 | 3 | 4 | 5 |
| **Week** | 21 | 22 | 23 | 24 | 25 | 26 | 27 | 28 | 29 | 30 |
| Cheques | 7 | 9 | 4 | 5 | 8 | 6 | 4 | 4 | 10 | 4 |

#### Arithmetic mean

- $\bar{x} = \frac{164}{30} = 5.47$
- To calculate the MEAN, add all the data points in our sample and divide by the number of data points (the sample size).
- The MEAN can be a value that doesn't actually match any observation.
- The MEAN gives us useful information about the location of our frequency distribution.
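As a quick check of the arithmetic, the mean of the 30 weekly cheque counts can be computed directly. This is only a minimal Python sketch; the names `cheques` and `arithmetic_mean` are illustrative and do not come from the slides.

```python
# Weekly counts of fraudulent cheques, weeks 1 to 30, as listed on the slide.
cheques = [5, 3, 8, 3, 3, 1, 10, 4, 6, 8,
           3, 5, 4, 7, 6, 6, 9, 3, 4, 5,
           7, 9, 4, 5, 8, 6, 4, 4, 10, 4]


def arithmetic_mean(data):
    """Add all the data points and divide by the sample size."""
    return sum(data) / len(data)


total = sum(cheques)            # sum of all observations
n = len(cheques)                # sample size
mean = arithmetic_mean(cheques)

print(total, n, round(mean, 2))  # 164 30 5.47
print(mean in cheques)           # False: the mean matches no observation
```

The second line of output illustrates the point that the mean need not coincide with any observed value.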
2. Slide 2

[Graph: frequency (vertical axis, 0 to 8) for the values 1 to 10 (horizontal axis)]

#### Calculate the mean

| Data form | Formula | Symbols |
|---|---|---|
| Raw data | $\bar{x} = \frac{\sum x}{n}$ | $x$ is the data points; $n$ is the number of observations |
| Frequency table | $\bar{x} = \frac{\sum xf}{n}$ | $x$ is the data points; $n$ is the number of observations; $f$ is the frequency |
| Frequency table (intervals) | $\bar{x} = \frac{\sum xf}{n}$ | $x$ is the midpoints of the intervals; $n$ is the number of observations; $f$ is the frequency |

#### Calculate the mean: frequency table

Number of fraudulent cheques per week

| Distinct values | Tally marks | Frequency |
|---|---|---|
| 1 | / | 1 |
| 2 | | 0 |
| 3 | //// | 5 |
| 4 | //// // | 7 |
| 5 | //// | 4 |
| 6 | //// | 4 |
| 7 | // | 2 |
| 8 | /// | 3 |
| 9 | // | 2 |
| 10 | // | 2 |

#### Truck data: weights (in tonnes) of 20 fully loaded trucks

| Truck | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
|---|---|---|---|---|---|---|---|---|---|---|
| Weight | 4.54 | 3.81 | 4.29 | 5.16 | 2.51 | 4.63 | 4.75 | 3.98 | 5.04 | 2.80 |
| **Truck** | 11 | 12 | 13 | 14 | 15 | 16 | 17 | 18 | 19 | 20 |
| Weight | 2.52 | 5.88 | 2.95 | 3.59 | 3.87 | 4.17 | 3.30 | 5.48 | 4.26 | 3.53 |

#### Calculate the mean: grouped frequency table

Truck data: weights (in tonnes) of 20 fully loaded trucks

| Class interval | Frequency | Midpoint |
|---|---|---|
| $2.5 \le x \le 3.0$ | 4 | $(2.5 + 3.0) \div 2 = 2.75$ |
| $3.0 < x \le 3.5$ | 1 | 3.25 |
| $3.5 < x \le 4.0$ | 5 | 3.75 |
| $4.0 < x \le 4.5$ | 3 | 4.25 |
| $4.5 < x \le 5.0$ | 3 | 4.75 |
| $5.0 < x \le 5.5$ | 3 | 5.25 |
| $5.5 < x \le 6.0$ | 1 | 5.75 |

#### Mode

- The mode is the value (or, for grouped data, the interval) with the HIGHEST FREQUENCY.
- There can be two or more modes in a set of data; the mode is then not a good measure of central tendency.
- MULTI-MODAL data have more than one mode.
- UNI-MODAL data have only one mode.
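The frequency-table formulas for the mean, and the idea of the mode as the highest frequency, can be sketched in Python as follows. The data layouts (a dictionary for distinct values, tuples for class intervals) and all function names are assumptions made for illustration; the numbers are the cheque and truck tables from the slide.

```python
# Ungrouped frequency table (cheque data): distinct value -> frequency.
cheque_freq = {1: 1, 2: 0, 3: 5, 4: 7, 5: 4, 6: 4, 7: 2, 8: 3, 9: 2, 10: 2}

# Grouped frequency table (truck data): (lower bound, upper bound, frequency).
truck_classes = [(2.5, 3.0, 4), (3.0, 3.5, 1), (3.5, 4.0, 5), (4.0, 4.5, 3),
                 (4.5, 5.0, 3), (5.0, 5.5, 3), (5.5, 6.0, 1)]


def mean_from_frequencies(freq):
    """Mean = sum(x * f) / n, where n is the sum of the frequencies."""
    n = sum(freq.values())
    return sum(x * f for x, f in freq.items()) / n


def mean_from_classes(classes):
    """Same formula, with x replaced by the midpoint of each interval."""
    n = sum(f for _, _, f in classes)
    return sum((lower + upper) / 2 * f for lower, upper, f in classes) / n


def modes(freq):
    """All values that share the highest frequency (more than one if multi-modal)."""
    highest = max(freq.values())
    return [x for x, f in freq.items() if f == highest]


print(round(mean_from_frequencies(cheque_freq), 2))  # 5.47
print(round(mean_from_classes(truck_classes), 3))    # 4.075
print(modes(cheque_freq))                            # [4]
```

The grouped mean is an approximation, because every observation in a class is replaced by the class midpoint.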
3. Slide 3

[Graph: The MODE = 4. Frequency (vertical axis, 0 to 8) for the values 1 to 10 (horizontal axis)]

#### Call centre data: waiting times (in seconds) for 35 randomly selected customers

| Customer | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Time (s) | 75 | 37 | 13 | 90 | 45 | 23 | 104 | 135 | 30 | 73 | 34 | 12 |
| **Customer** | 13 | 14 | 15 | 16 | 17 | 18 | 19 | 20 | 21 | 22 | 23 | 24 |
| Time (s) | 38 | 40 | 22 | 47 | 26 | 57 | 65 | 33 | 9 | 85 | 87 | 16 |
| **Customer** | 25 | 26 | 27 | 28 | 29 | 30 | 31 | 32 | 33 | 34 | 35 | |
| Time (s) | 102 | 115 | 68 | 29 | 142 | 5 | 15 | 10 | 25 | 41 | 49 | |

#### Frequency table

The MODAL CLASS is the interval $25 < x \le 50$.

| Class interval | Tally marks | Frequency |
|---|---|---|
| $0 \le x \le 25$ | //// //// | 10 |
| $25 < x \le 50$ | //// //// / | 11 |
| $50 < x \le 75$ | //// / | 6 |
| $75 < x \le 100$ | /// | 3 |
| $100 < x \le 125$ | /// | 3 |
| $125 < x \le 150$ | // | 2 |

[Histogram: modal class $(25; 50]$. Intervals [0;25], (25;50], (50;75], (75;100], (100;125], (125;150] on the horizontal axis; frequency 0 to 12 on the vertical axis]

#### The median: raw data

Number of fraudulent cheques received at a bank each week for 30 weeks (the same 30 observations as on slide 1).

#### Median

- Median = 5
- Put all observations in order from smallest to largest; the middle observation is the MEDIAN.

1, 3, 3, 3, 3, 3, 4, 4, 4, 4, 4, 4, 4, 5, 5, 5, 5, 6, 6, 6, 6, 7, 7, 8, 8, 8, 9, 9, 10, 10
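The "sort, then take the middle observation" recipe and the modal class lookup can be sketched in Python as below. The function names are illustrative, and the waiting-time table is entered with the frequencies exactly as printed on the slide. For an even number of observations the sketch averages the two middle values, which is also what Python's `statistics.median` does.

```python
import statistics

cheques = [5, 3, 8, 3, 3, 1, 10, 4, 6, 8,
           3, 5, 4, 7, 6, 6, 9, 3, 4, 5,
           7, 9, 4, 5, 8, 6, 4, 4, 10, 4]


def median_raw(data):
    """Order the observations and return the middle one."""
    ordered = sorted(data)
    n = len(ordered)
    middle = n // 2
    if n % 2 == 1:
        return ordered[middle]
    # Even n: half-way between the two middle observations.
    return (ordered[middle - 1] + ordered[middle]) / 2


# Call centre frequency table as printed on the slide:
# (lower bound, upper bound) -> frequency.
waiting_time_table = {(0, 25): 10, (25, 50): 11, (50, 75): 6,
                      (75, 100): 3, (100, 125): 3, (125, 150): 2}


def modal_class(table):
    """The class interval with the highest frequency."""
    return max(table, key=table.get)


print(median_raw(cheques))                                # 5.0
print(median_raw(cheques) == statistics.median(cheques))  # True
print(modal_class(waiting_time_table))                    # (25, 50)
```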
4. Slide 4

#### Don't fall into the common trap

- The median is NOT the middle of the range of observations. For example: 1, 1, 1, 1, 1, 3, 9
  - The median is 1 (the middle observation).
  - The middle of the range (1 to 9) is 5. Big difference!

#### Median

| Number of observations | Median position |
|---|---|
| Odd, for example 7 | $\frac{n+1}{2}$ |
| Even, for example 30 | half-way between $\frac{n}{2}$ and $\frac{n}{2} + 1$ |

#### Find the median: frequency table

Number of fraudulent cheques per week

| Distinct values | Frequency | Cumulative frequency |
|---|---|---|
| 1 | 1 | 1 |
| 2 | 0 | 1 |
| 3 | 5 | 6 |
| 4 | 7 | 13 |
| 5 | 4 | 17 |
| 6 | 4 | 21 |
| 7 | 2 | 23 |
| 8 | 3 | 26 |
| 9 | 2 | 28 |
| 10 | 2 | 30 |

#### Find the median: grouped frequency table

Truck data: weights (in tonnes) of 20 fully loaded trucks (class intervals, frequencies and midpoints as on slide 2).

- Median (middle observation)?
- Find the class interval in which that observation lies.

#### Calculations

- Raw data: mean, mode, median
- Frequency table (ungrouped data): mean, mode, median
- Frequency table (grouped data): mean, mode, median
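Reading the median off a cumulative frequency column can be sketched in Python as follows. The helper `observation_at` and the variable names are hypothetical; the values and frequencies are the cheque table from the slide. The last lines repeat the "common trap" example.

```python
from itertools import accumulate

values = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
freqs = [1, 0, 5, 7, 4, 4, 2, 3, 2, 2]
cumulative = list(accumulate(freqs))  # running total of the frequencies


def observation_at(rank):
    """Value of the rank-th smallest observation (rank counts from 1)."""
    for value, cum in zip(values, cumulative):
        if rank <= cum:
            return value
    raise ValueError("rank exceeds the number of observations")


n = cumulative[-1]
if n % 2 == 1:
    # Odd n: the observation at position (n + 1) / 2.
    median = observation_at((n + 1) // 2)
else:
    # Even n: half-way between positions n / 2 and n / 2 + 1.
    median = (observation_at(n // 2) + observation_at(n // 2 + 1)) / 2

print(cumulative)  # [1, 1, 6, 13, 17, 21, 23, 26, 28, 30]
print(median)      # 5.0

# The common trap: median versus middle of the range.
trap = sorted([1, 1, 1, 1, 1, 3, 9])
trap_median = trap[(len(trap) + 1) // 2 - 1]  # 4th of 7 observations
trap_midrange = (trap[0] + trap[-1]) / 2
print(trap_median, trap_midrange)  # 1 5.0
```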
5. Slide 5

#### How to choose the best measure of location?

- When choosing the best measure of location, we need to look at the SHAPE of the distribution.
- For nearly symmetric data, the mean is the best choice.
- For very skewed (asymmetric) data, the mode or median is better.
- The mean moves further along the tail than the median; it is more sensitive to values far from the centre.

Shapes of histograms:

- SYMMETRIC histogram: Mean = Median = Mode
- A POSITIVELY SKEWED (skewed to the right) histogram has a longer tail on the right side: Mode < Median < Mean
- A NEGATIVELY SKEWED (skewed to the left) histogram has a longer tail on the left side: Mean < Median < Mode

#### Problem

- We can find two very different data sets (one distribution very spread out and another very concentrated) with EQUAL measures of central tendency.
- To get a true idea of our sample, we have to MEASURE THE SPREAD OF A DISTRIBUTION, also called the dispersion.

#### Measures of spread (dispersion)

- Interquartile range
- Variance
- Standard deviation
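The ordering rules for skewness, and the problem of equal centres with different spreads, can be illustrated with a short Python sketch. The `shape` function applies the orderings only as a rule of thumb; the two small data sets `concentrated` and `spread_out` are invented for illustration and are not from the slides. The use of the sample standard deviation (`statistics.stdev`) is likewise an assumption, since the slides do not say which version they intend.

```python
import statistics as st

cheques = [5, 3, 8, 3, 3, 1, 10, 4, 6, 8,
           3, 5, 4, 7, 6, 6, 9, 3, 4, 5,
           7, 9, 4, 5, 8, 6, 4, 4, 10, 4]


def shape(data):
    """Rule-of-thumb shape from the ordering of mode, median and mean."""
    mean, median, mode = st.mean(data), st.median(data), st.mode(data)
    if mean == median == mode:
        return "symmetric"
    if mode < median < mean:
        return "positively skewed"
    if mean < median < mode:
        return "negatively skewed"
    return "unclear"


print(shape(cheques))  # positively skewed (mode 4 < median 5 < mean 5.47)

# Hypothetical data sets: same mean, median and mode, different spread.
concentrated = [4, 5, 5, 5, 6]
spread_out = [0, 5, 5, 5, 10]

for data in (concentrated, spread_out):
    print(st.mean(data), st.median(data), st.mode(data),
          round(st.stdev(data), 2))
```

Both invented data sets report a mean, median and mode of 5, yet their standard deviations differ, which is why a measure of spread is needed alongside a measure of location.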
6. Slide 6

#### Measuring spread

- Think of a distribution in terms of percentages: a horizontal axis equally divided into 100 percentiles.
- The 10th percentile marks the point below which 10% of the observations fall, and above which 90% of the observations fall.
- The 50th percentile, below which 50% of the observations lie, is the median.

#### Working with a percentile

- $p\%$ of the observations fall below the $p^{\text{th}}$ percentile.

  $$\text{Position} = \frac{p}{100}(n + 1)$$

- Working with the example on fraudulent cheques:

  1, 3, 3, 3, 3, 3, 4, 4, 4, 4, 4, 4, 4, 5, 5, 5, 5, 6, 6, 6, 6, 7, 7, 8, 8, 8, 9, 9, 10, 10

  $$\text{Position of } P_{50} = \frac{50}{100}(30 + 1) = 15.5$$

- 15.5 tells us where to find our 50th percentile.
- 15 tells us which observation to go to, and 0.5 tells us how far to move along the space between that observation and the next highest one.

#### Formula

- $P_{50} = x_{15} + 0.5\,(x_{16} - x_{15})$
- $P_p = x_k + d\,(x_{k+1} - x_k)$
- $P$ means percentile
- $p$ tells us which percentile
- $k$ is the whole number calculated from the position
- $d$ is the decimal fraction calculated from the position

#### Working with percentiles from ungrouped frequency data

Number of fraudulent cheques per week

| Distinct values | Frequency | Cumulative frequency |
|---|---|---|
| 1 | 1 | 1 |
| 2 | 0 | 1 + 0 = 1 |
| 3 | 5 | 1 + 5 = 6 |
| 4 | 7 | 6 + 7 = 13 |
| 5 | 4 | 13 + 4 = 17 |
| 6 | 4 | 17 + 4 = 21 |
| 7 | 2 | 21 + 2 = 23 |
| 8 | 3 | 23 + 3 = 26 |
| 9 | 2 | 26 + 2 = 28 |
| 10 | 2 | 28 + 2 = 30 |

#### Working with percentiles (and median) from grouped data

- To identify the class interval $L < x \le U$ containing the $p^{\text{th}}$ percentile:

  $$\text{Position} = \frac{p}{100}(n + 1)$$

- The decimal fraction for grouped data is:

  $$d = \frac{\text{Position} - \text{sum of class frequencies up to } L}{\text{frequency of class } L < x \le U}$$

- Calculate the $p^{\text{th}}$ percentile:

  $$P_p \approx L + d\,(U - L)$$

#### Find the median: grouped frequency table

Truck data: weights (in tonnes) of 20 fully loaded trucks

| Class interval | Frequency | Cumulative frequency |
|---|---|---|
| $2.5 \le x \le 3.0$ | 4 | 4 |
| $3.0 < x \le 3.5$ | 1 | 5 |
| $3.5 < x \le 4.0$ | 5 | 10 |
| $4.0 < x \le 4.5$ | 3 | 13 |
| $4.5 < x \le 5.0$ | 3 | 16 |
| $5.0 < x \le 5.5$ | 3 | 19 |
| $5.5 < x \le 6.0$ | 1 | 20 |
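The two percentile recipes (position, whole number, decimal fraction, then interpolation) can be sketched in Python as below. The function names and the tuple layout of the grouped table are illustrative. The handling of positions that fall before the first or after the last observation is an assumption of this sketch, since the slides do not cover that case.

```python
cheques = [5, 3, 8, 3, 3, 1, 10, 4, 6, 8,
           3, 5, 4, 7, 6, 6, 9, 3, 4, 5,
           7, 9, 4, 5, 8, 6, 4, 4, 10, 4]

# Grouped truck data: (lower bound L, upper bound U, frequency).
truck_classes = [(2.5, 3.0, 4), (3.0, 3.5, 1), (3.5, 4.0, 5), (4.0, 4.5, 3),
                 (4.5, 5.0, 3), (5.0, 5.5, 3), (5.5, 6.0, 1)]


def percentile(data, p):
    """P_p = x_k + d * (x_{k+1} - x_k), with position = p / 100 * (n + 1)."""
    ordered = sorted(data)
    n = len(ordered)
    position = p / 100 * (n + 1)
    k = int(position)   # whole number: which observation to go to
    d = position - k    # decimal fraction: how far towards the next one
    if k < 1:           # assumption: clamp to the smallest observation
        return ordered[0]
    if k >= n:          # assumption: clamp to the largest observation
        return ordered[-1]
    return ordered[k - 1] + d * (ordered[k] - ordered[k - 1])


def grouped_percentile(classes, p):
    """P_p is approximately L + d * (U - L) for the class holding the position."""
    n = sum(f for _, _, f in classes)
    position = p / 100 * (n + 1)
    below = 0  # sum of class frequencies up to L
    for lower, upper, f in classes:
        if f > 0 and position <= below + f:
            d = (position - below) / f
            return lower + d * (upper - lower)
        below += f
    return classes[-1][1]  # assumption: position beyond the last class


print(percentile(cheques, 50))                           # 5.0
print(percentile(cheques, 25), percentile(cheques, 75))  # 4.0 7.25
print(round(grouped_percentile(truck_classes, 50), 3))   # 4.083
```

For the truck data the position is 10.5, which lies in the class $4.0 < x \le 4.5$ (cumulative frequency 10 below it, frequency 3), so $d = 0.5 / 3$ and the median is approximately $4.0 + \frac{0.5}{3} \times 0.5 \approx 4.083$.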