Wynberg girls high-Jade Gibson-maths-data analysis statistics

Powerpoint slides for data analysis in statistics

Published in: Education

  1. 1. Data Analysis Chapter 10
  2. 2. Types of Data <ul><li>Quantitative data is data recorded with numbers </li></ul><ul><ul><li>eg: learner’s weight or number of goals </li></ul></ul><ul><li>Qualitative data is data recorded in words </li></ul><ul><ul><li>eg: favourite colours </li></ul></ul>
  3. 3. … types of data cont. <ul><li>Within these two types of data we can also look at … </li></ul><ul><ul><li>Discrete data – information collected by counting (1, 2, 3 … no halves/quarters etc) </li></ul></ul><ul><ul><li>Continuous data – information collected by measurement (may have decimals and fractions) </li></ul></ul><ul><ul><li>Do Ex 10.8 Q1 (Pg 232) </li></ul></ul>
  4. 4. Data Interpretation <ul><li>Once data has been collected and sorted, it has to be interpreted and analysed </li></ul><ul><li>Two types of interpretation: </li></ul><ul><ul><li>Pictorial methods: involve drawing graphs </li></ul></ul>
  5. 5. <ul><ul><li>Arithmetic methods: involve working out: </li></ul></ul><ul><ul><ul><li>Measures of central tendency – mean, median and mode </li></ul></ul></ul><ul><ul><ul><li>Measures of dispersion – range, percentiles, quartiles and the interquartile range </li></ul></ul></ul>
  6. 6. Displaying data (Pictorial methods) <ul><li>Histograms – no gaps (quantitative data) </li></ul><ul><li>Bar Graphs – bars do not touch </li></ul><ul><li>Compound bar graphs </li></ul><ul><ul><li>Dual bar graph – data displayed next to each other </li></ul></ul><ul><ul><li>Sectional bar graph – data displayed ‘on-top of one another’ </li></ul></ul><ul><li>Pie Charts </li></ul><ul><li>Broken line graphs </li></ul>
  7. 7. Ex. 10.2 (3)
  8. 8. 10.3 (1)
  9. 9. 10.4 (2)
  10. 10. Misleading graphs <ul><li>Ways graphs/charts can be misleading: </li></ul><ul><ul><li>Using 3D in pictograms/bar-charts </li></ul></ul><ul><ul><li>Using perspective/shape to exaggerate </li></ul></ul><ul><ul><li>Reversing the direction of an axis (to make a decrease seem like an increase) </li></ul></ul><ul><ul><li>Altering the scale of the y-axis (to make it look more or less steep) </li></ul></ul><ul><ul><li>Leaving part of the axis out to exaggerate differences </li></ul></ul><ul><ul><li>http://www.coolschool.ca/lor/AMA11/unit1/U01L02.htm# </li></ul></ul>
  11. 11. Misleading statistics <ul><li>Stats are notorious for being made up or misleading </li></ul><ul><li>E.g.: during a political debate in the USA, a member of the opposition claimed that employment had gone up during the President’s term of office. It had, but only because the population had increased; the number of unemployed people had also increased. </li></ul>
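The point of the example above can be checked with a small calculation. The figures below are invented purely for illustration (they are not from the debate), and the "employment rate" here is simply employed people as a percentage of employed plus unemployed, which is an assumption of this sketch.

```python
# Hypothetical figures, invented for illustration only.
start = {"employed": 950_000, "unemployed": 50_000}
end = {"employed": 1_140_000, "unemployed": 60_000}

def employment_rate(year):
    """Employed people as a percentage of employed + unemployed."""
    return 100 * year["employed"] / (year["employed"] + year["unemployed"])

employed_change = end["employed"] - start["employed"]
unemployed_change = end["unemployed"] - start["unemployed"]
rate_start = employment_rate(start)
rate_end = employment_rate(end)

print(employed_change, unemployed_change, rate_start, rate_end)
```

Both raw counts rise, yet the rate does not move, so quoting only the rise in the number of employed people would give a misleading picture.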
  12. 12. <ul><li>“86% of statistics are made up on the spot and the remaining 24% are flawed” </li></ul>
  13. 13. Measures of central tendency <ul><li>Mean, mode and median </li></ul><ul><li>“Averages” </li></ul>
  14. 14. … <ul><li>Mean (x̄) is like the average: </li></ul><ul><ul><li>Mean = (sum of values) ÷ (number of values) </li></ul></ul><ul><ul><li>Can be affected by outliers, so it is not a good measure of central tendency when the data contains outliers </li></ul></ul>
  15. 15. … <ul><li>Median is the middle value when the data is placed in numerical order (smallest to biggest) </li></ul><ul><ul><li>If there are outliers then median is a better measure of central tendency </li></ul></ul><ul><li>Mode/Modal value is the value that appears the most </li></ul>
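As a rough illustration of the three measures, here is a minimal sketch using Python's standard `statistics` module. The data set is made up (it is not from the textbook exercises) and deliberately contains one outlier.

```python
from statistics import mean, median, multimode

# Made-up data set with one outlier (95); not taken from the slides.
scores = [4, 5, 5, 6, 7, 8, 95]

scores_mean = mean(scores)        # sum of values / number of values
scores_median = median(scores)    # middle value once the data is sorted
scores_modes = multimode(scores)  # most frequent value(s), returned as a list

print(scores_mean, scores_median, scores_modes)
```

The outlier pulls the mean up to about 18.6, while the median stays at 6, which is why the median is the better measure here.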
  16. 16. Things which can help with measures of central tendency <ul><li>Frequency tables </li></ul><ul><ul><li>Simple tables </li></ul></ul><ul><ul><li>Or for grouped data </li></ul></ul><ul><li>Stem and Leaf diagrams – these are especially helpful for data with more than ten items </li></ul>
  17. 17. 10.9 (3) 5 19 8, 6, 3 20 7, 5, 5, 4, 0 0, 0 18 9, 9, 4 6, 6, 8, 8 17 9, 6, 3 2, 4, 5, 5, 7 16 0 6, 8, 8, 8 15 5, 1, 0 14 9, 6 13 9 12 4 11 0, 8 10 Robert Jabu Heights of mealie plants (in cm)
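  18. 18. 10.10 (1) <table><tr><th>Stem</th><th>Leaves</th></tr><tr><td>2</td><td>9</td></tr><tr><td>3</td><td>2, 6, 9</td></tr><tr><td>4</td><td>4, 5, 5, 5, 5, 7, 7</td></tr><tr><td>5</td><td>0, 1, 2, 2, 2, 4, 4, 6, 9, 9</td></tr><tr><td>6</td><td>4, 5, 9</td></tr><tr><td>7</td><td>0, 2, 7, 8, 9</td></tr><tr><td>8</td><td>0, 2, 2, 7</td></tr><tr><td>9</td><td>0</td></tr><tr><td>10</td><td>0</td></tr></table>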
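A stem-and-leaf diagram can also be built programmatically. The sketch below assumes non-negative whole numbers, uses the last digit as the leaf, and runs on made-up marks rather than the exercise data; `stem_and_leaf` is a hypothetical helper name.

```python
from collections import defaultdict

def stem_and_leaf(values):
    """Group each value's last digit (leaf) under its leading digits (stem)."""
    rows = defaultdict(list)
    for v in sorted(values):
        stem, leaf = divmod(v, 10)
        rows[stem].append(leaf)
    return dict(rows)

# Made-up marks, not taken from the exercise.
marks = [32, 45, 47, 51, 52, 52, 68, 70]
for stem, leaves in stem_and_leaf(marks).items():
    print(stem, "|", ", ".join(str(leaf) for leaf in leaves))
```

Note that this sketch leaves out stems that have no leaves; a hand-drawn diagram would normally still list them.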
  19. 19. Grouped data <ul><li>When the data contains many different measurements, it is usually grouped into intervals (classes). Try to have between 8 and 14 classes, and start the first class at a value below the minimum in the data. </li></ul><ul><li>Tally: lines used to count up the frequency of scores </li></ul><ul><li>Frequency is the number of times that score/value appears </li></ul>
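The grouping step can be sketched in code. This example assumes whole-number data and equal-width classes, and uses made-up scores; `frequency_table` and its parameters are hypothetical names. Only two classes are used to keep the output short, and the tally is drawn as one stroke per value rather than in bundles of five.

```python
def frequency_table(values, start, width, n_classes):
    """Count how many values fall in each class [lower, lower + width - 1]."""
    table = []
    for i in range(n_classes):
        lower = start + i * width
        upper = lower + width - 1
        freq = sum(1 for v in values if lower <= v <= upper)
        table.append((f"{lower}-{upper}", "/" * freq, freq))
    return table

# Made-up whole-number scores.
scores = [2, 3, 5, 6, 6, 7, 8, 9, 10]
for label, tally, freq in frequency_table(scores, start=1, width=5, n_classes=2):
    print(label, tally, freq)
```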
  20. 20. Example of a ‘Grouped data table’ <table><tr><th>Classes</th><th>Tally</th><th>Frequency (f)</th><th>Midpoint (X)</th><th>fX (Frequency × midpoint)</th></tr><tr><td>1-5</td><td>///</td><td>3</td><td>(1+5) ÷ 2 = 3</td><td>9</td></tr><tr><td>6-10</td><td>//// /</td><td>6</td><td>8</td><td>48</td></tr></table> <ul><li>Midpoint is the midpoint of that interval; calculated as in the table above </li></ul><ul><li>fX = frequency multiplied by midpoint </li></ul>
  21. 21. Analysing the grouped data <ul><ul><li>We can calculate: </li></ul></ul><ul><ul><ul><li>Actual mean (x̄) = (sum of values) ÷ (number of values) </li></ul></ul></ul><ul><ul><ul><li>Estimated mean (X̄) = (sum of ‘fX’ values) ÷ (number of values) </li></ul></ul></ul><ul><ul><li>We can draw a graph using the data: </li></ul></ul><ul><ul><ul><li>eg: a histogram with ‘classes’ on the x-axis and ‘frequency’ on the y-axis </li></ul></ul></ul>
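The estimated mean can be worked through in a few lines. This sketch uses the two classes from the grouped data table on slide 20 (1-5 with frequency 3, and 6-10 with frequency 6); the function name and the tuple layout are assumptions made for illustration.

```python
def estimated_mean(classes):
    """classes: list of (lower, upper, frequency) tuples."""
    total_fx = 0
    total_f = 0
    for lower, upper, f in classes:
        midpoint = (lower + upper) / 2   # X
        total_fx += f * midpoint         # running sum of fX
        total_f += f                     # running number of values
    return total_fx / total_f

table = [(1, 5, 3), (6, 10, 6)]
print(estimated_mean(table))  # (9 + 48) / 9
```

It is only an estimate because every value in a class is treated as if it sat at the class midpoint.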
  22. 22. … <ul><ul><li>We can find both a mode and modal class: </li></ul></ul><ul><ul><ul><li>Mode: value that appears most </li></ul></ul></ul><ul><ul><ul><li>Modal class: class (interval) with highest frequency </li></ul></ul></ul><ul><ul><li>We can estimate the median from a histogram: </li></ul></ul><ul><ul><ul><li>By estimating the value at which the ‘area’ of the histogram is divided into two equal parts </li></ul></ul></ul>
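Both ideas on this slide can be sketched in code. The example assumes equal-width classes given by their boundaries, so that each bar's area is proportional to its frequency, and it estimates the median by interpolating inside the class where half of the total area is reached. The classes and function names are made up for illustration.

```python
def modal_class(classes):
    """Return the (lower, upper) boundaries of the class with the highest frequency."""
    lower, upper, _ = max(classes, key=lambda c: c[2])
    return (lower, upper)

def estimated_median(classes):
    """Find the value that splits the histogram's total area in half."""
    half = sum(f for _, _, f in classes) / 2
    cumulative = 0
    for lower, upper, f in classes:
        if f > 0 and cumulative + f >= half:
            # Move into this bar by the fraction of its area still needed.
            return lower + (upper - lower) * (half - cumulative) / f
        cumulative += f
    raise ValueError("no data")

# Made-up classes as (lower boundary, upper boundary, frequency).
classes = [(0, 10, 4), (10, 20, 10), (20, 30, 6)]
print(modal_class(classes), estimated_median(classes))
```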
  23. 23. Histograms and frequency polygons <ul><li>Histograms and frequency polygons are both ‘frequency graphs’ </li></ul><ul><ul><li>The difference between them is that the histogram is made up of bars, whereas the frequency polygon is a line graph </li></ul></ul><ul><ul><li>The ‘polygon’ is made from the lines of the graph and the horizontal axis </li></ul></ul>
  24. 24. Drawing Frequency Polygons (2 methods) <ul><li>1) Using the bars of a histogram </li></ul><ul><ul><li>Mark the midpoint of the top of each bar </li></ul></ul><ul><ul><li>Join the points; including two points at zero on either side of the histogram </li></ul></ul>
  25. 25. … <ul><li>2) Without using a histogram: </li></ul><ul><ul><li>Plot the midpoint of each interval against the frequency </li></ul></ul><ul><ul><li>Join the points; and add the two “zero” points on either side as with the histogram </li></ul></ul>
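The second method amounts to computing a list of points to join. A minimal sketch, assuming equal-width classes listed in ascending order (the classes and the name `polygon_points` are invented for illustration):

```python
def polygon_points(classes):
    """classes: equal-width (lower, upper, frequency) tuples in ascending order."""
    width = classes[0][1] - classes[0][0]
    # One point per class: (midpoint of the interval, frequency).
    points = [((lo + hi) / 2, f) for lo, hi, f in classes]
    first_mid = points[0][0]
    last_mid = points[-1][0]
    # Add a zero-frequency point one class width beyond each end.
    return [(first_mid - width, 0)] + points + [(last_mid + width, 0)]

# Made-up classes.
classes = [(0, 10, 4), (10, 20, 10), (20, 30, 6)]
print(polygon_points(classes))
```

Joining these points in order with straight lines gives the frequency polygon, closed off by the horizontal axis.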
  26. 26. Measures of Dispersion <ul><li>Tell us how the data is grouped around the “average” </li></ul><ul><li>Is it closely grouped, or scattered widely? </li></ul><ul><li>Measure of spread, scattering or dispersion of scores </li></ul>
  27. 27. Range <ul><li>Range = largest value – smallest value </li></ul><ul><ul><li>Has a few limitations in that it cannot be used for ‘grouped data’; and it doesn’t tell us anything about the distribution of the values between the largest and smallest </li></ul></ul><ul><ul><li>For this reason we can also look at quartiles, deciles and/or percentiles </li></ul></ul>
  28. 28. Quartiles, Percentiles and Deciles <ul><li>Quartiles : are points that subdivide the data into quarters </li></ul><ul><li>Deciles : are points that subdivide the data into tenths </li></ul><ul><li>Percentiles : are points that subdivide the data into hundredths </li></ul>
  29. 29. Quartiles <ul><li>First/lower quartile (Q<sub>1</sub>) : is one quarter of the way through the data set when ordered from lowest to highest </li></ul><ul><li>Second quartile (Q<sub>2</sub>) = median </li></ul><ul><li>Third/upper quartile (Q<sub>3</sub>) : is three quarters of the way through the data set (in order) </li></ul>
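There are several conventions for locating quartiles, and the slides do not say which one the textbook uses. The sketch below assumes the common "median of each half" method, leaving the middle value out of both halves when the number of values is odd; other methods can give slightly different answers. The data is made up.

```python
from statistics import median

def quartiles(values):
    """Return (Q1, Q2, Q3) using the 'median of each half' method."""
    data = sorted(values)
    n = len(data)
    lower_half = data[: n // 2]          # values below the middle
    upper_half = data[(n + 1) // 2 :]    # values above the middle
    return median(lower_half), median(data), median(upper_half)

# Made-up data set.
print(quartiles([1, 3, 4, 6, 7, 8, 10, 12, 15]))
```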
  30. 30. <ul><li>Interquartile range = third quartile – first quartile </li></ul><ul><li>The interquartile range is a better measure of dispersion than the range as it is not affected by ‘extreme’ values </li></ul><ul><li>It indicates how densely the data is spread around the median </li></ul>
  31. 31. <ul><li>Semi-interquartile range = (Q<sub>3</sub> – Q<sub>1</sub>) ÷ 2 </li></ul><ul><li>It is half of the interquartile range </li></ul>
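To see why the interquartile range is less affected by extreme values than the range, the sketch below compares two made-up data sets that differ only in their largest value. It assumes the "median of each half" method for the quartiles, which may not be the method the textbook uses; `spread` is a hypothetical helper.

```python
from statistics import median

def spread(values):
    """Return (range, interquartile range, half of the interquartile range)."""
    data = sorted(values)
    n = len(data)
    q1 = median(data[: n // 2])          # median of the lower half
    q3 = median(data[(n + 1) // 2 :])    # median of the upper half
    iqr = q3 - q1
    return data[-1] - data[0], iqr, iqr / 2

typical = [2, 4, 4, 5, 6, 7, 8, 9, 10]
with_extreme = [2, 4, 4, 5, 6, 7, 8, 9, 50]

print(spread(typical))
print(spread(with_extreme))
```

The range jumps from 8 to 48 when the extreme value is introduced, while the interquartile range and its half stay the same.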
