Analyzing Data
There are three kinds of lies - lies, damned lies and statistics. ~Benjamin Disraeli
Advanced Biology
Mrs. Morgan
Using Data
Statistics: The only science that enables different experts using the same figures to draw different conclusions. - Evan Esar
After collecting data during lab investigations, there are many ways to organize and analyze it.
Presenting Data
• Always present data in charts and graphs as well as in words
• Example:
 – Table 1 shows the heart rate of subjects before and after exercise. The average of subjects’ heart rates shows a rise of 20.2 beats per minute after exercise.

Table 1. Heart rate (HR, beats per minute) before and after exercise
Subject   HR Before Exercise   HR After Exercise
1         60                   84
2         76                   80
3         62                   90
4         78                   110
5         70                   92
6         66                   92
7         70                   88
8         74                   80
9         78                   100
10        68                   88
Avg       70.2                 90.4
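A quick way to double-check the Avg row of Table 1 is to recompute it from the ten rows. The snippet below is only a sketch in Python, which the slides themselves do not use; the variable names are made up for illustration.

```python
from statistics import mean

# Heart rates (beats per minute) copied from Table 1, subjects 1-10.
hr_before = [60, 76, 62, 78, 70, 66, 70, 74, 78, 68]
hr_after = [84, 80, 90, 110, 92, 92, 88, 80, 100, 88]

# Average of each column, and the average rise after exercise.
avg_before = mean(hr_before)
avg_after = mean(hr_after)
rise = avg_after - avg_before

print(f"Average before: {avg_before:.1f}")
print(f"Average after:  {avg_after:.1f}")
print(f"Average rise:   {rise:.1f} beats per minute")
```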
Simple Data Analysis
Mean (average): sum of all measurements divided by the total # of measurements (duh…)
Median: the middle number in a series of measurements
Range: the difference between the highest and lowest values in a series of measurements

Example
Data set: 2 4 5 7 10
Mean: (2+4+5+7+10)/5 = 5.6
Median: middle number = 5
Range: 10 – 2 = 8
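The same three numbers can be checked with a few lines of Python. This is a minimal sketch using the standard `statistics` module on the example data set above; the variable names are just for illustration.

```python
from statistics import mean, median

data = [2, 4, 5, 7, 10]

data_mean = mean(data)              # sum of measurements / number of measurements
data_median = median(data)          # middle value once the data are in order
data_range = max(data) - min(data)  # highest value minus lowest value

print(data_mean, data_median, data_range)  # prints: 5.6 5 8
```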
More Analysis
The Q-Test
 – Used to determine if a data point should be left out of analysis calculations
 – Example: data set includes 45, 48, 52, 43, 89, 56, 48, 47, 44, 51, 50 (One of these things is not like the others…)
A Q-test decides if the analysis of the data set should include the 89 or not
Q-Test
Q = gap / range
Gap: distance between the outlier and nearest data point
45, 48, 52, 43, 89, 56, 48, 47, 44, 51, 50
It helps to put the data points in numerical order
Q = (89-56) / (89-43) = 33 / 46 = .717
So what do we do with this number?
Q-Test
Use a Q-table for the expected Q value
N = number of data points

N-1   Q-value
3     .94
4     .76
5     .64
6     .56
7     .51
8     .47
9     .44
10    .41

N-1 = 10
If calculated Q value is greater than expected Q value - discard the data point
Qcalc = .717 > Qexp = .41
Discard point 89
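The Q-test steps (order the data, find the gap, divide by the range, compare with the table) can be sketched in Python. This is an illustration under stated assumptions, not an official procedure: the function name `q_test` is made up, and the table is keyed by N-1 exactly as on the slide. Published Q-tables differ in how they are indexed and in confidence level, so check the convention of whichever table you use.

```python
# Expected Q values copied from the slide's Q-table, keyed by N-1.
Q_TABLE = {3: 0.94, 4: 0.76, 5: 0.64, 6: 0.56, 7: 0.51, 8: 0.47, 9: 0.44, 10: 0.41}

def q_test(data, q_table=Q_TABLE):
    """Return (suspect, q_calc, q_exp, discard) for the most extreme point."""
    ordered = sorted(data)                # it helps to put the points in order
    data_range = ordered[-1] - ordered[0]
    if data_range == 0:
        raise ValueError("all points are identical; Q is undefined")
    low_gap = ordered[1] - ordered[0]     # gap if the lowest point is the suspect
    high_gap = ordered[-1] - ordered[-2]  # gap if the highest point is the suspect
    if high_gap >= low_gap:
        suspect, gap = ordered[-1], high_gap
    else:
        suspect, gap = ordered[0], low_gap
    q_calc = gap / data_range
    q_exp = q_table[len(ordered) - 1]     # KeyError if N-1 is not in the table
    return suspect, q_calc, q_exp, q_calc > q_exp

data = [45, 48, 52, 43, 89, 56, 48, 47, 44, 51, 50]
suspect, q_calc, q_exp, discard = q_test(data)
print(f"suspect = {suspect}, Qcalc = {q_calc:.3f}, Qexp = {q_exp}, discard = {discard}")
```

With the example data set this flags 89 as the suspect point and reports that it should be discarded, matching the hand calculation.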
The last and most useful type of analysis
The T-Test
• Determines if the averages of two sets of results are statistically different from each other, thus allowing for a confident conclusion to be made
• The chance that the results are due to coincidence must be below 5%
Say what?
Statistically different: t-test result (the probability that Excel’s TTEST returns) is less than 0.05
What this means: if results are statistically different, there is less than a 5% chance the results are coincidence - therefore your hypothesis is more likely to be supported
Calculate a t-test value for 2 sets of data and compare it to .05
Types of Data in a T-Test
• Tails:
 – One-tailed: experimenter has expected results (one group being higher/lower than another)
 – Two-tailed: experimenter only assumes a difference in results
• Paired/Two-Sample
 – Paired: same group used in each experiment; dependent (before and after)
 – Two-Sample: two separate groups; independent (men vs. women)
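The heart-rate data in Table 1 is a paired design: the same ten subjects are measured before and after exercise. As a rough illustration of what "paired" means in practice, the sketch below works on the before/after difference for each subject and computes the standard paired t statistic (mean difference divided by its standard error). This is not the slides' method, which relies on Excel, and it stops at the t statistic; turning that into a probability needs the t distribution, which is left to Excel here. Variable names are made up for illustration.

```python
from math import sqrt
from statistics import mean, stdev

# Heart rates (beats per minute) copied from Table 1, subjects 1-10.
hr_before = [60, 76, 62, 78, 70, 66, 70, 74, 78, 68]
hr_after = [84, 80, 90, 110, 92, 92, 88, 80, 100, 88]

# Paired data: pair each subject's "after" with the same subject's "before".
differences = [after - before for before, after in zip(hr_before, hr_after)]
n = len(differences)

# Paired t statistic: mean difference over its standard error.
t_paired = mean(differences) / (stdev(differences) / sqrt(n))

print(f"mean difference = {mean(differences):.1f}, paired t = {t_paired:.2f}")
```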
T-Test Formula
In words: the mean of the first set minus the mean of the second set, over the square root of the sum of each group’s variance divided by the number of results in that group.
That’s a crap load of math – we’ll use Excel
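For anyone curious what the formula looks like outside of a spreadsheet, here is a minimal Python sketch of the wording above. It assumes "variance" means the sample variance and that the two variance-over-count terms are added under the square root (the usual two-sample form). The function name and the two small data sets are hypothetical, chosen only so the arithmetic is easy to follow. Note that this gives the t statistic itself, not the probability that Excel's TTEST reports.

```python
from math import sqrt
from statistics import mean, variance

def t_statistic(first, second):
    """(mean1 - mean2) / sqrt(var1/n1 + var2/n2), using sample variances."""
    standard_error = sqrt(variance(first) / len(first)
                          + variance(second) / len(second))
    return (mean(first) - mean(second)) / standard_error

# Hypothetical example data, not from any lab in this presentation.
group_a = [5, 7, 9]   # mean 7, sample variance 4
group_b = [2, 4, 6]   # mean 4, sample variance 4

t = t_statistic(group_a, group_b)
print(f"t = {t:.3f}")  # (7 - 4) / sqrt(4/3 + 4/3)
```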
Using Microsoft Excel
Open the program and create a new workbook.
Under “View” choose to see the “Formula Builder”
T-Test using Microsoft Excel
Type your data in, using one column for each group of results:
T-Test using Microsoft Excel
• Find the average for each set of data:
 – Select the group of data
 – Click on the equal (=) sign at the top of the screen
 – A window unfolds that looks like this:
T-Test using Microsoft Excel
• Select “average” from the pull-down menu, and a screen appears:
T-Test using Microsoft Excel
• To run a t-test, choose an empty cell and enter an “=” which will bring up the formula builder.
• If “TTEST” isn’t on the list of functions, search for it at the top of the builder.
• Double click on “TTEST”
T-Test using Microsoft Excel
Fill in the required data:
• Each of the categories is described
• Array = group of data (highlight the column to select group – don’t include any headings)
• Tails = one or two tailed (1 or 2)
• Type = paired or two-sample (1 or 2; Excel also accepts 3 for two-sample with unequal variance)
And the answer just appears…
Tips for a Better T-Test
• The more results you have, the better and more accurate the results.
• If you have several sets of results, perform t-tests for all of them versus each other.
• The columns of data can also be used to generate graphs if the lab calls for it.
Works Cited
• http://trochim.human.cornell.edu/kb/stat_t.htm
• http://davidmlane.com/hyperstat/A29337.html