Preconference workshop, Charleston Conference, November 7, 2006

Published in: Education
• Science & Electronic Resources Librarian, Libraries of the Claremont Colleges
• Part I will be an overview of developing a research project with the aim of using statistics as a methodology in the analysis. Part II will be an overview of statistical concepts and language. Part III will be an exercise in evaluating library statistics.
• Three key concepts to remember when designing a research project of any kind, but especially a statistical one, are Validity, Reliability, and Generalizability. Validity is how well a variable measures a particular concept. For example, if we are measuring use, is it valid to count reshelving figures as use? Fulltext downloads? Reliability is the consistency of the variable, measurement, or test. One basic principle of scientific and statistical analysis is that the results can be confirmed by others repeating the experiment or data analysis. Without reliability, we would have … Generalizability asks whether the results can be applied to other situations. For example: if you observed students using group study rooms in this library, could you generalize that use across all hours, days, buildings, user groups, or other institutions? If you can, your research can become a general model, a universal law, or immutable truth. But if you can’t, that just means that the results are applicable only to situations that have much in common with the one studied.
• Designing research includes formulating initial hypotheses (statements about what the researcher thinks the data will show), data collection (and manipulation) through a variety of techniques, and statistical analysis that is suited to the hypotheses and data. Here are the key stages for doing research. Statistics are only a tool to help us understand the outcome of the research. Much research can be done without employing statistical techniques: most ethnographic research relies on direct observation and not on analysis of statistics. Take medicine for example: drug works, drug is safe, prescribe drug. Observational data or microscopic data may suffice. But most research relies on statistical analysis of research data, no matter how it’s collected.
• There are two basic sampling designs: a simple random sample and a stratified random sample, and they are pretty similar. Draw a single circle and a circle composed of other circles inside to provide a visual aid. A simple random sample is what it sounds like. A group of subjects is chosen from the whole population and each subject has an equal chance of being sampled. If you took the campus directory and randomly selected 100 names, that would be a simple random sample. Draw examples comparing simple and stratified. A stratified random sample is a bit more complex. It assumes that your population is composed of different types of individuals and that you want some knowledge about each group. For example, libraries often want to know how well they serve their communities and want to know something about students, faculty and staff. Are they meeting each of their needs? The solution to this problem is to divide up the population into each group and then randomly sample each group. Samples from each group are generally proportional to the size of that group in the population.
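
The two designs described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the directory contents, the group sizes (800 students, 150 faculty, 50 staff), and the function names are hypothetical, not taken from the workshop.

```python
import random


def simple_random_sample(population, n, seed=None):
    """Every member of the population has an equal chance of selection."""
    rng = random.Random(seed)
    return rng.sample(population, n)


def stratified_random_sample(strata, n, seed=None):
    """Sample each group separately, in proportion to the group's size.

    strata maps a group name to the list of its members. Because of
    rounding, the total drawn can differ slightly from n.
    """
    rng = random.Random(seed)
    total = sum(len(members) for members in strata.values())
    sample = {}
    for group, members in strata.items():
        k = max(1, round(n * len(members) / total))
        sample[group] = rng.sample(members, min(k, len(members)))
    return sample


# Hypothetical campus directory: 800 students, 150 faculty, 50 staff.
directory = {
    "students": [f"student_{i}" for i in range(800)],
    "faculty": [f"faculty_{i}" for i in range(150)],
    "staff": [f"staff_{i}" for i in range(50)],
}
everyone = [name for members in directory.values() for name in members]

simple = simple_random_sample(everyone, 100, seed=1)
stratified = stratified_random_sample(directory, 100, seed=1)
print({group: len(names) for group, names in stratified.items()})
```

With these made-up group sizes, the stratified draw takes 80 students, 15 faculty, and 5 staff, while the simple draw only tends toward those proportions.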
• Now we come to a really fun and interactive part of the workshop. In this study, we are going to sample M&Ms and try to figure out the frequency of colors. Not only that, but we’re going to test our results against what the Mars Candy Company says should be the frequency. Let’s think about our M&M packs. At the plant, the company loads millions of these little candies into a big hopper and tries to mix them so that they are randomly distributed. When they get packaged, the company wants you to get some of each color, but does not regulate the number of colored candies going into your package: some may have more blues, some may have more yellows. You can consider each of these packages a random sample of the large hopper or bin of M&M candies. And if we sample enough of these packages, we should start getting close to the distribution of colors at the company. Remember, we are sampling because we don’t have enough money to count all M&Ms sold in every store.
• How is accuracy affected by size of sample? What would explain a difference between our observed results and M&M’s reported figures? Was our sample a good representation of the population? Is our methodology valid? Are our results generalizable?
• A review of the Basic Statistics for Librarians workshop. The five components were statistical concepts, evaluation of literature, sampling, an introduction to usage statistics, and designing a research study. Concepts included frequency distributions, including flat (no change), normal (a bell curve shape), and skewed (very many sloping to very few or vice versa). Mean is the average of a group and median is the middle value of a set of ordered values. A standard deviation is the measure of the dispersion or variation in a sample. For a normal distribution, 68% of the data is found within +/- 1 SD, 95% within +/- 2 SD, and over 99% within +/- 3 SD. Three key concepts to remember when evaluating literature are Validity, how well a variable measures the concept being studied; Reliability, the consistency of the variable, measurement, or test; and Generalizability, whether the results can be applied to other situations. Sampling is the act of drawing a portion of subjects to measure from a larger population. A random sample is the strongest type of sample since it is assumed to be a fair representation of the population. Other sampling designs include stratified random sampling, where a random portion is taken from each representative group of the population, and convenience sampling, where subjects are chosen non-randomly. Sample size is important, and for small populations a proportionally larger share of subjects is needed. Usage statistics are very important in applied librarianship today, but researchers need to remember to ask questions such as what is being measured, who did what is measured, why they did it, and how many of them did it. Most datasets will include outliers and missing data that can impact the statistical tests, but there are many techniques for dealing with these problem data.
• Quiz: what scale are these types of data? Take a few minutes to write down what type of data these are, then we’ll go over them. Salary: ratio; Author: nominal; Hours: ratio; Patron: nominal; Publication: interval; Ranked: ordinal; Tests: interval; Articles: interval; FTE: ratio.
• This is a histogram of fulltime enrollments at ARL schools. Fulltime students average about 22 thousand, and the standard deviation is about 10 thousand. QUIZ: What share of schools falls between 12 and 32 thousand students? Answer: about 68%, since that range is the mean +/- 1 SD (assuming enrollments are roughly normally distributed).
• Let’s now look at some real data from libraries to apply the concepts of mean and standard deviation. This is a histogram I generated from data I collected from the Association of Research Libraries (ARL) on total salaries and wages. There are 114 libraries included in this histogram. Mean salaries and wages at ARL libraries are about 10 million, and the SD is about 6 and a half million.
• How do you get a Law named after you? These are the key stages of statistical research. For collection & analysis, I’ve just listed a few examples and will briefly go over them.
• I’ll explain this in depth – how to get DF, how to do Chi-Square, etc.
• I’ll give a general introduction to each type of analysis, and then introduce the Nichols et al. study as the first example.
• Everyone will read the article and then we’ll go through these together, with each item coming out after someone states it.
• After going through this, we’ll discuss what the study did right (pretest, posttest, survey) and what it did wrong, including assumptions (not stating the null hypotheses, accepting the alternate hypothesis when it should have been rejected).
• As a group, the participants will read through this study and come up with the answers to the 5 questions, with discussion centering around the reliability, validity, and generalizability, with a focus on finding out if the methods, variables, and tests fit the question.
• Just for fun….
### Advanced statistics for librarians

1. Advanced Statistics for Librarians: How to use and evaluate statistical information in library research
   • Jason Price, Science & Electronic Resources Librarian, Claremont Colleges
   • John McDonald, Acquisitions Librarian, Caltech
2. Advanced Statistics
   • Part I: Research Design
   • Part II: Statistical Concepts
   • Part III: Evaluating Library Statistics
3. Research Design
   • Validity
      • How accurately an indicator measures the concept being studied. Is the technique appropriate to measure the concept being studied?
   • Reliability
      • How consistent is the measurement? Does it yield the same results over repeated attempts and by different researchers? How certain are the results?
   • Generalizability
      • How well (or likely) can the findings be applied to other situations?
4. Research Design Steps
   • Research Question
   • Hypotheses
   • Data definitions
   • Data collection
   • Data analysis
   • Conclusions
5. Research Question
   • What is the study designed to answer?
   • Why is the study important?
   • The more specific, the better!
   • Example: Should the library increase hours during finals week?
6. Hypothesis
   • A statement about the expected results.
   • What you will test after collecting data.
   • Null Hypothesis, that there is no difference between Group 1 & Group 2 or Before/After. Notated Ho = Ha
   • Alternate Hypothesis, that there is a difference and what that difference will be. Notated Ho ≠ Ha
   • Can also be directional if theory or prior research indicates: Ho > Ha
7. Data collection
   • Observation
   • Interviews
   • Focus Groups
   • Surveys
   • Transaction Logs
   • Others?
8. Data Collection: Sampling
   • Necessary when it is impossible to study an entire population due to logistical, geographical, monetary, or time constraints.
   • A sample must be a good representation of the rest of the population.
   • The larger your sample, the more sure you can be that their answers truly reflect the population.
   • Accuracy increases when more respondents pick one choice over another. E.g., more accuracy when 99% choose one presidential candidate.
   • The larger your population size, the larger your sample needs to be, except if your population is very large (e.g. the U.S.) or very small (e.g. your household).
9. Sampling Designs
   • Simple: assumes homogeneity
   • Stratified: assumes heterogeneity
10. Calculating Sample Sizes
   • 1) When you have a very large population size: $SS = \dfrac{Z^2 \cdot p \cdot (1 - p)}{c^2}$
   • 2) When you have a finite population size: $ss = \dfrac{SS}{1 + \dfrac{SS - 1}{pop}}$
   • Z = Z value (e.g. 1.96 for 95% confidence level)
   • p = percentage picking a choice, expressed as decimal (e.g. .5 for 50%)
   • c = confidence interval, expressed as decimal (e.g., .04 = ±4%)
   • Sample size spreadsheet
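
The two sample size formulas on this slide can be checked with a short Python sketch. The function names and the population of 5,000 are hypothetical choices for illustration; the Z, p, and c values are the examples given on the slide.

```python
import math


def sample_size_large_pop(z, p, c):
    """Formula 1: SS = Z^2 * p * (1 - p) / c^2, for a very large population."""
    return (z ** 2) * p * (1 - p) / (c ** 2)


def sample_size_finite_pop(ss, pop):
    """Formula 2: ss = SS / (1 + (SS - 1) / pop), for a finite population."""
    return ss / (1 + (ss - 1) / pop)


# Slide examples: 95% confidence (Z = 1.96), p = 0.5, c = 0.04 (plus or minus 4%).
big = sample_size_large_pop(z=1.96, p=0.5, c=0.04)

# Hypothetical finite population of 5,000 (for example, a campus).
small = sample_size_finite_pop(big, pop=5000)

# Round up, since you cannot survey a fraction of a person.
print(math.ceil(big), math.ceil(small))
```

With the slide's example values the first formula gives about 600 respondents, and the finite population correction lowers that figure for the smaller hypothetical population.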
11. M&M Sampling
   • Research Question: What is the color distribution of M&Ms?
   • Sample: What is the color distribution of a simple random sample of M&Ms?
   • Test: Does my sample yield different results than what is reported by the company?
   • Method: Packages of M&Ms distributed to each participant. Each package is a random sample from the company.
12. M&M Sampling
   • Let’s look at the colors in individual samples of M&Ms
   • M&M Data Collection & Testing
13. Data Definitions
   • Data Scales
      • Nominal
      • Ordinal
      • Interval
      • Ratio
   • Frequency Distributions
      • Flat
      • Normal
      • Skewed
   • Variable Types
      • Dependent
      • Independent
      • Extraneous
14. Data Scales
   • Nominal: scaled without order, indicating that classifications are different. Example: Public & private institutions.
   • Ordinal: scaled with order, but without distance between values. Example: Carnegie classifications.
   • Interval: scaled with order and establishes numerically equal distances on the scale. Example: Grade level (freshman, sophomore, etc.)
   • Ratio: scaled with equal intervals and a zero starting point. Example: Fulltext downloads.
   • Nominal or ordinal variables are discrete, while interval and ratio variables are continuous.
15. Name that data type!
   • Salary
   • Author of a book
   • Hours spent in the library
   • Patron status
   • Publication year of a journal
   • Ranked journal lists
   • Test results on instruction classes
   • Number of articles read
   • FTE
16. Data Distributions
   • Described by their kurtosis (variability) and skew (extremes)
   • Normal: bell shaped curve with gradual slopes
   • Non-normal (skewed): extreme values with steep slopes
17. Fulltime Students at ARL Schools: N = 114, Mean = 22K, SD = 10K
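
The 68% quiz answer for this histogram follows from the normal distribution. A minimal sketch, assuming enrollments are roughly normal (real enrollment data may well be skewed); the function name is illustrative:

```python
import math


def share_within(k):
    """Share of a normal distribution within k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2))


mean, sd = 22_000, 10_000  # figures from the slide

for k in (1, 2, 3):
    print(f"mean +/- {k} SD ({mean - k * sd:,} to {mean + k * sd:,}): {share_within(k):.1%}")
```

The lower bound for 3 SD comes out negative, which no enrollment can be; that is a reminder that the normal curve is only an approximation for this kind of data.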
18. Total Salaries & Wages at ARL Libraries: N = 114, Mean = 10M, SD = 6.5M
19. Variables
   • Dependent: the variable being measured, studied, and predicted.
   • Independent: variables that can be manipulated or are predictors of the dependent variable.
   • Extraneous: variables other than the independent variables that can influence the dependent variable.
20. Data analysis
   • Descriptive statistics
      • Mean, Median, Mode
      • Standard Deviation
   • Correlational statistics
      • Correlation
   • Inferential statistics
      • T-test
      • Regression
      • Chi-square
      • ANOVA
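
The descriptive statistics listed on this slide are available in Python's standard `statistics` module. The gate counts below are made-up numbers, chosen to include one outlier so the difference between mean and median is visible.

```python
import statistics

# Hypothetical daily gate counts for one week; 95 is a deliberate outlier.
gate_counts = [12, 15, 15, 18, 20, 22, 95]

mean = statistics.mean(gate_counts)
median = statistics.median(gate_counts)
mode = statistics.mode(gate_counts)
sd = statistics.stdev(gate_counts)  # sample standard deviation (n - 1)

print(f"mean={mean:.1f} median={median} mode={mode} sd={sd:.1f}")
```

The single outlier pulls the mean well above the median, which is one reason to report both for skewed library data.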
21. Correlational Statistics
   • Correlation establishes that two measures have a relationship.
      • Indicates direction & strength, but not causation!
      • Allows researcher to consider other statistical tests with confidence.
      • Requirements
         • random sample
         • interval or ratio data
         • normal distribution
         • linear relationship
22. Correlational Statistics
   • Direction
      • Positive: As one value increases, the other does as well.
         • Example: Age and height.
         • Library: Enrollment & materials budget.
      • Negative: As one value increases, the other decreases.
         • Example: Car speed & time to destination.
         • Library: Items purchased & shelf space.
   • Strength
      • Value between 1 (positive) and -1 (negative). The closer to those values, the stronger the relationship.
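
Direction and strength can both be read off the Pearson correlation coefficient. Here is a minimal hand-rolled sketch; the enrollment and budget figures are invented for illustration and the function name is an assumption.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation: covariance scaled by both spreads, from -1 to 1."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length lists with at least 2 values")
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical data: enrollment (thousands) and materials budget ($ millions).
enrollment = [8, 12, 15, 22, 30, 41]
budget = [2.1, 3.0, 3.2, 5.5, 6.8, 9.9]

r = pearson_r(enrollment, budget)
print(f"r = {r:.2f}")  # the sign gives direction, the magnitude gives strength
```

A value near 1 or -1 indicates a strong linear relationship, but as the previous slide notes, it says nothing about causation.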
23. Correlation
24. Inferential Statistics
   • Parametric: assume that the dependent variable has a known underlying mathematical distribution (normal, binomial, Poisson, etc.) which serves as the basis for sample-to-population estimates. Parametric tests are robust and have great power efficiency.
   • Non-parametric: do not assume a normal distribution (distribution free) & require that the data meet fewer assumptions. Allow for the analysis of a mixture of data types.
25. T-Test
   • Determine if there is a difference (in a characteristic) between two populations based on data from samples of those populations.
   • Requirements
      • random sample
      • interval or ratio data
      • normal distribution
      • equal standard deviations
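
As a sketch of what a two-sample t-test computes under the equal standard deviations requirement above, the following uses the pooled-variance form. The test scores and the function name are hypothetical, and the p-value lookup is left to a t table or a statistics package.

```python
import math
import statistics


def pooled_t(a, b):
    """Two-sample t statistic assuming equal variances; returns (t, df)."""
    na, nb = len(a), len(b)
    var_a, var_b = statistics.variance(a), statistics.variance(b)
    df = na + nb - 2
    pooled_var = ((na - 1) * var_a + (nb - 1) * var_b) / df
    std_err = math.sqrt(pooled_var * (1 / na + 1 / nb))
    return (statistics.mean(a) - statistics.mean(b)) / std_err, df


# Hypothetical post-test scores for two instruction methods.
online = [78, 85, 90, 72, 88]
in_class = [80, 83, 91, 75, 86]

t, df = pooled_t(online, in_class)
print(f"t = {t:.3f}, df = {df}")
# Compare |t| with the critical value for df = 8 (2.306 at alpha = 0.05,
# two-tailed): a smaller |t| means we fail to reject the null hypothesis.
```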
26. T-Test
27. Regression
   • Predicts values of a dependent variable based on values of independent (predictor) variables
   • Requirements:
      • interval or ratio data
      • normal distribution
      • correlated variables
      • linear relationship
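
For a single predictor, an ordinary least squares line can be fitted with a few sums. This sketch uses invented figures (enrollment in thousands, fulltext downloads in thousands) and hypothetical function names; it is not the analysis shown in the workshop.

```python
def fit_line(xs, ys):
    """Least squares fit of y = intercept + slope * x; returns (slope, intercept)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def predict(x, slope, intercept):
    """Predicted value of the dependent variable for a given predictor value."""
    return intercept + slope * x


# Hypothetical data: enrollment (independent) and downloads (dependent).
enrollment = [8, 12, 15, 22, 30]
downloads = [40, 62, 71, 115, 148]

slope, intercept = fit_line(enrollment, downloads)
print(f"predicted downloads at enrollment 25: {predict(25, slope, intercept):.0f}")
```

Predictions are only trustworthy inside the range of the observed predictor values and when the relationship really is linear.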
28. Regression
29. ANOVA
   • Determine if there are differences between three or more sample means.
   • Test the significance and direction of the difference.
   • Requirements:
      • normal distribution (in each cell)
      • interval or ratio data
      • homogeneity of variance
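
A one-way ANOVA compares the variation between group means with the variation inside the groups. The sketch below computes only the F statistic and its degrees of freedom; the satisfaction ratings and names are hypothetical, and the critical value would come from an F table or a statistics package.

```python
def one_way_f(groups):
    """One-way ANOVA: returns (F, df_between, df_within)."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_between, df_within = k - 1, n - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical satisfaction ratings from three patron groups.
students = [3, 4, 4, 5, 3]
faculty = [4, 5, 5, 4, 5]
staff = [2, 3, 3, 4, 3]

f_stat, df_b, df_w = one_way_f([students, faculty, staff])
print(f"F({df_b}, {df_w}) = {f_stat:.2f}")
```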
30. ANOVA
31. Chi Square Test
   • Difference between expected and observed frequencies for nominal or ordinal data
   • Requirements:
      • Any type of data
      • Large sample size (>50)
      • Similar distributions
32. Chi Square Test: Pepsi Challenge
   • Observed: Pepsi 85, Coke 57, RC 78
   • Expected (equal) = 73.33
   • Degrees of freedom = rows - 1 = 3 - 1 = 2
   • Critical value of χ² = 5.99 at alpha = 0.05
   • Observed value of χ² = 5.8
   • Decision: Fail to reject H0

   | | O | E | O-E | (O-E)² | (O-E)²/E |
   |---|---|---|---|---|---|
   | Pepsi | 85 | 73.33 | 11.67 | 136.19 | 1.86 |
   | Coke | 57 | 73.33 | -16.33 | 266.67 | 3.64 |
   | RC | 78 | 73.33 | 4.67 | 21.81 | 0.3 |
   | Totals | 220 | 219.99 | | | χ² = 5.8 |
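
The Pepsi Challenge arithmetic can be reproduced with a short Python sketch. The function name is illustrative; the observed counts and the 5.99 critical value are the ones on the slide.

```python
def chi_square_gof(observed, expected=None):
    """Chi-square goodness of fit: sum of (O - E)^2 / E; returns (stat, df).

    If no expected counts are given, equal frequencies are assumed.
    """
    if expected is None:
        expected = [sum(observed) / len(observed)] * len(observed)
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return stat, len(observed) - 1


observed = [85, 57, 78]  # Pepsi, Coke, RC
stat, df = chi_square_gof(observed)

critical = 5.99  # chi-square critical value for df = 2 at alpha = 0.05
decision = "reject H0" if stat > critical else "fail to reject H0"
print(f"chi-square = {stat:.2f}, df = {df}: {decision}")
```

The statistic comes out just under the critical value, matching the slide's decision. The same function would work for the M&M exercise by passing the company's reported color shares, scaled to the sample size, as `expected`.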
33. Inferential Statistics

   | Goal | Parametric | Non-parametric |
   |---|---|---|
   | Quantify association between variables | Pearson correlation | Spearman correlation; Kendall's tau |
   | Compare 2 unpaired groups | Unpaired t-test | Mann-Whitney test; Fisher's test |
   | Compare 2 paired groups | Standard two-group t-test | Mann-Whitney; Kolmogorov-Smirnov |
   | Compare 3+ unmatched groups | ANOVA | Kruskal-Wallis test; Chi-square test |
   | Compare sample to a hypothetical value | T-test | Wilcoxon test; Chi-square |
   | Predict value from measured variables | OLS Regression | Poisson regression; Negative Binomial reg. |
34. Review: Research Design
   • Research Question
      • What will the study answer?
   • Hypotheses
      • What do you think the results will be?
   • Data definitions
      • What scales are the variables, what is the distribution, and what are the dependent, independent & extraneous variables?
   • Data collection
      • What is the best method for collecting the variables of interest?
   • Data analysis
      • What are the proper statistical tests to use on the data?
   • Conclusions
      • What does the data show us or indicate?
35. Case Studies
   • Citation Analysis
      • Antelman, K. (2004). “Do Open-Access Articles Have a Greater Research Impact?” College & Research Libraries 65(5): 372-382.
   • Usage Analysis
      • Blecic, D.D. (1999). “Measurements of journal use: an analysis of the correlations between three methods.” Bull Med Libr Assoc 87(1): 20-25.
   • Service Analysis
      • Nichols, J.; Shaffer, B.; Shockey, K. (2003). “Changing the Face of Instruction: Is Online or In-class More Effective?” College & Research Libraries 64(5): 378-389.
36. “Changing the Face of Instruction…”
   • Research Question
      • Is an online tutorial as effective in teaching library instruction as a classroom setting?
   • Hypotheses
      • H1. Students will have higher scores in information literacy tests after library instruction.
      • H2. Students will have the same or higher scores in info-lit tests after taking online tutorials as students taking traditional instruction.
      • H3. Students will report as much or more satisfaction with online instruction as students taking traditional instruction.
37. “Changing the Face of Instruction…”
   • Variables & Data Collection
      • Variables: Test scores & survey results
      • Data Collection: Pretest/Posttest & Survey
   • Statistical Tests
      • Desc Stats incl. mean, standard deviation, standard error, T-tests (1 & 2 tailed)
   • Conclusions
      • Accept H1: Instruction improves literacy.
      • Accept H2 alternative hypothesis: Online has no significant difference from traditional.
      • Accept H3 alternative hypothesis: Student satisfaction is equal with both methods.
38. “Do Open-Access Articles…”
   • Research Question
   • Hypothesis
   • Variables and Data Collection
   • Statistical Tests
   • Conclusions
   • Critical Questions
39. “Do Open-Access Articles…”
   • Research Question
      • Do freely available articles have a greater research impact?
   • Measures
      • Research impact: citation rates
      • Open Access: freely available
   • Hypotheses
      • H1. Scholarly articles have a greater research impact if the articles are freely available online than if they are not.
      • Ho (null hypothesis): There is no difference between the mean citation rates: Ho: d1 = d0
40. “Do Open-Access Articles…”
   • Variables & Data Collection
      • Variables: Mean citation rates
      • Data Collection: At least 50 articles from 10 leading journals in 4 disciplines.
   • Statistical Tests
      • Desc Stats incl. mean, standard deviation, standard error, Wilcoxon signed-rank
   • Conclusions
      • Reject Ho: Open Access articles are cited more than those that are not OA.
   • Discussion
      • Validity? Reliability of Measures? Generalizability? Alternate hypotheses?
41. My favorite statistic… Baseball is 90% mental – the other half is physical.