Thank you all for participating in this series. It’s good to see you here, to see you interested in learning more about statistics.
With this series, I’m hoping to increase your comfort and decrease your fears when approaching statistics.
Towards that end, by the close of the series you should be more comfortable with the foundations of statistics, with descriptive and inferential statistics, and with reading & interpreting statistics in the literature.
In this Foundational session, we will examine more closely what statistics actually is and why it is important for research. Statistics is not limited to the formulas and analytical methods you may have glossed over. It is the study of data throughout the research lifecycle. All of these aspects are important because they help us make sense of the data we collect, enabling us to explain what happened and, to some extent, why. This, in turn, helps us make sound decisions. And we can back all of that up because we can determine and demonstrate how close we are to the truth.
The purpose of statistics, as I see it, is to check against all the factors that could affect your results.
Because statistics is the study of data throughout the research lifecycle, you need to be thinking about data from the start of any research project.
Your research question will be the foundation of your study. Every other decision should be made to answer this question. This is probably the simplest of all research questions – other questions may take the form of…
For example, when the PACS faculty gave us the lowest LibQUAL ratings, we asked if the problem was with the collections. Specifically, did we have what they used, based on their citations or references in their published works?
After you’ve set your research question, you’ll need to think about the variables in your study. When it comes to research questions, there are essentially two kinds of variables. Independent variables are the subjects of your research question – the factors whose effects you are studying. These are what you refer to in the “Effects of…” part of the question. Conversely, the dependent variables are the objects of your question – the outcomes of interest – the “Effects on…” part.
For the PACS citation analysis study, we looked at these variables….Which of these do you think is the DV?
While the variables are the pots that represent the factors of interest, the data are the values that go into these pots. The data can be categorized as such….The scale of data you use will determine the statistical method you will use in analysis.
I wanted to talk just a bit about Likert-type scales, because these are so often used in LIS research. There is some dispute as to exactly what type of data a Likert-type scale produces, and therefore how the data should be presented and analyzed. Many treat Likert-type scales as ordinal data, because the values are arbitrary and the space between values is subjective. This would be particularly true if there were few levels and the scale was limited to individual questions. This limits you to the statistical methods appropriate for analyzing rank data. However, when the scale is symmetrical (and thus the space between values is more objective), there are many levels (10 or more), and the responses are combined into a composite score, then many people think it can be treated as interval data. While I won’t go into any more detail on this issue today, it is something you should be aware of.
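To make the composite-score idea concrete, here is a minimal Python sketch. The item responses are invented for illustration: individual Likert items are ordinal ranks, while the composite is computed as a mean – a step that only makes sense if you are willing to treat the combined scale as interval data.

```python
from statistics import mean

# Hypothetical responses: ten 7-point Likert-type items from one respondent
responses = [5, 6, 4, 7, 5, 6, 6, 5, 4, 6]

# A composite score averages the items; treating it as interval data
# is the disputed step discussed above.
composite = mean(responses)
print(composite)  # 5.4
```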
Now, looking again at the PACS study variables, what kind of data would you think Department is?....
Most important lesson I learned in my epidemiological training – ask yourself, compared to what? Display a single monthly circ stat (1,234) - Is this good? Is it bad? What does it mean? How can we make sense of it? (hopefully, they'll bring up the idea of comparisons)
So, all of these factors – the RQ, the data type, and the comparison group – go into determining what statistical methods to use.
Now I’d like to discuss the validity of measures.
Your measures can take these various forms – counts…; The unit of analysis is the individual “thing” in your study – a person, book, article, use, etc. One thing to consider is whether your dependent & independent variables are on the same or similar level. You should be careful not to associate a higher-level IV (such as the literacy of a country or state) with a lower-level DV (like an individual’s usage of the library). This can lead to an “ecological fallacy”.
To ensure that you obtain valid results, you should try to use a measure that has already had its validity established. You can find these in the published literature. Here are just a few examples of LIS measures with established validity.
However, if such a measure cannot be found, you should make an effort to examine the validity of what you create yourself, using psychometric methods. In fact, even if you use an established measure, you should demonstrate the validity of that measure on your own sample.
This is an illustration often used to demonstrate reliability & validity. Consider the target to cover the area of interest, and the shots to be where your measure hits. In the upper left, your measure is both not reliable (because the shots are spread out) and not valid (because the shots only cover the top quadrants). To the right, the shots cover all quadrants (so it’s valid), but the shots are spread out (so it’s not reliable). Conversely, in the lower left, the reliability is good (all shots grouped together), but it’s not valid because it only covers one quadrant and misses the center altogether. Finally, the lower-right target is both reliable and valid.
By using valid measures, we can be more certain that our results are valid. It also helps check against bias and the influence of other factors.
So how does sampling affect statistics?
You could take a census and measure all of the population of interest. It may be hard, but you get pretty much the truth. Often, though, this is not possible. So you use a sample. While easier, you only get an estimate of the truth.
Which you use depends on your research question. If your population of interest is limited, then a census may be warranted. But be careful not to extrapolate results from a census to a broader or different population. The effects of Information Literacy instruction may be different on graduate students than on undergrads. Your population should fit your research question.
For the PACS study, we first gathered all items published by PACS faculty during 2008–2011. But we only looked at the journal articles published. Then we only looked at the journal articles cited in the articles published. So, is this a census or a sample? Remember the research question – Does the UNT Library provide access to what the PACS faculty use, based on the citations in their published works? While we looked at all journal article citations, we only looked at citations in journal articles published. You could consider this a non-random sample.
…this is like tossing a coin or rolling dice. We know what the chances are (50-50; 1/6th).
The importance of randomness to your study depends on your research question. Random or weighted random samples should result in a sample that is similar to the population. This makes it more valid to generalize to that population. Because the sample is random, you can use inferential statistics to determine the probability that your result was due to chance, which makes it most useful for testing hypotheses. Non-random (convenience or purposive) samples may, but more likely will not, be similar to your population of interest. But you can often get more detailed and deeper information than with random sampling methods. So they are most useful for generating hypotheses, which can then be tested on random samples.
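As a quick sketch of simple random sampling – the defining property being that every unit of analysis has an equal, known chance of inclusion – Python's standard library can draw such a sample. The population here is just hypothetical ID numbers, and the seed is fixed only so the sketch is reproducible.

```python
import random

# Hypothetical population: 100 faculty members, numbered 1-100
population = list(range(1, 101))

random.seed(42)  # fixed seed for reproducibility of this sketch
sample = random.sample(population, 10)  # each member has an equal, known chance

print(len(sample))  # 10 units, drawn without replacement
```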
You can see how statistics helps us check our results against sampling error and bias.
How does data collection affect statistics?
The primary objective of data collection is to increase reliability and decrease bias.
Types of bias. These can broadly be divided into three categories:
1) Selection bias – The selection of subjects into your sample, or their allocation to a treatment group, produces a sample that is not representative of the population, or treatment groups that are systematically different. Random selection and random allocation are the keys to avoiding this bias.
2) Measurement bias – Measurement of outcomes is inaccurate. This may be due to inaccuracy in the measurement instrument or bias in the expectations of study participants, carers or researchers. The latter may be addressed by blinding participants, carers or researchers.
3) Analysis bias – The protection against bias created by randomisation will only be maintained if all participants remain in the group to which they were allocated and complete follow-up. Participants who change groups, withdraw from the study or are lost to follow-up may be systematically different from those who complete the study. Analysis bias can be reduced by maximising follow-up and carrying out an intention-to-treat analysis.
Bias can result from variations in data collection, including people who do not respond to surveys, sending email invitations to a wide variety of lists or individuals, inconsistencies among the observers or recorders of the data, poor definitions that make it hard for observers to be consistent, and using non-randomly selected time frames (such as taking surveys every morning at 10).
Data collection forms should fit the variables and the population. If there are many or complex variables, then the form should record data for one unit at a time. This is typically done with surveys. Conversely, if there are few variables or the data can be collected all at once (e.g. download), then a spreadsheet would be appropriate.
To help increase reliability, you should think about data collection and input very carefully. You should have a data entry plan, train the people doing the data input, use data validation tricks in software such as SurveyMonkey or Access, and use double-entry, where the data is entered twice by different people and checked for consistency.
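The double-entry check described above can be sketched in a few lines of Python. The record IDs and values here are invented; the point is simply that the two independently keyed copies are compared field by field, and disagreements are flagged for manual review.

```python
# Hypothetical double-entry check: the same records keyed by two people
entry_a = {"r1": "History", "r2": "Music", "r3": "Art"}
entry_b = {"r1": "History", "r2": "Music", "r3": "Atr"}  # typo in r3

# Any record where the two entries disagree gets flagged for review
mismatches = [key for key in entry_a if entry_a[key] != entry_b.get(key)]
print(mismatches)  # ['r3']
```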
To further improve reliability, the best method of recording and organizing data is to have one unit of analysis per row. Thus, you can use Excel for basic data analysis, but other statistics tools are useful as well. Please, do not use Word for recording and collecting data.
Here is the spreadsheet I used for the PACS citation analysis study. Notice that all of the measures are on one row.
Data collection can help check against bias, invalid measures and sampling errors.
Now we get to the meat of this session – this is what you probably think of when you hear “statistics”.
There are three essential elements of any statistical analysis – central tendency, spread & error.
Broadly speaking, there are two levels of statistical analysis. Descriptive is the most common form – the analysis describes the sample and compares it with the population. Inferential statistics, by contrast, infers associations and is used to test hypotheses.
• Summarizes data collected
• Describes the sample and compares it with the population
• Usually univariate – looking at one variable at a time
• Measures of: Central Tendency; Spread
The optimal measures of central tendency depend largely on the data type…
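As a rough illustration of that pairing, with invented data, Python's statistics module gives one measure per scale: the mode for nominal data, the median for ordinal data, and the mean for interval or ratio data.

```python
from statistics import mode, median, mean

# Hypothetical data for illustration
departments = ["History", "Music", "History", "Art", "History"]  # nominal
satisfaction = [1, 2, 2, 3, 4, 4, 5]                             # ordinal (ranks)
loans_per_month = [10, 12, 15, 20, 118]                          # ratio

print(mode(departments))      # nominal -> mode: "History"
print(median(satisfaction))   # ordinal -> median: 3
print(mean(loans_per_month))  # interval/ratio -> mean: 35
```

Note how the one large value (118) pulls the mean upward – one reason the median is sometimes preferred even for ratio data.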
The spread is the variation of the data that you collected. For interval & ratio data, the spread is best demonstrated by the range, percentiles and standard deviation. For nominal & rank data, the best methods for demonstrating spread are tables and bar graphs.
The range is merely the difference between the highest & lowest values. The quartiles are the points that divide the data into quarters – 25% of the values fall below the first quartile, and 25% above the third. You could also use quintile (20%) or decile (10%) points. Given what we learned about central tendency, which measure of central tendency would be the same as the 50th percentile?
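Here is a small sketch of these measures using made-up circulation counts. Note that `statistics.quantiles` uses an "exclusive" interpolation method by default, so other tools may report slightly different quartiles for the same data.

```python
from statistics import quantiles, median

# Hypothetical monthly circulation counts
circ_counts = [2, 4, 6, 8, 10, 12, 14]

data_range = max(circ_counts) - min(circ_counts)  # range: highest minus lowest
q1, q2, q3 = quantiles(circ_counts, n=4)          # the three quartile cut points

print(data_range)                 # 12
print(median(circ_counts) == q2)  # True: the median is the 50th percentile (Q2)
```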
Don’t let this formula frighten you. It’s merely the square root of the average squared distance from the mean. We’ll talk about this more in the next session.
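To see that there is nothing frightening in the formula, here is the calculation spelled out step by step on a handful of invented scores, checked against the standard library's population standard deviation.

```python
import math
from statistics import pstdev

scores = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical scores

m = sum(scores) / len(scores)               # the mean
sq_devs = [(x - m) ** 2 for x in scores]    # squared distances from the mean
sd = math.sqrt(sum(sq_devs) / len(scores))  # square root of their average

print(sd)              # 2.0
print(pstdev(scores))  # 2.0 -- same result from the standard library
```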
For interval & ratio data, spread is often presented in the form of a box plot, or box-and-whiskers plot. The box is formed by the lower & upper quartiles (the 25th and 75th percentiles), with a line to show where the median is. The whiskers extend to the lowest and highest points that are not outliers (defined here as points more than 1½ times the interquartile range beyond the quartiles). And then the outliers are the dots. This is a visually compact and efficient way to demonstrate the spread of interval or ratio data.
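The whisker-and-outlier rule can be sketched like this. The visit lengths are invented, and the 1.5 multiplier is applied to the interquartile range, per the usual convention.

```python
from statistics import quantiles

# Hypothetical visit lengths in minutes
minutes_in_library = [1, 3, 4, 5, 5, 6, 7, 30]

q1, _, q3 = quantiles(minutes_in_library, n=4)
iqr = q3 - q1                # interquartile range (the box)
low_fence = q1 - 1.5 * iqr   # points beyond the fences are outliers
high_fence = q3 + 1.5 * iqr

outliers = [x for x in minutes_in_library if x < low_fence or x > high_fence]
whisker_lo = min(x for x in minutes_in_library if x >= low_fence)
whisker_hi = max(x for x in minutes_in_library if x <= high_fence)

print(outliers)                  # [30] -- drawn as a dot
print((whisker_lo, whisker_hi))  # (1, 7) -- ends of the whiskers
```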
Spread of nominal data is best represented by cross-tab tables. Notice that percentages across rows are presented here.
Bar graphs are also good at demonstrating the spread of nominal or rank data. This first one is a basic chart. But this second one is more interesting. It presents three variables: the categories of songs that were popular, the time periods in which they were popular, and the widths of the columns, which visually represent the volume of songs in each time period.
As mentioned before, inferential statistical methods are used to test hypotheses. Hypotheses are essentially expectations – bets. In fact, our current understanding of probability & statistics was developed from gambling, one of the oldest forms of entertainment; our current ideas about probability were developed to better understand games of chance. So, a hypothesis is like a bet that must be placed before the study gets started. Now, it’s not just a hunch – it’s an educated guess, based on previous experience and documented studies. Inferential statistics also accounts for random error (variation from the true distribution). This is usually demonstrated with a confidence interval. The CI is a statement about the range and certainty of the results.
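As a small illustration of a confidence interval, this computes a 95% CI for a proportion using the common normal approximation. The counts are hypothetical, chosen only to make the arithmetic visible.

```python
import math

# Hypothetical result: 80 of 100 sampled citations were accessible
successes, n = 80, 100
p_hat = successes / n                     # the point estimate (0.8)
se = math.sqrt(p_hat * (1 - p_hat) / n)   # standard error of a proportion
z = 1.96                                  # z value for 95% confidence
ci = (p_hat - z * se, p_hat + z * se)

print(ci)  # roughly (0.72, 0.88): the range-and-certainty statement
```

Reading it off: we are 95% confident the true proportion lies between about 72% and 88% – a statement about both range and certainty, just as described above.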
What you may or may not know is that what is actually tested is not your hypothesis but the null hypothesis (H0). The H0 is essentially the inverse of your hypothesis. It is what your hypothesis is not.
In the case of the PACS citation analysis study, we did not have a comparison group, so we used an estimate of what we considered optimal access. Our hypothesis was that UNT provided access to at least 75% of the journal articles cited in the articles published by PACS faculty between 2008 and 2011. The H0, then, is that UNT provided access to less than 75% of articles cited. And this is what was tested, NOT the H1. I’ll go into the reasons for this in the session on inferential statistics. But I wanted to make sure you knew this subtle difference.
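To show what testing the H0 looks like in practice, here is a sketch of an exact one-sided binomial test. The sample counts are invented for illustration, not the actual PACS results: it asks how likely we would be to see a result this high if the true access rate really were only 75%.

```python
from math import comb

def binom_tail(n, k, p):
    """P(X >= k) for X ~ Binomial(n, p): exact upper-tail probability."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))

# Hypothetical numbers: 85 of 100 sampled citations were accessible.
# H0: the true access rate is no more than 75%.
p_value = binom_tail(100, 85, 0.75)

print(p_value < 0.05)  # True -> a result this extreme is unlikely under H0,
                       # so we would reject H0 at the 5% level
```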
…is influenced by many factors, including…
Another way of thinking about how statistics helps in research is this: statistics helps reduce the noise and increase the signal that the data is trying to communicate. Noise is the error, or differences from the population mean, due to random events or chance. Noise could also come from the effects of other factors that you may or may not have measured. If you did include these other factors, you can use inferential statistics to control for their effects. We’ll cover that in the 3rd session.
The purpose of statistics, as I see it, is to check against all the factors that could affect your results.
Statistics for Librarians, Session 1: What is statistics & Why is it important?
Series Objectives: Foundations → Descriptive Statistics → Inferential Statistics → Reading & Interpreting Statistics (chart axes: Statistics, Comfort Level)
What is Statistics?
• Study of Data: Collecting • Organizing • Summarizing • Analyzing • Presenting • Storing & Sharing
Why is it Important?
• Make sense of the data
• Explain what happens and (possibly) why
• Make sound decisions
• To know how close we are to the truth.
Purpose of Statistics
Results: Bias? Sampling Error? Invalid Measures? Random Error? Other Factors?
Start with your Research Question
• How do users differ when (searching, finding, selecting) (articles, books, Websites)?
• What are the effects of ___________ on ____________?
• Which is better at improving _________?
• How are people (finding, selecting, using) _______?
• What are factors associated with ___________?
Example of Research Question
• PACS: Low LibQUAL+ Ratings
• Collections: Is it our collections? Do we have what they use?
• Based on citations
Example of Variables
• Faculty: Department; Years at UNT
• Published: # published by type
• Cited: # cited by type; UNT accessible
(IV → DV)
Scales of Data (NOIR)
• Nominal: Counts by category; Binary (Yes/No); No meaning between the categories (Blue is not better than Red)
• Ordinal: Ranks; Scales; Space between ranks is subjective
• Interval: Integers; No baseline; Space between values is equal and objective, but discrete
• Ratio: Interval data with a baseline; Space between is continuous
Are you actually measuring what you are trying to measure?
Selecting Measures
• Measures: Counts; Survey responses; Grades/Scores; Ranks; Scales (e.g. Likert); Age, Length of Time; Frequency
• Units of Analysis: People; Books; Articles; Uses
• Levels of Analysis: What is the object (DV)? What is the subject (IV)?
Use a tool with established validity
• Approaches and Study Skills Inventory for Students (ASSIST)
• User Engagement Scale (UES)
Establish Validity of Measures
• Reliability: Consistency
• Content Validity: Corresponds with expectations; Common understandings
• Construct Validity: Corresponds with other variables based on theory
• Criterion Validity: Corresponds with other measures
When to Use Which: Research Question?
• Census: Book usage at UNT Libraries; Effects of IL instruction on English 1100 students
• Sample: Book usage at all libraries; Effects of IL instruction on all students
Example – Census or Sample? (nested sets)
All Items Published by PACS Faculty → All journal articles published by PACS faculty → All journal articles cited
Random Samples
• Every Unit of Analysis has an equal and known chance of being included.
Importance of Randomness
• Random Samples (Random, Weighted, etc.): Should be representative of population; Can use inferential statistics; Most useful for testing hypotheses
• Non-Random Samples (Convenience, Purposive, etc.): May or may not be representative of population; Use descriptive statistics only; Most useful for generating hypotheses