Statistics for Librarians, Session 1: What is statistics & Why is it important?

First of 4 sessions introducing statistics to librarians and library staff.

Published in: Education, Technology

  1. 1. WHAT IS STATISTICS? Why is it important?
  2. 2. Goals of Series: Comfort; Fears
  3. 3. Series Objectives: Foundations; Descriptive Statistics; Inferential Statistics; Reading & Interpreting Statistics; Comfort Level
  4. 4. What is Statistics? Study of Data: Collecting; Organizing; Summarizing; Analyzing; Presenting; Storing & Sharing. Why is it Important? Make sense of the data; Explain what happens and (possibly) why; Make sound decisions; To know how close we are to the truth.
  5. 5. Purpose of Statistics: Results. Bias? Sampling Error? Invalid Measures? Random Error? Other Factors?
  6. 6. Thinking about Data in your Research Project
  7. 7. Start with your Research Question: How do users differ when (searching, finding, selecting) (articles, books, Web sites)? What are the effects of ___________ on ____________? Which is better at improving _________? How are people (finding, selecting, using) _______? What are factors associated with ___________?
  8. 8. Example of Research Question. PACS: Low LibQUAL+ Ratings. Collections: Is it our collections? Do we have what they use? Based on citations
  9. 9. Variables. Independent: Subjects; Factors; Effects of… Dependent: Objects; Outcomes; Effects on…
  10. 10. Example of Variables. Faculty: Department; Years at UNT. Published: # published by type. Cited: # cited by type; UNT accessible. IV; DV
  11. 11. Scales of Data (NOIR). Nominal: Counts by category; Binary (Yes/No); No meaning between the categories (Blue is not better than Red). Ordinal: Ranks; Scales; Space between ranks is subjective. Interval: Integers; No baseline; Space between values is equal and objective, but discrete. Ratio: Interval data with a baseline; Space between is continuous
  12. 12. Likert-Type Scale? Arbitrary; Few Levels; Individual Questions: Ordinal? Symmetrical; Many Levels; Composite Score: Interval?
  13. 13. Example of Variable Types. Faculty: Department; Years at UNT. Published: # published by type. Cited: # cited by type; UNT accessible. N; N; N; N; I
  14. 14. Compared to What? Book Circulations: 180,354
  15. 15. Compared by… Time Periods; Other Libraries; National Surveys; Patron Types; Material Types
  16. 16. Research Question; Data Type; Comparison Group; Statistical Methods Used
  18. 18. Are you actually measuring what you are trying to measure?
  19. 19. Selecting Measures. Measures: Counts; Survey responses; Grades/Scores; Ranks; Scales (e.g. Likert); Age, Length of Time; Frequency. Units of Analysis: People; Books; Articles; Uses. Levels of Analysis: What is the object (DV)? What is the subject (IV)?
  20. 20. Use a tool with established validity: Approaches and Study Skills Inventory for Students (ASSIST); User Engagement Scale (UES)
  21. 21. Establish Validity of Measures. Reliability: Consistency. Content Validity: Corresponds with expectations; Common understandings. Construct Validity: Corresponds with other variables based on theory. Criterion Validity: Corresponds with other measures
  22. 22. Image: © Nevit Dilmen found at Wikimedia commons
  23. 23. Results: Bias? Invalid Measures? Sampling Error? Random Error? Other Factors?
  25. 25. Census: All members of population; Hard to measure; The Truth. Sample: A selection of the population; Easier to measure; An estimate of the truth
  26. 26. When to Use Which: Research Question? Census: Book usage at UNT Libraries; Effects of IL instruction on English 1100 students. Sample: Book usage at all libraries; Effects of IL instruction on all students
  27. 27. Example - Census or Sample? All Items Published by PACS Faculty; All journal articles published by PACS faculty; All journal articles cited
  28. 28. Random Samples: Every Unit of Analysis has an equal and known chance of being included.
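A minimal sketch of drawing a simple random sample, where every unit has the same known chance of inclusion. The population of article IDs, the sample size, and the seed are all hypothetical and chosen only for illustration.

```python
import random

# Hypothetical population: 500 article IDs (the unit of analysis).
population = [f"article-{i:03d}" for i in range(1, 501)]

rng = random.Random(42)  # fixed seed (an assumption) so the draw is reproducible
sample = rng.sample(population, k=50)  # simple random sample, without replacement

# Every unit has the same known inclusion probability: k / N.
inclusion_probability = len(sample) / len(population)
print(inclusion_probability)  # 0.1
```

Recording the seed alongside the sample lets someone else reproduce exactly which units were selected.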
  29. 29. Importance of Randomness. Random Samples (Random, Weighted, etc.): Should be representative of population; Can use inferential statistics; Most useful for testing hypotheses. Non-Random Samples (Convenience, Purposive, etc.): May or may not be representative of population; Use descriptive statistics only; Most useful for generating hypotheses
  30. 30. Results: Bias? Invalid Measures? Sampling Error? Random Error? Other Factors?
  32. 32. Goal of Data Collection in Statistics: Reliability; Bias
  33. 33. Bias: Systematic (not random) deviation from the true value. Measurement Bias; Observer Bias; Non-response Bias; Analysis Bias
  34. 34. Data Collection Forms. 1 Unit Per Form: Many or Complex Variables; Surveys. Spreadsheet: Fewer Variables; Collected all at once; Bibliometric; Space Surveys
  35. 35. Data Input: Have a data entry plan; Train the inputters; Use data validation tricks; Double-entry
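One way double-entry checking might look in practice: the same records are keyed in twice and every disagreement is flagged for review. The function name, field names, and records below are hypothetical, not taken from the slides.

```python
def double_entry_mismatches(first_pass, second_pass):
    """Return (row_index, field, first_value, second_value) for every disagreement."""
    mismatches = []
    for i, (a, b) in enumerate(zip(first_pass, second_pass)):
        for field in a:
            if a[field] != b.get(field):
                mismatches.append((i, field, a[field], b.get(field)))
    return mismatches

# Hypothetical records keyed in twice by two different inputters.
entry_a = [{"dept": "PACS", "cited": 12}, {"dept": "PACS", "cited": 7}]
entry_b = [{"dept": "PACS", "cited": 12}, {"dept": "PACS", "cited": 1}]

print(double_entry_mismatches(entry_a, entry_b))  # [(1, 'cited', 7, 1)]
```

Each flagged row is then checked against the original source to decide which entry is correct.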
  36. 36. Organizing Data: One Unit of Analysis per Row
  37. 37. Example Spreadsheets
  38. 38. Results: Bias? Invalid Measures? Sampling Error? Random Error? Other Factors?
  40. 40. Elements of Statistical Analysis: Central Tendency; Error; Spread
  41. 41. Inferential: Infer associations. Descriptive: Describe
  42. 42. Descriptive Analysis: Just the Facts, Ma’am. Summarizes: Tables; Charts. Univariate: One variable at a time. Comparison with Population: Demonstrates how random the sample is
  43. 43. Measures of Central Tendency. Mean: Average. Median: Middle. Mode: Most Common
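The three measures can be illustrated with Python's standard `statistics` module. The checkout counts are made-up data; the single large value is included to show how the mean and median can diverge.

```python
import statistics

# Hypothetical checkouts per patron; one heavy user skews the data.
checkouts = [2, 3, 3, 5, 8, 9, 40]

mean = statistics.mean(checkouts)      # average: 70 / 7 = 10
median = statistics.median(checkouts)  # middle value of the sorted data: 5
mode = statistics.mode(checkouts)      # most common value: 3

print(mean, median, mode)
```

Here the outlier pulls the mean up to twice the median, which is why it helps to report both for skewed data.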
  44. 44. Central Tendency by Scales. Interval or Ratio: Mean; Median. Nominal or Rank: Median; Mode
  45. 45. Spread: How variable is the data? Interval & Ratio: Range; Quartiles or Quintiles; Standard Deviation. Nominal & Rank: Distribution Tables; Bar Graphs
  46. 46. Range & Quartiles
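A short sketch of computing the range and quartiles, again on made-up checkout counts. It assumes the default "exclusive" method of `statistics.quantiles`; other software (Excel, for example) may use a different quartile convention and give slightly different cut points.

```python
import statistics

# Hypothetical checkouts per patron.
checkouts = [2, 3, 3, 5, 8, 9, 40]

data_range = max(checkouts) - min(checkouts)         # 40 - 2 = 38
q1, q2, q3 = statistics.quantiles(checkouts, n=4)    # three quartile cut points
interquartile_range = q3 - q1                        # spread of the middle half

print(data_range, (q1, q2, q3), interquartile_range)
```

The range is driven entirely by the two extreme values, while the interquartile range ignores them.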
  47. 47. Standard Deviation: Measure of dispersion of data; Square root of the average squared deviation from the mean
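As a worked example, the standard deviation can be computed step by step and compared with the library result. The values are illustrative; this sketch uses the population form (dividing by n), whereas a sample standard deviation divides by n - 1 (`statistics.stdev`).

```python
import math
import statistics

values = [2, 4, 4, 4, 5, 5, 7, 9]  # illustrative data with mean 5

mean = sum(values) / len(values)
squared_deviations = [(x - mean) ** 2 for x in values]
variance = sum(squared_deviations) / len(values)  # average squared deviation
manual_sd = math.sqrt(variance)                   # square root of the variance

print(manual_sd, statistics.pstdev(values))  # both 2.0
```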
  48. 48. What does the Standard Deviation tell you? Greater variation, less certainty; Lower variation, more certainty
  49. 49. Presentation of Spread: Box plots; Mean; Upper & lower quintiles; Outliers; Cross-tabulations; Bar graphs
  50. 50. Spread of Nominal data
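For nominal data, spread is usually shown as a distribution table of counts and percentages per category. A minimal sketch, assuming a hypothetical list of patron types:

```python
from collections import Counter

# Hypothetical nominal variable: patron type for each transaction.
patron_types = ["Undergraduate", "Graduate", "Faculty",
                "Undergraduate", "Undergraduate", "Graduate"]

counts = Counter(patron_types)
total = sum(counts.values())

# Distribution table: category, count, share of total.
for category, n in counts.most_common():
    print(f"{category:<15}{n:>3}{n / total:>8.1%}")
```

The same counts are what a bar graph of the variable would plot.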
  51. 51. Bar graphs & plots
  52. 52. Inferential Statistics. Tests of hypotheses: Associations; Expectations. Accounts for uncertainty: Random error; Confidence interval
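To make the idea of a confidence interval concrete, here is a sketch of a 95% interval for a sample proportion using the normal approximation. The method choice, the function name, and the counts (160 of 200) are assumptions for illustration, not taken from the slides.

```python
import math
from statistics import NormalDist

def proportion_confidence_interval(successes, n, confidence=0.95):
    """Normal-approximation (Wald) interval for a sample proportion."""
    p_hat = successes / n
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # about 1.96 for 95%
    margin = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin

# Hypothetical random sample: 160 of 200 cited articles were accessible.
lower, upper = proportion_confidence_interval(160, 200)
print(f"{lower:.3f} to {upper:.3f}")
```

A larger sample narrows the interval; the approximation is less reliable for small samples or proportions near 0 or 1.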
  53. 53. Hypotheses: Your Hypothesis (H1); Null Hypothesis (H0)
  54. 54. Example Hypothesis: UNT Libraries provides access to… >=75%* / <75%** …of journal articles cited by UNT PACS faculty in journal articles published between 2008-2011.
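A hypothesis like this one could be checked with a one-sided exact binomial test, sketched below. It assumes the hypothesis of interest is that access is at least 75% and that the citations checked are a random sample; the counts (165 of 200), the significance level, and the function name are hypothetical, and the slides do not say which test was actually used.

```python
import math

def binomial_upper_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p), summed exactly."""
    return sum(math.comb(n, x) * p**x * (1 - p)**(n - x) for x in range(k, n + 1))

n_sampled = 200      # hypothetical: cited articles checked
n_accessible = 165   # hypothetical: of those, accessible through the library
alpha = 0.05         # assumed significance level

# Chance of seeing 165 or more accessible articles if the true rate were exactly 75%.
p_value = binomial_upper_tail(n_accessible, n_sampled, 0.75)
print(p_value < alpha)  # True: the data are hard to explain with a rate of 75% or lower
```

A small p-value means the observed count would be unusual under the null hypothesis, so the null is rejected at the chosen significance level.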
  55. 55. Hypothesis Testing: p; Sample Size; Central Tendency; Spread; Distribution; Significance Level
  56. 56. Statistical Analysis: Noise; Signal
  57. 57. Purpose of Statistics: Results. Bias? Sampling Error? Invalid Measures? Random Error? Other Factors?
  58. 58. Role of Validity in Research. Valid: Measures; Data Collection; Sample Selection; Statistical Methods. Valid: Data; Sample; Statistical Analysis. Valid: Results
  59. 59. Resources: Rice Virtual Lab in Statistics; Excel Tutorials for Statistical Analysis; Khan Academy (videos); Basic Research Methods for Librarians (ebook); Descriptive Statistical Techniques for Librarians (ebook)