
License: CC Attribution-ShareAlike License

- 1. BEST PRACTICES FOR STATISTICS
- 2. BEST PRACTICES: Know what you know and what you don’t know; have a comparison group; use validated measures; have a data entry plan; get to know your data; if it doesn’t fit, change it; place your bets before you collect the data; use the best methods of analysis for your question and your data; go beyond the p-value.
- 3. What is Statistics? The study of data: collecting, organizing, summarizing, analyzing, presenting, storing and sharing. Why is it important? To make sense of the data; to explain what happens and (possibly) why; to make sound decisions; to know how close we are to the truth.
- 4. PURPOSE OF STATISTICS: Results: Bias? Sampling error? Invalid measures? Random error? Other factors?
- 5. BEST PRACTICE: KNOW WHAT YOU ALREADY KNOW, WHAT YOU WANT TO KNOW AND WHAT YOU DON’T KNOW
- 6. STARTING WITH YOUR RESEARCH QUESTION: How do users differ when (searching, finding, selecting) (articles, books, Web sites)? What are the effects of ___________ on ____________? Which is better at improving _________? How are people (finding, selecting, using) _______? What are factors associated with ___________?
- 7. KINDS OF VARIABLES: Independent (subjects; factors; "effects of…"). Dependent (objects; outcomes; "effects on…").
- 8. LEVELS OF MEASUREMENT (NOIR): Nominal: counts by category; no meaning between the categories (Blue is not better than Red). Ordinal: ranks; scales; space between ranks is subjective. Interval: integers; no baseline; space between values is equal and objective, but discrete. Ratio: interval data with a baseline; space between is continuous.
- 9. ANOTHER WAY: Qualitative (counts by categories; ranks; scales). Quantitative (measurements; composite scores; simple counts).
- 10. LIKERT-TYPE SCALE? Ordinal? Arbitrary; few levels; individual questions. Interval? Symmetrical; many levels; composite score.
- 11. BEST PRACTICE: HAVE A COMPARISON GROUP
- 12. WAYS OF COMPARING… Time periods; other libraries; national surveys; patron types; material types.
- 13. KINDS OF COMPARISON: Expected ranks or ratios (qualitative; comparison). Two variables (quantitative; correlations). Samples or groups (quantitative or qualitative; paired or not paired).
- 14. BEST PRACTICE: USE A VALID MEASURE
- 15. VALIDITY OF MEASURES: Are you actually measuring what you are trying to measure?
- 16. USE A TOOL WITH ESTABLISHED VALIDITY: Approaches and Study Skills Inventory for Students (ASSIST); User Engagement Scale (UES).
- 17. ESTABLISH VALIDITY OF MEASURES: Reliability (consistency). Content or face validity (common sense). Construct validity (based on theory). Criterion validity (comparison with other valid measures).
- 18. BEST PRACTICE: HAVE A DATA PLAN
- 19. GOAL OF DATA COLLECTION IN STATISTICS Reliability Bias
- 20. BIAS: Systematic (not random) deviation from the true value (Statistics.com). Selection bias. Measurement: observer bias; non-response bias. Analysis bias.
- 21. DATA INPUT: Have a data entry plan; train the inputters; use data validation tricks; use double-entry.
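The validation and double-entry ideas on the slide above can be sketched in a few lines of Python. This is only an illustration: the function names, the sample values, and the 0 to 60 plausibility range are assumptions, not part of the presentation.

```python
def double_entry_mismatches(first_pass, second_pass):
    """Return (row_index, first_value, second_value) for every disagreement
    between two independent keyings of the same records."""
    if len(first_pass) != len(second_pass):
        raise ValueError("both passes must contain the same number of rows")
    return [
        (i, a, b)
        for i, (a, b) in enumerate(zip(first_pass, second_pass))
        if a != b
    ]


def is_valid_years(value, low=0, high=60):
    """Simple range check; the bounds are a hypothetical plausibility rule."""
    return isinstance(value, int) and low <= value <= high


# Hypothetical data: the same five records keyed by two different people.
entry_a = [3, 12, 7, 25, 1]
entry_b = [3, 21, 7, 25, 1]

mismatches = double_entry_mismatches(entry_a, entry_b)
out_of_range = [v for v in entry_a if not is_valid_years(v)]
print("Rows to re-check:", mismatches)
print("Out-of-range values:", out_of_range)
```

Rows flagged by the comparison would be checked against the original forms rather than corrected by guesswork.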
- 22. BEST PRACTICE: GET TO KNOW YOUR DATA
- 23. EXPLORATORY DATA ANALYSIS: Central tendency; spread; error.
- 24. MEASURES OF CENTRAL TENDENCY: Mean: the average; for quantitative data; Excel function: =AVERAGE(range). Median: the middle value; for quantitative or rank data; Excel function: =MEDIAN(range). Mode: the most common value; primarily for qualitative data; Excel function: =MODE(range).
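For readers working outside Excel, the three measures on the slide above have direct counterparts in Python's standard `statistics` module. The data below are hypothetical and chosen only to keep the arithmetic easy to follow.

```python
import statistics

# Hypothetical quantitative data (e.g. years of service).
years = [1, 2, 2, 3, 4, 6, 10]
# Hypothetical qualitative data (e.g. patron type).
patron_types = ["student", "faculty", "student", "staff", "student"]

mean_years = statistics.mean(years)        # the average
median_years = statistics.median(years)    # the middle value
mode_patron = statistics.mode(patron_types)  # the most common category

print(mean_years, median_years, mode_patron)
```

Note that the mean (4) sits above the median (3) here because one large value pulls the average upward.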
- 25. SPREAD & DISTRIBUTION
- 26. DISTRIBUTION OR SPREAD OF QUALITATIVE DATA: Tables (counts; percentages/ratios; averages of counts). Excel (pivot tables).
- 27. PIVOT TABLES IN EXCEL: Select data (highlight table; Insert -> Pivot Table). Select variables (categories as Row Labels; values). Change settings (percentage of grand total; average).
- 28. DEMONSTRATION OF PIVOT TABLES FOR SPREAD OF QUALITATIVE DATA
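A rough Python analogue of the pivot-table summary (counts per category and percentage of grand total) might look like the sketch below. The category values are hypothetical, and `collections.Counter` is swapped in here for Excel's pivot table.

```python
from collections import Counter

# Hypothetical categorical column: one entry per survey respondent.
patron_types = [
    "student", "faculty", "student", "staff",
    "student", "faculty", "student", "staff",
]

counts = Counter(patron_types)          # counts by category
grand_total = sum(counts.values())
percent_of_total = {
    category: 100 * n / grand_total     # percentage of grand total
    for category, n in counts.items()
}

for category, n in counts.most_common():
    print(f"{category:<8} {n:>3} {percent_of_total[category]:5.1f}%")
```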
- 29. GRAPH & CHART RULES OF THUMB: Trends (connection across the X-axis). Categorical comparisons (grouped; stacked; relative stacked). Categorical (few categories; differences are wide).
- 30. QUANTITATIVE DISTRIBUTIONS: Stem & leaf; histogram; distribution graphs.
- 31. EXPLORATORY DATA ANALYSIS: John W. Tukey, Exploratory Data Analysis. Examining your data visually: stem & leaf; hinges; box plots; scatter plots, etc.
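- 32. STEM-AND-LEAF (Years at UNT): the stem is the first digit(s); the leaf is the last digit.
  - 0 | 01112222222222222233333344445556666677788899
  - 1 | 0000000011122223333356778899
  - 2 | 00122234444799
  - 3 | 0245
  - Sorted raw values shown beside the display (excerpt, three columns): 0, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 3, 4, 4, 4, 4, 5, 5; 5, 6, 6, 6, 6, 6, 7, 7, 7, 8, 8, 11, 11, 12, 12, 12, 12, 13; 13, 13, 13, 13, 15, 16, 17, 17, 18, 18, 19, 29, 29, 30, 32, 34, 35.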
- 33. FROM STEM-AND-LEAF TO HISTOGRAMS
- 34. Histogram of Years at UNT (stem | leaf | count | range):
  - 0 | 1122223334445555666666677777899 | 31 | 0-9
  - 1 | 000011122222222333346677889 | 27 | 10-19
  - 2 | 0122234468 | 10 | 20-29
  - 3 | 1112355888 | 11 | 30-39
  - 4 | 12 | 2 | 40-49
- 35. HISTOGRAMS IN EXCEL: Analysis ToolPak (Options; Add-ins; Manage Add-ins). Set ranges (equal size ranges; ceiling ("more")); example ceilings: 9, 19, 29, 39, 49. Create histogram (Data; Data Analysis; Histogram). Create graph for histogram (insert bar chart; highlight histogram; select bars & Format Selection; Gap Width = 0%).
- 36. DEMONSTRATION OF HISTOGRAM IN EXCEL
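The step from raw values to equal-width histogram ranges can also be sketched in Python. The values below are hypothetical (they are not the Years at UNT data), the function name `bin_counts` is an illustrative choice, and the sketch assumes non-negative whole numbers.

```python
from collections import Counter


def bin_counts(values, width=10):
    """Count values in equal-width ranges such as 0-9, 10-19, ...

    Ranges that contain no values are omitted from the result.
    """
    counts = Counter((v // width) * width for v in values)
    return {
        f"{start}-{start + width - 1}": counts[start]
        for start in sorted(counts)
    }


# Hypothetical data, not taken from the presentation.
years = [0, 3, 5, 9, 10, 12, 19, 21, 28, 35, 41]
histogram = bin_counts(years)

# A text histogram: one row per range, one '#' per observation.
for label, count in histogram.items():
    print(f"{label:>5} | {'#' * count} ({count})")
```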
- 37. SPREAD OF QUANTITATIVE DATA: How variable is the data? Range; quantiles; standard deviation.
- 38. RANGE & QUARTILES
- 39. PRESENTATION OF SPREAD: Box plots (median; upper & lower quartiles; outliers).
- 40. STANDARD DEVIATION: A measure of the dispersion of data; the square root of the average squared deviation from the mean.
- 41. WHAT DOES THE SD TELL YOU? Greater variation, less certainty; lower variation, more certainty.
- 42. SPREAD IN EXCEL: Range: =MIN(range); =MAX(range). Quantiles: =PERCENTILE.INC(range, k), where k is between 0 and 1; =QUARTILE.INC(range, quart), where quart is one of {1, 2, 3, 4}. Standard deviation: =STDEV.S(range).
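The same measures of spread are available in Python's standard library (Python 3.8 or later for `statistics.quantiles`). The data are hypothetical. The `method="inclusive"` option is used on the assumption that it corresponds to the interpolation behind Excel's `.INC` functions, and `statistics.stdev` is the sample standard deviation, as `STDEV.S` is.

```python
import statistics

# Hypothetical data, chosen so the results are easy to verify by hand.
years = [1, 2, 2, 3, 4, 6, 10]

# Range: the distance between the smallest and largest value.
data_min, data_max = min(years), max(years)
value_range = data_max - data_min

# Quartiles: the three cut points that split the data into four parts.
q1, q2, q3 = statistics.quantiles(years, n=4, method="inclusive")

# Sample standard deviation (divides by n - 1).
sd = statistics.stdev(years)

print(f"min={data_min} max={data_max} range={value_range}")
print(f"Q1={q1} median={q2} Q3={q3}")
print(f"SD={sd:.3f}")
```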
- 43. NORMAL DISTRIBUTION
- 44. SKEWED DISTRIBUTIONS
- 45. DEMONSTRATION OF DISTRIBUTIONS: Distribution of the population (the “truth”). N is the number of samples; n is the number of items in each sample. Watch the cumulative means & medians slowly merge to the population.
- 46. BEST PRACTICE: IF IT DOESN’T FIT, CHANGE IT (transformation of data)
- 47. WHY TRANSFORM? [Figure: two histograms, one of Years at UNT (ranges 0-9, 10-19, 20-29, 30-39) and one of Log10(Years at UNT).]
- 48. HOW TRANSFORMATION WORKS: $Y = a + bx$; $\log(Y) = \log(a + bx)$; $1/Y = 1/(a + bx)$
- 49. HOW TO BECOME NORMAL: Evaluate the distribution of raw data; select a transformation method; transform the data; check whether it is normally distributed; statistically test the transformed data; express the result in the terms of the transformation.
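The transformation workflow can be illustrated with a base-10 log transform. This is a sketch under assumptions: the data are hypothetical and strictly positive (a log transform is undefined for zero), the `skewness` helper is a simple moment-based measure chosen for illustration, and back-transforming the mean of the logs gives the geometric mean rather than the ordinary mean.

```python
import math
import statistics


def skewness(values):
    """Moment-based skewness: 0 for symmetric data, > 0 for a long right tail."""
    mean = statistics.fmean(values)
    m2 = sum((v - mean) ** 2 for v in values) / len(values)
    m3 = sum((v - mean) ** 3 for v in values) / len(values)
    return m3 / m2 ** 1.5


# Hypothetical right-skewed data (all values must be > 0 for a log transform).
years = [1, 2, 2, 3, 4, 6, 10, 18, 35]

# Step 1: evaluate the raw distribution.
skew_raw = skewness(years)

# Steps 2-4: transform and re-evaluate.
log_years = [math.log10(v) for v in years]
skew_log = skewness(log_years)

# Final step: express the result in terms of the transformation.
# The mean on the log scale, back-transformed, is the geometric mean.
mean_log = statistics.fmean(log_years)
back_transformed_mean = 10 ** mean_log

print(f"skewness raw={skew_raw:.2f} log10={skew_log:.2f}")
print(f"mean of log10 values={mean_log:.3f}")
print(f"back-transformed (geometric) mean={back_transformed_mean:.2f}")
```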
- 50. BEST PRACTICE: PLACE YOUR BETS BEFORE YOU START
- 51. INFERENTIAL STATISTICS: Tests of hypotheses (associations; expectations). Accounts for uncertainty (random error; confidence interval).
- 52. HYPOTHESIS TESTING: Your hypothesis (H1); null hypothesis (H0).
- 53. EXAMPLE HYPOTHESIS: UNT Libraries provides access to… >=75%* or <75%* (*…of journal articles cited by UNT PACS faculty in journal articles published between 2008-2011).
- 54. HYPOTHESIS TESTING: p; sample size; central tendency; spread; distribution; significance level.
- 55. TESTING HYPOTHESES
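One way the 75% example hypothesis could be tested is an exact one-sided binomial test, sketched below. Everything specific here is an assumption: the counts are invented, the null value is taken to be exactly 0.75, the significance level of 0.05 is a conventional choice, and the presentation does not say which test the author used.

```python
from math import comb


def binomial_upper_tail(successes, trials, p_null):
    """P(X >= successes) when X ~ Binomial(trials, p_null)."""
    return sum(
        comb(trials, k) * p_null ** k * (1 - p_null) ** (trials - k)
        for k in range(successes, trials + 1)
    )


# Hypothetical sample: 200 cited articles checked, 165 available.
cited = 200
accessible = 165

# Place the bet before looking at the data: choose alpha in advance.
alpha = 0.05

# H0: the true proportion is 0.75; H1: it is greater than 0.75.
p_value = binomial_upper_tail(accessible, cited, 0.75)
reject_null = p_value < alpha

print(f"observed proportion={accessible / cited:.3f}")
print(f"p-value={p_value:.4f} reject H0 at alpha={alpha}: {reject_null}")
```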
- 56. BEST PRACTICE: CHOOSE THE BEST METHOD FOR YOUR QUESTION AND DATA
- 57. KNOW THE TESTS: Assumptions; limitations; appropriate data type; what the test tests.
- 58. FACTORS ASSOCIATED WITH CHOICE OF STATISTICAL METHOD: Variable type; what is being compared; independence of units; underlying variance in the population; distribution; sample size; number of comparison groups.
- 59. USE A FLOW CHART
- 60. BEST PRACTICE: GOING BEYOND THE P-VALUE
- 61. AND THE P-VALUE SAYS… Much about the distributions; more about the H0 than H1; little about size of differences.
- 62. MORE USEFUL STATISTICS: Effect sizes (tell the real story). Confidence intervals (state your certainty).
- 63. EFFECT SIZES OF QUANTITATIVE DATA: Correlations: Cohen’s guidelines for Pearson’s r (small: r > .10; medium: r > .30; large: r > .50). Differences from the mean: standardized; weighted against the standard deviation; Cohen’s d, $d = \frac{\bar{x}_1 - \bar{x}_2}{s}$.
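Cohen's d from the slide above can be computed as in the sketch below. The slide gives only s for the denominator; using the pooled sample standard deviation of the two groups is a common choice and is an assumption here, as are the function name and the tiny example groups.

```python
import math
import statistics


def cohens_d(group1, group2):
    """Standardized mean difference: (mean1 - mean2) / pooled SD.

    The pooled SD is an assumed choice for the 's' in the formula.
    """
    n1, n2 = len(group1), len(group2)
    var1 = statistics.variance(group1)  # sample variance (n - 1)
    var2 = statistics.variance(group2)
    pooled_sd = math.sqrt(
        ((n1 - 1) * var1 + (n2 - 1) * var2) / (n1 + n2 - 2)
    )
    return (statistics.fmean(group1) - statistics.fmean(group2)) / pooled_sd


# Hypothetical scores for two small groups.
trained = [2, 4, 6]
untrained = [1, 3, 5]

d = cohens_d(trained, untrained)
print(f"Cohen's d = {d:.2f}")
```

Here the means differ by 1 and the pooled standard deviation is 2, so the groups differ by half a standard deviation.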
- 64. EFFECT SIZES OF QUALITATIVE DATA: Based on a contingency table. Odds ratio: odds of event A divided by odds of event B; case-control studies. Relative risk: uses probabilities rather than odds; experiments, RCTs. Example table:
  - Test A/B | Yes | No | Total
  - Yes | 10 | 15 | 25
  - No | 50 | 25 | 75
  - Totals | 60 | 40 | 100
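Both effect sizes can be worked through with the counts in the slide's table. The slide does not say which margin is the group and which is the outcome, so the sketch assumes each row is a group and the Yes/No columns are the outcome; reading the table the other way would change the relative risk.

```python
def relative_risk(a, b, c, d):
    """Risk of 'Yes' in row 1 divided by risk of 'Yes' in row 2."""
    return (a / (a + b)) / (c / (c + d))


def odds_ratio(a, b, c, d):
    """Odds of 'Yes' in row 1 divided by odds of 'Yes' in row 2."""
    return (a / b) / (c / d)


# Counts from the slide's table, read as rows = groups (an assumption).
a, b = 10, 15   # first row:  Yes, No (total 25)
c, d = 50, 25   # second row: Yes, No (total 75)

rr = relative_risk(a, b, c, d)   # (10/25) / (50/75)
or_ = odds_ratio(a, b, c, d)     # (10/15) / (50/25)

print(f"relative risk = {rr:.2f}")
print(f"odds ratio    = {or_:.2f}")
```

Under this reading the relative risk is 0.60 while the odds ratio is 0.33, which shows how far the two measures can diverge when the outcome is common.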
- 65. CONFIDENCE INTERVALS: Point estimates: single value; mean. Intervals: degree of uncertainty; range of certainty around the point estimate. Based on: point estimate (e.g. mean); confidence level (usually .95); standard deviation. Expressed as: "The mean score of the students who had the IL training was 83.5, with a 95% CI of 78.3 to 89.4."
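A confidence interval for a mean can be built from the three ingredients the slide lists: a point estimate, a confidence level, and the standard deviation. The sketch below uses a normal approximation for the critical value; for small samples a t critical value would give a somewhat wider interval. The scores are hypothetical and are not the IL training data quoted on the slide.

```python
import math
import statistics


def mean_confidence_interval(values, confidence=0.95):
    """Return (mean, lower, upper) using a normal-approximation interval."""
    n = len(values)
    mean = statistics.fmean(values)
    standard_error = statistics.stdev(values) / math.sqrt(n)
    # Two-sided critical value, about 1.96 for 95% confidence.
    z = statistics.NormalDist().inv_cdf(0.5 + confidence / 2)
    return mean, mean - z * standard_error, mean + z * standard_error


# Hypothetical test scores.
scores = [72, 78, 81, 84, 85, 88, 90, 93]

mean, lower, upper = mean_confidence_interval(scores)
print(f"mean = {mean:.1f}, 95% CI {lower:.1f} to {upper:.1f}")
```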
- 66. STATISTICAL ANALYSIS: Noise; signal.
- 67. BEST PRACTICES: Know what you know and what you don’t know; have a comparison group; use validated measures; have a data entry plan; get to know your data; if it doesn’t fit, change it; place your bets before you collect the data; use the best methods of analysis for your question and your data; go beyond the p-value.
- 68. RESOURCES: Rice Virtual Lab in Statistics; Excel Tutorials for Statistical Analysis; Khan Academy (videos); Basic Research Methods for Librarians; Descriptive Statistical Techniques for Librarians.
