Data Extraction


Cochrane Review author training workshop, January 22-23, 2009 at the University of Calgary Health Sciences Centre

Published in: Technology, Health & Medicine
  1. 1. Data Extraction <ul><li>Handbook, Chapter 7 </li></ul>
  2. 2. Data Collection Form <ul><li>Importance </li></ul><ul><li>Linked directly to review question and planned assessment </li></ul><ul><li>Historical record of decisions through the review process </li></ul><ul><li>Data repository, from where the analysis will emerge </li></ul><ul><li>Vary across reviews, but fundamental components </li></ul>
  3. 3. Components of Data Form <ul><li>Paper vs electronic forms -> Chapter 7.5.2 for considerations </li></ul><ul><li>Consider how much information to collect (too much vs too little) </li></ul><ul><li>Requires careful thought and planning </li></ul><ul><li>Order items logically for entry into RevMan, especially on electronic forms </li></ul>
  4. 4. Components of Data Form (continued) <ul><li>Items to include: </li></ul><ul><ul><li>Review title, author name, name of person collecting the data </li></ul></ul><ul><ul><li>Review unique ID, Study ID (RevMan), unique report ID (multiple reports) </li></ul></ul><ul><ul><li>Date (for multiple chronologic versions) </li></ul></ul><ul><ul><li>Notes section (up front) – use for RevMan </li></ul></ul><ul><ul><li>Verification of study eligibility </li></ul></ul><ul><ul><li>Table of excluded studies, with reasons, is required </li></ul></ul><ul><ul><li>Study characteristics, information to assess bias, results </li></ul></ul><ul><ul><li>Accurate coding: instructions and decision rules, use of ‘Not reported’ and ‘Unclear’ </li></ul></ul>
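The items listed on this slide could be held in an electronic form along the following lines. This is only a sketch: the class name `ExtractionForm`, its field names, and the `blank_items` helper are all hypothetical and are not taken from RevMan or the Handbook. The idea illustrated is that a blank item is flagged, so the extractor has to enter 'Not reported' or 'Unclear' explicitly instead of leaving it empty.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Dict, List

# Explicit codes named on the slide; a blank cell is never a valid code.
NOT_REPORTED = "Not reported"
UNCLEAR = "Unclear"


@dataclass
class ExtractionForm:
    """Hypothetical electronic data collection form for one study report."""
    review_id: str
    study_id: str        # study ID as it will appear in RevMan
    report_id: str       # distinguishes multiple reports of one study
    extractor: str       # who is collecting the data
    form_date: date      # supports multiple chronological versions
    eligible: bool       # verification of study eligibility
    exclusion_reason: str = ""   # feeds the table of excluded studies
    notes: str = ""
    characteristics: Dict[str, str] = field(default_factory=dict)

    def blank_items(self, required: List[str]) -> List[str]:
        """Return required items that were left empty on the form."""
        return [item for item in required
                if not self.characteristics.get(item, "").strip()]
```

Running `blank_items` over the agreed list of required items before a form is accepted is one way to apply the decision rule about 'Not reported' and 'Unclear' consistently.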
  5. 5. Details for Protocol <ul><li>Data categories to be collected </li></ul><ul><ul><li>Again, the study is the unit of interest </li></ul></ul><ul><li>Who will collect the data, and how many people </li></ul><ul><ul><li>2 people, independent </li></ul></ul><ul><ul><li>Content and non-content expert? </li></ul></ul><ul><li>Piloting, training, existence of coding instructions for the data form </li></ul><ul><ul><li>Piloting for information and instructions </li></ul></ul><ul><ul><li>Kappa is not routinely calculated; if it is, only for the most important data </li></ul></ul>
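For teams that do decide to report kappa for a key item, a minimal sketch of Cohen's kappa for two independent extractors follows. The function name `cohens_kappa` and the example codes are illustrative assumptions; the formula is the standard one, (observed agreement - chance agreement) / (1 - chance agreement).

```python
import math
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters coding the same items (hypothetical helper)."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must code the same non-empty set of items")
    n = len(rater_a)
    # Proportion of items on which the two raters gave the same code.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance, from each rater's marginal frequencies.
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if expected == 1.0:
        # Both raters used one identical code throughout; kappa is undefined.
        return math.nan
    return (observed - expected) / (1 - expected)


# Example: eligibility decisions by two extractors on four reports.
kappa = cohens_kappa(["yes", "yes", "no", "no"], ["yes", "no", "no", "no"])
```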
  6. 6. Details for Protocol (continued) <ul><li>How data extracted for multiple reports of the same study </li></ul><ul><ul><li>How to collate: CONSORT flow diagrams may help </li></ul></ul><ul><li>How disagreements handled </li></ul><ul><ul><li>Process (consensus -> arbitrator -> study authors -> report disagreement in review) </li></ul></ul><ul><li>Blinding to aspects of study reports not generally recommended </li></ul>
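The first step of the disagreement process, identifying where two independent extractors differ so that they can discuss those items, might be sketched as below. The function `find_disagreements` and the representation of a completed form as a plain dictionary are assumptions made for illustration.

```python
def find_disagreements(form_a, form_b):
    """Return {item: (value_a, value_b)} for items coded differently.

    Each form is a dict mapping item name to the extractor's coded value.
    An item present on only one form counts as a disagreement, with None
    standing in for the missing value.
    """
    items = sorted(set(form_a) | set(form_b))
    return {
        item: (form_a.get(item), form_b.get(item))
        for item in items
        if form_a.get(item) != form_b.get(item)
    }


# Example with made-up values: the extractors differ on one item.
extractor_1 = {"randomised": "Yes", "n_intervention": "50"}
extractor_2 = {"randomised": "Unclear", "n_intervention": "50"}
to_discuss = find_disagreements(extractor_1, extractor_2)
```

Items left in `to_discuss` after consensus would then go to the arbitrator, and after that to the study authors, following the process on the slide.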
  7. 7. Data Form: Process <ul><li>Think – what are the needs? </li></ul><ul><li>Design – draft form </li></ul><ul><li>Pilot – sample of papers, compare completed forms </li></ul><ul><li>Refine – modify form, instructions </li></ul><ul><li>Extract – further revisions may be needed once data extraction underway, this is okay </li></ul><ul><li>Consider retraining or recoding with passage of time, also for updating </li></ul>
  8. 8. Characteristics of Included Studies <ul><li>Extracted data: </li></ul><ul><ul><li>Methods </li></ul></ul><ul><ul><li>Participants </li></ul></ul><ul><ul><li>Interventions </li></ul></ul><ul><ul><li>Comparisons </li></ul></ul><ul><ul><li>Outcomes </li></ul></ul><ul><ul><li>Information for risk of bias (later) </li></ul></ul><ul><ul><li>Notes </li></ul></ul>
  9. 9. Characteristics of Included Studies
  10. 10. Characteristics of Included Studies
  11. 11. Characteristics of Included Studies
  12. 12. Table of Excluded Studies
  13. 13. Studies Awaiting Classification
  14. 14. Table of Ongoing Studies
  15. 15. Data Types <ul><li>Dichotomous </li></ul><ul><li>Continuous </li></ul><ul><li>Ordinal </li></ul><ul><li>Counts and rates </li></ul><ul><li>Time-to-event </li></ul>
  16. 16. Dichotomous (binary) Outcomes <ul><li>The outcome is one of two possibilities and cannot be both, e.g., pregnant vs. not pregnant </li></ul><ul><li>2x2 table: </li></ul><ul><ul><li>Intervention: event = a, no event = b, total a+b = n_I </li></ul></ul><ul><ul><li>Control: event = c, no event = d, total c+d = n_C </li></ul></ul><ul><li>Data required: ‘event’ and ‘no event’ for each group </li></ul>
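Reports often give the number of events and the group size instead of all four cells. The sketch below fills in the 2x2 table from those numbers and then computes two common summaries. The slide itself does not name an effect measure; the risk ratio and odds ratio are used here purely for illustration, and the function names are hypothetical.

```python
def two_by_two(events_i, n_i, events_c, n_c):
    """Return cells (a, b, c, d): events and non-events per group."""
    for events, n in ((events_i, n_i), (events_c, n_c)):
        if n <= 0 or not 0 <= events <= n:
            raise ValueError("events must lie between 0 and the group size")
    a, b = events_i, n_i - events_i   # intervention: event, no event
    c, d = events_c, n_c - events_c   # control: event, no event
    return a, b, c, d


def risk_ratio(a, b, c, d):
    """Risk in the intervention group divided by risk in the control group."""
    if c == 0:
        raise ValueError("risk ratio undefined: no events in the control group")
    return (a / (a + b)) / (c / (c + d))


def odds_ratio(a, b, c, d):
    """Odds of the event in the intervention group over odds in control."""
    if b == 0 or c == 0:
        raise ValueError("odds ratio undefined: zero cell in the denominator")
    return (a * d) / (b * c)


# Made-up example: 10/50 events on intervention, 20/50 on control.
cells = two_by_two(10, 50, 20, 50)
```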
  17. 17. Dichotomous outcomes (continued) <ul><li>May experience difficulties with: </li></ul><ul><ul><li>Poor reporting </li></ul></ul><ul><ul><li>Numbers may need to be derived from percentage data provided in the report (which denominator to use, compatibility with more than one numerator) </li></ul></ul><ul><li>Sometimes ordered categorical data (ordinal) are treated as dichotomous data </li></ul>
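The difficulty with percentage data can be made concrete. The sketch below lists every whole number of events that would round to the reported percentage for a given denominator. The function name `compatible_numerators` and the figures in the example are made up for illustration.

```python
def compatible_numerators(percent, denominator, decimals=0):
    """Event counts k (0..denominator) consistent with a reported percentage.

    A count is accepted when 100*k/denominator lies within half a unit of
    the last reported decimal place, so either rounding direction at a tie
    is allowed.
    """
    if denominator <= 0:
        raise ValueError("denominator must be positive")
    half_unit = 0.5 * 10 ** (-decimals)
    return [
        k for k in range(denominator + 1)
        if abs(100 * k / denominator - percent) <= half_unit + 1e-9
    ]


# "33%" gives a different count depending on which denominator is used:
randomised = compatible_numerators(33, 30)   # all 30 randomised
analysed = compatible_numerators(33, 27)     # only the 27 analysed

# With a large group, one percentage fits more than one numerator:
ambiguous = compatible_numerators(20, 250)
```

When more than one numerator is returned, the count cannot be recovered from the percentage alone, and the choice made should be recorded on the data form.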
  18. 18. Continuous outcomes <ul><li>Outcomes that can take any value in a specific range eg, weight, length of stay </li></ul><ul><li>Sometimes data from ordered categories (ordinal) are treated as continuous </li></ul><ul><li>Check if data can be treated as continuous (Consult CRG statistician) </li></ul><ul><li>Effect measures: mean difference (difference in means) or standardized mean difference (factors in standard deviation) </li></ul>
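The two effect measures named on this slide can be sketched from the data usually extracted per group (mean, standard deviation, number of participants). The function names are hypothetical, and the standardized mean difference shown divides by the pooled standard deviation without the small-sample correction (Hedges' adjusted g), which is omitted here for brevity.

```python
import math


def mean_difference(mean_i, mean_c):
    """Difference in means, in the units of the original scale."""
    return mean_i - mean_c


def pooled_sd(sd_i, n_i, sd_c, n_c):
    """Pooled standard deviation of the two groups."""
    if n_i + n_c <= 2:
        raise ValueError("need more than two participants in total")
    return math.sqrt(
        ((n_i - 1) * sd_i ** 2 + (n_c - 1) * sd_c ** 2) / (n_i + n_c - 2)
    )


def standardized_mean_difference(mean_i, sd_i, n_i, mean_c, sd_c, n_c):
    """Difference in means expressed in pooled standard deviation units."""
    return (mean_i - mean_c) / pooled_sd(sd_i, n_i, sd_c, n_c)


# Made-up example: means 10 and 8, SD 2 in both groups, 20 per group.
md = mean_difference(10, 8)
smd = standardized_mean_difference(10, 2, 20, 8, 2, 20)
```

Because the standardized version needs the standard deviations and group sizes as well as the means, all three should be extracted for each group.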
  19. 19. Ordinal outcomes <ul><li>Categories with a natural order </li></ul><ul><li>Can range in number of categories </li></ul><ul><li>Measurement scales: important to know if validated </li></ul><ul><li>Discussion on how to analyze: Section 9.2.4 </li></ul><ul><li>Analyze as dichotomous? Continuous? As is? Consult CRG statistician </li></ul><ul><li>Extract data in all forms in which reported </li></ul>
  20. 20. Counts of events <ul><li>Events that can happen more than once to a given individual </li></ul><ul><li>Eg, MI, adverse event </li></ul><ul><li>Common vs. rare events </li></ul><ul><li>Different methods exist for analysis </li></ul><ul><li>Consult CRG statistician </li></ul><ul><li>Extract data in the form they are reported in </li></ul>
  21. 21. Time-to-event outcomes <ul><li>Analysis of whether the event occurred and when </li></ul><ul><li>‘Survival data’ in statistics </li></ul><ul><li>E.g., survival, disease recurrence </li></ul><ul><li>For each individual: </li></ul><ul><ul><li>‘no event’ period </li></ul></ul><ul><ul><li>at the end of that period, whether the event occurred or observation simply ended (censored) </li></ul></ul><ul><li>Hazard ratio is the most appropriate effect measure </li></ul><ul><li>Methods of meta-analysis: Section 9.4.9; consult CRG statistician </li></ul>
  22. 22. Planning Your Analysis <ul><li>Handbook, Chapter 9 </li></ul>
  23. 23. Planning Your Analysis <ul><li>Specify comparisons </li></ul><ul><li>First and most important step! </li></ul><ul><li>Back to PICO – should relate clearly and directly </li></ul>
  24. 24. What Comparisons are Important? <ul><li>Pair-wise comparisons </li></ul><ul><ul><li>Glucosamine versus placebo </li></ul></ul><ul><ul><li>Glucosamine versus no therapy </li></ul></ul><ul><ul><li>Glucosamine versus NSAIDs </li></ul></ul><ul><ul><li>Acupuncture plus vitamin B12 injections versus B12 injections alone </li></ul></ul><ul><li>Specify the main comparisons in the protocol </li></ul><ul><li>If you need to modify them in light of the data (eg, a new comparison)… document! </li></ul>
  25. 25. Characteristics of Intervention and Control <ul><li>Are the interventions or controls all the same? </li></ul><ul><li>Different types of drugs used (eg inhaled steroids? NSAIDs?) </li></ul><ul><li>Different dosages/duration of therapy/preparation </li></ul><ul><li>If not the same, are they similar enough to be combined? </li></ul>
  26. 26. Population Description <ul><li>Separate or combine? </li></ul><ul><li>E.g. Mild versus severe rheumatoid arthritis </li></ul><ul><li>Age issues? Defining separation points? </li></ul>
  27. 27. Outcomes <ul><li>Determine what to combine – shouldn’t be too diverse </li></ul><ul><li>Start with the outcomes that are considered most important (specified in protocol) </li></ul><ul><li>Eg, mortality, pain, function </li></ul><ul><li>May be same variable, but could be categorical in one study and continuous in another </li></ul><ul><li>Eg, pain – continuous: 0-10 mm VAS, 1-5 Likert; dichotomous: ‘moderate’, ‘severe’ </li></ul><ul><li>RevMan: outcomes are entered after the comparisons have been set up </li></ul>
  28. 28. Hierarchy of Data and Analysis Section <ul><li>Results of studies </li></ul><ul><li>Tabular </li></ul><ul><li>Fixed format </li></ul><ul><li>Forest plots automatically generated </li></ul>
  29. 29. Sample Table Hyperlink to Forest plot
  30. 30. Link out to Forest Plot
  31. 31. Make a Plan <ul><li>Sketch it out on paper </li></ul><ul><li>Decide what you want to do before you start entering into RevMan </li></ul><ul><li>Document changes between the protocol and conducting the review </li></ul>
  32. 32. Critical Elements <ul><li>Input from more than one person, including “expert” </li></ul><ul><li>Transparency; state post-hoc decisions </li></ul><ul><li>Clear, consistent </li></ul>
  33. 33. Data extraction exercise