RepliCHI - 8 Challenges in Replicating a Study

A presentation of the 8 challenges we experienced when 6 novice MSc students tried to replicate a study of collaborative information seeking as part of their methods class, and how the original authors supported the process.

Presentation Transcript

  • Dr Max L. Wilson http://cs.nott.ac.uk/~mlw/
    Teaching HCI Methods: Replicating a Study of Collaborative Search
    Max L. Wilson
    Replicated paper: Evaluating the synergic effect of collaboration in information seeking. In SIGIR 2011, by Shah & González-Ibáñez
    Wednesday, 8 May 13
  • Background
    • I was teaching HCI Methods in Swansea
    • Replicating a study is “a good way to learn”
    • I’d just finished teaching the Information Seeking module
      - and I’m very interested in Collaborative Information Seeking
  • http://coagmento.org
  • Chirag Shah
    • Assistant Prof at Rutgers LIS
    • Built Coagmento during his PhD
    • Now working with his PhD student
  • Collaborative Information Seeking
  • Study Conditions (excerpt from the replicated paper)
    [...] relevant information, and using it to compose a report.
    6. Participants filled out post-task questionnaires.
    The researcher conducting the study communicated with the participants through the chat-box at different times during the study instructing them to start/stop the task or fill in a questionnaire.
    3.4 Conditions
    To study the difference between individual information seeking and CIS, as well as to understand how various CIS settings can affect a collaborative team’s effectiveness in accomplishing an information-seeking task, we conducted experiments with four different conditions: single participants, two participants at the same computer, two participants in the same room but different computers, and two participants in different rooms with individual computers.
    In order to have a baseline to study the synergic effect of collaboration, we artificially created pairs of users from C1 (single users). We generated all possible combinations of pairs in groups of 5, reaching a total of 49 groups and creating 245 artificial teams in total. This was done in order to cover all possible pairs of users while avoiding a given user appearing in more than one team within the same group of teams.
    These five conditions are summarized in Table 1. Setups for four of these conditions are also depicted in Figure 3. Note that in the real experiment, those in C5 condition were located in different rooms separated by walls, and not just a partition. They could not see or talk to each other directly, and the only communication channel they had was the text-box provided with the system.
    Table 1: Experimental conditions.
    Cond.  Description
    C1     Single participants
    C2     Artificial team
    C3     Co-located using the same computer
    C4     Co-located using different computers
    C5     Remotely located
    3.5 Task
    We chose “gulf oil spill” as the topic for this experimentation since it was quite popular and relevant at the time the study was being conducted. Our preliminary investigations, including a few pilot runs, indicated that there was a huge amount of material on this topic, and that the participants would find it interesting and challenging enough as an exploratory search task. Each participant was given the following task description.
    [...] Your report on this topic should address the following issues: description of how the oil spill took place, reactions by BP as well as various government and other agencies, impact on economy and life (people and animals) in the gulf, attempts to fix the leaking well and to clean the waters, long-term implications and lessons learned.”
    The participants saw this description on the screen (phase 4 in the study), and were also given a printed copy to refer to during their session.
    Figure 3: Experimental setups for four different conditions.
    4. EVALUATION
    In order to evaluate the effectiveness of the participants in various conditions, we employed a number of traditional and non-traditional evaluation measures, which are presented below. Here we also describe other useful constructs and definitions that will later be used while reporting and discussing the resu[...]
    4.1 Universe of webpages
    In order to compute quantities such as coverage, [...] universal set of webpages. Given that the search d[...] experiments was the open web, we needed a mor[...] that we could use to compare with. We decided to [...] of all the webpages visited by all of our participant[...] other words, the universe of webpages was defined [...]
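
The excerpt above describes building artificial teams so that every pair of single users is covered while no user appears twice within one group of teams. One way to get that property is the round-robin "circle method", sketched below. This is only an illustration under stated assumptions: the function name `round_robin_groups` and the ten placeholder user IDs are hypothetical, and the sketch does not attempt to reproduce the original study's procedure or its reported 49 groups and 245 teams.

```python
def round_robin_groups(users):
    """Split all pairs of an even-sized user list into groups (rounds)
    in which every user appears in exactly one pair.

    Uses the classic circle method: keep the first user fixed and
    rotate the others by one position after each round.
    """
    if len(users) % 2 != 0:
        raise ValueError("this sketch needs an even number of users")
    ring = list(users)
    n = len(ring)
    groups = []
    for _ in range(n - 1):
        # Pair the i-th user from the front with the i-th from the back.
        pairs = [tuple(sorted((ring[i], ring[n - 1 - i]))) for i in range(n // 2)]
        groups.append(pairs)
        # Rotate everything except the first element.
        ring = [ring[0], ring[-1]] + ring[1:-1]
    return groups


# Hypothetical pool of ten single-condition participants.
users = [f"u{k}" for k in range(10)]
groups = round_robin_groups(users)
for number, pairs in enumerate(groups, start=1):
    print(number, pairs)
```

With an even number of users n, this produces n - 1 groups of n / 2 teams each, and every possible pair occurs exactly once across the groups.
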
  • What they found
    • Remote collaborators were more independent (less overlap), and more synergetic than random pairs
    • Significant differences between conditions
    • Across several measures
      - page diversity
      - page coverage
      - relevance (precision, recall, F-measure)
      - page usefulness
    • No difference in NASA-TLX
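
For readers new to the relevance measures named on this slide, the sketch below computes the standard set-based precision, recall and balanced F-measure. It assumes each team's result is a set of page URLs and that a set of relevant pages is available; how the original paper built those sets is not stated in the slides, and the function name and example URLs are hypothetical.

```python
def precision_recall_f1(retrieved, relevant):
    """Standard set-based precision, recall and balanced F-measure."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical pages collected by one team and a hypothetical relevant set.
team_pages = ["a.example/1", "a.example/2", "b.example/3", "c.example/4"]
relevant_pages = ["a.example/1", "a.example/2", "d.example/5"]
print(precision_recall_f1(team_pages, relevant_pages))
```
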
  • Why was this a good paper to replicate?
    • 1) Coagmento was a downloadable tool
    • 2) The study is clearly reported in the paper
    • 3) I know Chirag relatively well
    • 4) The study used more than the basic metrics
  • Many Challenges
    • 1) Software
    • 2) Data capture
    • 3) Task Design
    • 4) Team Research Experience
    • 5) Financial Support
    • 6) Time Scales
    • 7) Data Processing
    • 8) Data Analysis
  • Challenge #1: Software Versions
    • Coagmento had evolved/improved (after this study)
    • BUT - Roberto offered to try to roll back the software
      - This created a small project delay (hard for teaching)
      - But meant we were using more comparable software
    • BUT - Sadly this process was only semi-successful
  • Challenge #2: Data Capture
    • The raw data was from the servers
    • We began to consider recording the screens and manually creating the logs
    • BUT - Roberto offered to create a new server instance
    • And provided us with a zip of the data at the end!
  • Challenge #3: Task Design
    • A leading newspaper has hired your team to create a comprehensive report on the causes, effects, and consequences of the recent gulf oil spill. As a part of your contract, you are required to collect all the relevant information from any available online sources that you can find.
    • To prepare ... [prompts to use features of the software]
    • Your report on this topic should address the following issues: description of how the oil spill took place, reactions by BP as well as various government and other agencies, impact on economy and life (people and animals) in the gulf, attempts to fix the leaking well and to clean the waters, long-term implications and lessons learned.
  • Challenge #3: Task Design
    • Temporally irrelevant
    • Culturally less relevant?
      - less intrinsic motivation for participants
    • Should we create a new one, or use the original one?
  • Challenge #3: Task Design
    • A leading newspaper has hired your team to create a comprehensive report on the causes, effects, and consequences of the Olympic Games. As a part of your contract, you are required to collect all the relevant information from any available online sources that you can find.
    • To prepare ... [prompts to use features of the software]
    • Your report on this topic should address the following issues: Impact on economy of host countries (people and animals), long-term implications on the host country, conditions and voting policy to become hosting nation and the next host country and their preparations to host the games.
  • Challenge #4: Research Team
    • 6 Novice MSc students, rather than 1 solid PhD student
      - each taking several modules at the time
    • Potential for high variance between students
    • Tried to create anchors - like a fixed script etc
    • Not clear how important the variance would be
  • Challenge #5: Financial Incentives
    • Original participants received $15 (x70) each
    • And prizes for the best collaborating teams
    • We had no budget
      - managed £50 of prizes
    • Decided the prize for the best team was the most important to replicate
  • Challenge #6: Time Limitations
    • Had to be within a fixed module (a semester)
    • They had other modules to work on
    • We had some time-slips in setting up the study
    • We were only able to run 20 pairs, rather than 30 pairs
  • Challenge #7: Data Processing
    • This was very interesting
    • For example - they removed search result pages
      - this can never be an explicit known set
      - especially as the task domain was different
    • We followed their principles for data processing for analysis
    • But we could not be sure we did this the same way
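
To make the data-processing ambiguity concrete, here is a minimal sketch of one possible rule for dropping search result pages from a browsing log. Everything in it is an assumption for illustration: the host markers, the query-parameter names and the function names are hypothetical, and the original authors' actual rule is not given in the slides. A different but equally reasonable rule would remove a different set of pages, which is exactly why the set "can never be an explicit known set".

```python
from urllib.parse import urlparse, parse_qs

# Hypothetical heuristics; a real study would need to document its own.
SEARCH_HOST_MARKERS = ("google.", "bing.", "yahoo.")
QUERY_KEYS = ("q", "p", "query")


def is_search_result_page(url):
    """Guess whether a URL is a search engine result page."""
    parts = urlparse(url)
    host = parts.netloc.lower()
    if not any(marker in host for marker in SEARCH_HOST_MARKERS):
        return False
    params = parse_qs(parts.query)
    return any(key in params for key in QUERY_KEYS)


def drop_search_pages(urls):
    """Keep only the URLs that the heuristic does not flag."""
    return [url for url in urls if not is_search_result_page(url)]


log = [
    "https://www.google.com/search?q=olympic+games+host+economy",
    "https://www.bbc.co.uk/sport/olympics",
    "https://www.google.com/maps",
]
print(drop_search_pages(log))
```
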
  • Challenge #8: Data Analysis
    • The exact stages of statistical analysis are not always clear
    • For example - they used a modified NASA-TLX
      - consequently, the exact analysis was not clear
      - in particular, as to whether pair-wise comparisons were made
    • Also, as the scales were Likert, and the stats reported as ANOVA
      - we weren’t sure if ANOVA on Ranks was used
      - or a traditional ANOVA
    • (also the novice students didn’t know the difference)
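
The difference between a traditional one-way ANOVA and an ANOVA on ranks (the Kruskal-Wallis test) can be shown with a small pure-Python sketch. It computes only the two test statistics, F and H, and not p-values; the Likert ratings are invented for illustration, and nothing here reflects which test the original paper actually used.

```python
from collections import Counter


def one_way_anova_f(groups):
    """F statistic of a traditional one-way ANOVA on the raw values."""
    values = [v for g in groups for v in g]
    grand_mean = sum(values) / len(values)
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((v - m) ** 2 for g, m in zip(groups, means) for v in g)
    df_between = len(groups) - 1
    df_within = len(values) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)


def average_ranks(values):
    """Rank values from 1..n, giving tied values their mean rank."""
    order = sorted(range(len(values)), key=values.__getitem__)
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def kruskal_wallis_h(groups):
    """H statistic of the Kruskal-Wallis test, with tie correction."""
    values = [v for g in groups for v in g]
    n = len(values)
    ranks = average_ranks(values)
    total, start = 0.0, 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])
        total += rank_sum ** 2 / len(g)
        start += len(g)
    h = 12 / (n * (n + 1)) * total - 3 * (n + 1)
    ties = sum(c ** 3 - c for c in Counter(values).values())
    correction = 1 - ties / (n ** 3 - n)
    if correction == 0:
        raise ValueError("all values are identical; H is undefined")
    return h / correction


# Invented 5-point Likert ratings for three conditions.
ratings = [[2, 3, 3, 4, 2], [3, 4, 4, 5, 4], [1, 2, 2, 3, 2]]
print("F =", one_way_anova_f(ratings))
print("H =", kruskal_wallis_h(ratings))
```

The F statistic treats the ratings as interval data, while H depends only on their rank order, so the two can lead to different conclusions on ordinal Likert scales. A replication needs to know which one was reported.
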
  • The Outcome
    • Is almost irrelevant - but we did not find the same results
    • They found remote collaborators to be more independent, but more synergetic than random pairs
    • There were so many potential reasons though
      - smaller sample size
      - different task difficulty
      - different software performance
      - different financial incentives
      - novice researchers
  • RepliCHI discussion points
    • 1) How should we handle different software versions?
    • 2) Should we be using original tasks?
    • 3) How to support data processing for future researchers?
    • 4) Is there community value from replicating as teaching?
  • Many Challenges
    • 1) Software
    • 2) Data capture
    • 3) Task Design
    • 4) Team Research Experience
    • 5) Financial Support
    • 6) Time Scales
    • 7) Data Processing
    • 8) Data Analysis
    Labels: General Replication Issues; Replication for Teaching Issues; Publishing Issues