James Howison
(with Kevin Crowston)
Collaboration through
Open Superposition:
A theory of the open
source way
CC Credit: http://www.flickr.com/photos/baggis/
Big Data in Biology Symposium
U Texas at Austin 11 May 2016
Work supported by the NSF
03-41475, 04-14468, 05-27457 and 07-08437
@jameshowison
“Let’s do this the open source way?”
Sounds great, right?
Lots of people volunteering for the enjoyment of it,
working together, sharing stuff, meritocracy, contributing
stuff, fighting the man, all without raising money or
top-down management.
Open innovation, open platforms, open
hardware, open data, open government,
open NASA, citizen science …
http://www.flickr.com/photos/gubatron/31024
CC Credit: http://www.flickr.com/photos/ejpphoto/
CC Credit: http://www.flickr.com/photos/kojihachisu/
A research arc
• Participant Observation
– one case
– participation and observation
• Replication
– two cases chosen by replication logic
– Study of project archives
• Candidate theory development
– develop candidate theory and demonstrate its
usefulness
Goal:
An image of FLOSS production
CC Credit: http://flickr.com/photos/anthea/
Discovery through
Participant Observation
Task: The Container Column
How it was built
BibDesk 2.0?
CC Credit:
http://www.flickr.com/photos/wsl-libdev/5140646741/
Task: “Web Groups”
https://sourceforge.net/mailarchive/message.php?msg_id=DF0FB757-56BA-45D7-A1EA-262EB7A5B3DC@mac.com
June 2003 (Email)
I really want to use this, but the conditions have never quite
been right - either I was waiting for … RSS+RDF (now
looks like it'll never happen) or … an XML bibliographic file
format … (could happen now, but I ran out of free time).
What didn’t happen
Image Credit:
TreeGrid.com marketing materials
Task: “Web Groups”
https://sourceforge.net/mailarchive/message.php?msg_id=DF0FB757-56BA-45D7-A1EA-262EB7A5B3DC@mac.com
https://sourceforge.net/mailarchive/message.php?msg_name=7394DD78-A02E-11D7-AFC1-0003931E45D0%40mac.com
June 2003 (Email)
I really want to use this, but the conditions have never quite been right - either I was
waiting for … RSS+RDF (now looks like it'll never happen) or … an XML bibliographic file
format … (could happen now, but I ran out of free time).
Jan 2007 (Email with patch):
It was much easier than I expected it to be because
the existing groups code (and search groups code)
was very easy to extend. Kudos - I wouldn't have
tried it if so much hadn't already been solved well.
Thanks!
Discovery Findings
1. Individual work with personal
motivations
2. Superposition of layers
3. Productive Deferral
CC Credit: http://flickr.com/photos/jvk/
But that’s just one case!
(and what’s the point of theorizing
about idiosyncratic situations?)
To the
Archives!
The evidence is here, somewhere.
CC Credit:
http://www.flickr.com/photos/hamadryades/
Replication: Fire and Gaim
• Specific RQs:
– What proportion of work was individual?
– Any evidence of “productive deferral”?
• Fire and Gaim
– Multi-protocol instant messaging clients
– Community-based open source
– Similar task and collaboration infrastructure to
BibDesk
Illustrative Co-work
Illustrative Individual Work
30 (of 106) tasks consisted of a single Action: Core Production
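As a rough sense of scale, the count on this slide can be turned into a proportion. The sketch below uses only the two figures stated above (30 and 106); the helper name `share` is an illustration and is not taken from the study.

```python
def share(count: int, total: int) -> float:
    """Return count as a fraction of total."""
    if total <= 0:
        raise ValueError("total must be positive")
    return count / total


# Figures from the slide: 30 of the 106 coded tasks had a single action.
single_action_tasks = 30
total_tasks = 106

fraction = share(single_action_tasks, total_tasks)
print(f"{fraction:.1%} of tasks consisted of a single action")
```

This covers only tasks made up of a single action; it says nothing about the remaining tasks, which may still have been carried out by one person.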
Tasks were individual
Evidence for Deferral
An image of FLOSS production:
Open Superposition
• Work is done in Tasks that are
– Individual
– Short
– Layered
• Complex work is often deferred
– Until it is easier (doesn’t always happen!)
Other types of work build on this base
To be explained
1. Why are these patterns of work
observed?
2. How can complex software result from
this way of working?
3. Under what socio-technical
contingencies is this likely to be
successful?
Why these patterns of
individual work and deferral?
• Fewest dependencies, lowest
coordination challenges and costs
• Closest match to motivational situation
of FLOSS participants.
– Increases autonomy without eliminating
relatedness
• Ke and Zhang (2010), Ryan and Deci (2000)
Ok, but can this really work?
• Software development is highly
complex, interdependent work
(e.g., Herbsleb et al. 2001)
• Can such simple steps really get the job
done?
Imagine trying to plan this
1. Identify desired outcomes (design)
2. Design a task sequence that reaches them
3. Find people who are:
– Motivated to do each task
– Able to do each task
– At just the right time
Crippling search costs!
Application-led search
• Openness and availability of application
• Task identification through situated use
(e.g., Suchman 1987)
“Porches fill in by stages, not all at once, you know. ... it
happens that way because [the family] can always
visualize the next stage based on what’s already
there”
(Brand 1995, quoting an architect)
But why does deferral make
things easier?
• Layered tasks make deferral more
likely to be productive
• Small layers can compose in different
ways. They provide option value.
(e.g., Baldwin and Clark 2001)
• Small layers are easier to understand,
especially over time.
(e.g., Dabbish 2011; Boudreau et al. 2011)
Contingencies for
Open superposition
• Attributes of object of work
– Layerability
– Low instantiation costs
– Low distribution costs
• Irrevocable openness
• Time
Layers vs Steps
CC Credit:
http://www.flickr.com/photos/18378305@N00/7426136724/
CC Credit:
http://www.flickr.com/photos/jrnoded/2997160501/
Irrevocable openness
Free and Open Source Licenses prevent this.
CC Credit:
http://www.flickr.com/photos/bantam10/5637893667/
Time == Money
CC Credit:
http://www.flickr.com/photos/opacity/1600562651/
This guy hates to wait
Takeaways for scientific
software development
Howison, J., & Herbsleb, J. D. (2013). Incentives and Integration in Scientific Software Production. In Proceedings of the
2013 Conference on Computer Supported Cooperative Work (pp. 459–470). New York, NY, USA: ACM.
http://doi.org/10.1145/2441776.2441828
Takeaways for science software
• Orientation to community
– Are they users or potential contributors?
• Be prepared to wait
– Easy to run ahead of potential contributors
• Avoiding downsides of “teamwork”
– Can you keep tasks small and
motivationally independent? No blocking.
• How to search for the next step?
– How are ideas retained? When revisited?
Open Superposition
Howison, J., & Crowston, K. (2014). Collaboration through open superposition:
A theory of the open source way. MIS Quarterly, 38(1), 29–50.

Open Superposition and lessons for scientific software development

Editor's Notes

  • #4 Has this guy figured out the delicate dance of working well at a distance, on highly interdependent work, all while working with unreliable volunteers?
  • #5 Turn this … into
  • #6 This (Gaudí’s masterpiece) … lots of different ways to do this: individually; in team mode (concurrently or sequentially); accretively/stigmergically? Lots of study of these modes in organization science, including greats like Simon, Mintzberg and Thompson.
  • #7 Need to return to these.
  • #9 First contact through the application itself. Using it in my day to day as a doctoral student.
  • #11 Tasks tended to be primarily undertaken by an individual programmer in a relatively short period of time at the developer’s own behest, motivation, and timing. Point of this is to show how work typically gets done. To give a narrative image of Solo Work.
  • #13 Roslyn Public Library, WA. http://www.flickr.com/photos/wsl-libdev/5140646741/ When you pull out the foundation, things stop working. Big job; what didn’t happen was a work breakdown.
  • #14 The founder emphasized that the task had become “much easier” in the intervening years because of the incremental layered work of other developers; work undertaken for other features that just happened to also support Web Groups. The work taken while the task was languishing had prepared the ground so that a developer working alone in a matter of days could complete a feature that earlier had been too much work to even begin.
  • #15 They didn’t break the task down into components, assign them to people and bring things back together in the end.
  • #17 End of Slide Timing: 7 minutes
  • #20 Alone or Together? End of Slide Timing: 11 minutes
  • #21 2 mins
  • #22 2 mins
  • #23 This plus Illustrations: 5 minutes
  • #24 The early Actions in these Tasks were all coded as Support, usually feature requests or posts by non-developers demonstrating that a feature was desirable (light squares in Figure 3). Close inspection shows that all the production work (dark triangles in Figure 3) for these tasks was completed relatively quickly at the end of the task, during the release period, even on those tasks that had been outstanding for months.
  • #25 End of Slide Timing: 20
  • #29 Williamson 1981
  • #30 Tends to throw up tasks that are ready made for the sequence and come from developers with motivations (or can be communicated quickly and thus motivate a developer).
  • #32 Layers, in this sense, are different from generic steps because each layer creates an (adequately) finished artifact.
  • #34 Foundations don’t work well when someone can pull a component out. FLOSS licenses prevent this. If a contributor is free to remove their layers, then all subsequent work is not superposition but a special kind of co-work (because the layers are not motivationally independent): each layer depends on continued non-revocation of its foundation, a long-term personal interdependency.
  • #35 Productive deferral and open task search take time. Investment undermines ability to wait.
  • #38 Why? Extra work to share; expectation of future reward (software, like data, can facilitate discoveries, and you want to be there for that); longer cycles of development and use (getting the full paper done) mean that code bases diverge and are hard to re-integrate. Longer cycles of development and contribution. How can projects "harvest" improvements at the end of publications?