
Crowdsourcing Documentation in Software Engineering



Presented at ICSE 2014 Workshop on Crowdsourcing in Software Engineering June 2, 2014, Hyderabad India.

Published in: Software


  1. 1. Crowdsourcing Documentation in Software Engineering Margaret-Anne (Peggy) Storey ICSE 2014 1st International Workshop on Crowdsourcing in Software Engineering
  2. 2. Acknowledgements: Christoph Treude, Brendan Cleary, Fernando Figueira Filho, Jamie Starke, Gargi Bougie, Peter Rigby, Lars Grammel, Leif Singer, Laura MacLeod, Daniel German, Alexey Zagalsky; Chris Parnin (Georgia Tech); Ohad Barzilay (Tel-Aviv University, Israel); Arie van Deursen (TU Delft, the Netherlands); Li-Te Cheng (IBM Research); Ian Bull (Eclipsesource)
  3. 3. “Documentation is the castor oil of software development” Gerald Weinberg, Psychology of Computer Programming 1975
  4. 4. Documentation to capture… Requirements Architecture Features, implementation Scenarios of use Examples of use Testing Decisions And more?
  5. 5. Created by… Developers, contributors Documenters Automatically generated Users The crowd! Designed for… End users Client developers Contributors
  6. 6. Documentation rationale… To replace communication To specify a contract with partners To provide organizational memory To reflect To seek feedback For the public good! [Wasko et al.]
  7. 7. Documentation formats… Formal documentation (hierarchically structured) Technical articles Books Self documenting code Source code comments Forums Email lists Usenet Issues, bug tracking Archived chats Wikis Blog posts, microblogs Tagging Stackoverflow Videos, podcasts Community portals (aggregate channels)
  8. 8. Documentation challenges… Navigability, discoverability Audience and “fit for purpose” Boring prose Consistent use of terminology Staying current Costly, slow Explicit versus tacit knowledge Lack of good examples
  9. 9. Crowdsourcing… “…obtaining needed services, ideas, or content by soliciting contributions from a large group of people, and especially from an online community, rather than from traditional employees or suppliers… the work comes from an undefined public rather than being commissioned from a specific, named group… Explicit crowdsourcing lets users work together to evaluate, share and build different specific tasks, while implicit crowdsourcing means that users solve a problem as a side effect of something else they are doing.” [Wikipedia, June 1, 2014]
  10. 10. Community versus crowd contributions? Individual or team contributions (e.g. design documents, podcasts) Community contributions: created by a few (e.g. translation efforts) Crowdsourcing contributions: many small contributions that add value (e.g. views, likes, comments, tags, votes)
  11. 11. Social production [Yochai Benkler] Industrial revolution, high costs to access broadcast media Low cost distributed small contributions at scale Not just turning levers but adding wisdom, creativity Not a fad! Critical long term shift caused by the internet
  12. 12. Social media as a disruptive force: an enabler for crowdsourcing Enhancing the participatory culture in software development and in software documentation Storey, M.-A., L. Singer, F. Figueira Filho, B. Cleary and A. Zagalsky, The (R)evolutionary Role of Social Media in Software Engineering (ICSE 2014 Future of Software Engineering Track), Hyderabad, 2014.
  13. 13. Social Media Channels for Software Documentation Community Portal Tagging Microblogging Question & Answer Websites Videos, podcasts Blogging Wikis
  14. 14. Outline of the rest of this talk Some insights on how social media channels can support “crowdsourced” documentation in software development Discussion
  15. 15. Community Portals Tagging MicroBlogging Question & Answer Websites Videos, podcasts Blogging Wikis
  16. 16. Wikis Wikis for documenting Software
  17. 17. Wikis and software documentation Used extensively (requirements, design, planning), integrated with many tools Some shortcomings: lack of authoritativeness [Dagenais and Robillard FSE 2010] Designed by Ward Cunningham in 1994
  18. 18. Community Portals Question & Answer Websites Videos, podcasts Tagging Wikis MicroBlogging Blogging
  19. 19. Social Tagging How does tagging help with crowdsourced software documentation?
  20. 20. TagSEA: Tagging Waypoints in source code and gathering into Tours M.-A. Storey, J. Ryall, J. Singer, D. Myers, L.-T. Cheng, M. Muller. How Software Developers Use Tagging to Support Reminding and Refinding. IEEE Transactions on Software Engineering (TSE), 2009.
  21. 21. Tagging in Studied introduction and adoption of tags by several teams for work items C. Treude and M.-A. Storey. Work Item Tagging: Communicating Concerns in Collaborative Software Development. In IEEE Transactions on Software Engineering 38, 1 (January/February 2012). pp. 19-34.
  22. 22. Tagging in Findings: –  Categorization (cross cutting concerns, see also Martin Robillard’s Feat tool) –  Organization –  Finding and refinding
  23. 23. ConcernLines Treude, C., and M.-A. Storey, ConcernLines: A timeline view of co-occurring concerns, formal research demonstration, IEEE ICSE’09.
  24. 24. Question & Answer Websites Tagging MicroBlogging Community Portals Videos, podcasts Wikis Blogging
  25. 25. Microblogging Why do developers tweet?
  26. 26. Microblogging Software engineers tweet actively (share) facts about software engineering topics and technology G. Bougie, J. Starke, M.-A. Storey and D. German. Towards Understanding Twitter Use in Software Engineering: Preliminary Findings, Ongoing Challenges and Future Questions. In Proceedings of the 2nd International Workshop on Web 2.0 for Software Engineering. 2011.
  27. 27. Survey/Interviews/Survey Findings: – Awareness – Learning – Relationships “It was evolving way faster than I was able to keep up with it. And the only way to keep up was to follow some Node.js people on Twitter.” Leif Singer, Fernando Figueira Filho, Margaret-Anne Storey. Software Engineering at the Speed of Light: How Developers Stay Current Using Twitter ICSE 2014.
  28. 28. Question & Answer Websites Tagging MicroBlogging Blogging Community Portal Videos, podcasts Wikis
  29. 29. Blogging Why do developers blog?
  30. 30. Blogging Determining requirements through blogs [Park and Maurer, CHASE 2009] How developers blog: high-level concept discussion and requirements [Pagano and Maalej, MSR 2011] Blogs play a role in documenting APIs [Treude and Parnin, Web2SE 2011] Is there potential to increase the size of the Blogging crowd for software documentation?
  31. 31. Question & Answer Websites Tagging MicroBlogging Blogging Community Portal Videos, podcasts Wikis
  32. 32. Question and Answer Websites What role do Question and Answer websites play in documentation?
  33. 33. Over 92% of the questions on Stackoverflow are answered, and for those 92% the median answer time is 11 minutes L. Mamykina, B. Manoim, M. Mittal, G. Hripcsak, and B. Hartmann. Design lessons from the fastest q&a site in the west. CHI 2011.
  34. 34. Stackoverflow How-to questions prevalent, and used frequently by novices C. Treude, O. Barzilay and M.-A. Storey. How do Programmers Ask and Answer Questions on the Web? NIER/ICSE 2011.
  35. 35. Linking Stackoverflow data with API usage C. Parnin, C. Treude, L. Grammel and M.-A. Storey. Crowd Documentation: Exploring the Coverage and the Dynamics of API Discussions on Stack Overflow. Under submission, blogged (50,000 hits) at documentation/ May 2012.
  36. 36. Stackoverflow as Crowd Documentation Coverage of API documentation: 77% of the Java API classes & 87% of Android API classes Speed of coverage:
  37. 37. Impact on documentation tools? Automatically generating documentation Visualizing crowd documentation
  38. 38. Community Portals Question & Answer Websites Videos, podcasts Tagging Wikis MicroBlogging Blogging
  39. 39. How do Developers use YouTube to Share Knowledge? Videos, podcasts
  41. 41. Developer motivations? Documentation! But also … Reputation: Improves their online persona Dedication to helping others “What I wish I had known when I started” Efficiency “Throw it up on the internet and forget about it”
  42. 42. Implications Many projects use videos to support documentation and onboarding (e.g. MSDN) so… How can they be improved for the recipient? How effective are videos at sharing tacit knowledge? Tool enhancements? Integration with IDE? [e.g. Tours] Cheng, L.-T., M. Desmond and M.-A. Storey, “Presentations by Programmers for Programmers”, ICSE 2007, IEEE 29th International Conference on Software Engineering.
  43. 43. Is this crowdsourcing? Are code walkthroughs on YouTube effective? How much do the social features matter? A social platform for crowd input for video documentation?
  44. 44. Question & Answer Websites Tagging MicroBlogging/ Blogging Community Portal Videos, podcasts Blogging Wikis
  45. 45. Community portals Stores code and project resources Provides version control Hosts web pages Connects people Links to communication tools Records interactions
  46. 46. C. Treude and M.-A. Storey. Effective Communication of Software Development Knowledge Through Community Portals. ESEC/FSE ’11.
  47. 47. Implications of different media Content on wikis is often stale, but useful for posting information quickly Blog posts create more buzz or fanfare Official product documentation is trusted (review it carefully or rely on the crowd?) Have an updating process (or crowdsource it?) Have mechanisms to solicit feedback (e.g. commenting, blog posts, voting)
  48. 48. Social Media Channels to support Software Documentation Community Portal Tagging Microblogging Question & Answer Websites Videos, podcasts Blogging Wikis
  49. 49. Discussion
  50. 50. Documentation challenges revisited Recommenders to aid in discoverability Keeping up: leverage the crowd Incentive: participatory culture Video and podcasts for tacit knowledge Mining of social media can point to code examples (implicit mechanism)
  51. 51. Discussion points When does a community become a crowd? Gaps and nichification? Incentives? Dynamics? Study other portals, hubs? Do these mechanisms translate to industry? What do you see as challenges, opportunities for involving the crowd?
  52. 52. @thechiselgroup, @margaretstorey Funded by NSERC/DRDC/IBM
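
The coverage figures on slide 36 (77% of Java API classes, 87% of Android API classes) can be read as a simple ratio: the share of API classes that appear in at least one Stack Overflow thread. The slides do not say how mentions were linked to classes, so the sketch below assumes that linking has already been done and that each thread arrives as a set of fully qualified class names. The function name `api_coverage` and the toy data are hypothetical and only illustrate the arithmetic.

```python
from typing import Iterable, Set


def api_coverage(api_classes: Set[str], thread_mentions: Iterable[Set[str]]) -> float:
    """Return the fraction of API classes mentioned in at least one thread.

    Assumption: each element of thread_mentions is the set of class names
    already linked to one thread; the linking step itself is not shown.
    """
    if not api_classes:
        raise ValueError("api_classes must not be empty")
    covered: Set[str] = set()
    for mentions in thread_mentions:
        # Count only names that belong to the API under study.
        covered |= mentions & api_classes
    return len(covered) / len(api_classes)


# Toy data, invented for illustration only.
api = {
    "java.util.ArrayList",
    "java.util.HashMap",
    "java.io.File",
    "java.nio.file.Path",
}
threads = [
    {"java.util.ArrayList", "java.util.HashMap"},
    {"java.io.File"},
    {"com.example.NotInApi"},  # ignored: not part of the API set
]

coverage = api_coverage(api, threads)
print(f"{coverage:.0%}")  # 3 of 4 classes are covered: 75%
```

A ratio like this says nothing about the quality or depth of the discussion for each class; it only records whether a class was mentioned at all.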