Audio+slide video is posted at http://margaretannestorey.wordpress.com.
Slides from a keynote at the Mining Software Repositories (MSR) 2012 conference, co-located with ICSE 2012 in Zurich, Switzerland.
Towards the Social Programmer (MSR 2012 Keynote by M. Storey)
1. The Evolution of the Social Programmer
Social Media and Software Engineering
Margaret-Anne (Peggy) Storey
Keynote for MSR 2012, Zurich, Switzerland
University of Victoria, Victoria, BC Canada
4. CHISEL group, UVic, Canada:
– Christoph Treude
– Brendan Cleary
– Fernando Figueira Filho
– Jamie Starke
– Gargi Bougie
– Peter Rigby
– Lars Grammel
Chris Parnin, Georgia Tech, USA
Leif Singer, Leibniz Universität, Germany
Ohad Barzilay, Tel-Aviv University, Israel
Daniel German, UVic, Canada
Arie van Deursen, TU Delft, the Netherlands
Li-Te Cheng, IBM Research
5. Software repositories
Goals
“Software repositories such as source control systems, archived
communications between project personnel, and defect tracking systems are
used to help manage the progress of software projects. Software practitioners
and researchers are recognizing the benefits of mining this information to
support the maintenance of software systems, improve software design/reuse,
and empirically validate novel ideas and techniques.” MSR CFPs 2004-2012
6. Roadmap
Broaden goals of MSR
Redefine software repository to include
social media
Explore the impact of social media on
software engineering
Suggest how future MSR research may play a role
in emerging practices and software ecosystems
13. Space
Place
P. Dourish and V. Bellotti. Awareness and Coordination in Shared Workspaces. Proceedings of the ACM
Conference on Computer-Supported Cooperative Work (CSCW'92).
16. "We shape our tools and thereafter our tools shape us",
Laws of Media by Marshall McLuhan
http://www.youtube.com/watch?v=A7GvQdDQv8g
17. McLuhan Quotes:
The medium is the message. 1958
It is the framework which changes with
each new technology and not just the
picture within the frame. 1955
There are many reasons why most people prefer to live in
the age just behind them. It's safer. To live right on the
shooting line, right on the frontier of change, is terrifying.
1970
19. What role is social media playing
in Software Engineering?
M.-A. Storey, C. Treude, A. van Deursen and L.-T. Cheng.
The Impact of Social Media on Software Engineering Practices and Tools. In FoSER ’10: Proceedings of the
FSE/SDP workshop on Future of software engineering research.
20. Social Media Channels in
Software Engineering
– Source code comments
– Reputation
– Tagging
– Wikis, social networking, etc.
– Question & Answer websites
– Microblogging
– Blogging
21. Research methods used
Studies to inform tool designs and software practices
Mixed methods:
– Mining and analysis of
software artifacts
– Ethnographic observations
– Interviews
– Surveys
22. Social media channels (diagram repeated): source code comments, reputation, tagging, wikis and social networking, question & answer websites, microblogging, blogging
23. Source code comments
How do programmers use source code comments to
communicate with other developers?
24. Source Code Comments:
Graffiti or Information?
The role of annotations (e.g. comments, bookmarks, tasks, etc.) in
program comprehension, i.e. for human usage
25. Marginalia, by
H. J. Jackson 2001
Fermat: "I have a truly marvellous
proof of this proposition which this
margin is too narrow to contain."
28. Marginalia in source code
Developers co-opt source code comments for
navigation and task management
– But they can’t be shared, and how they are used varies
according to developers’ sophistication with tools
M.-A. Storey, L.-T. Cheng, J. Singer, M. Muller, D. Myers, J. Ryall. 2007. How Programmers Can Turn Comments
into Waypoints for Code Navigation. In Proceedings of the International Conference on Software Maintenance (ICSM 2007).
29. Social media channels (diagram repeated): source code comments, tagging, reputation, wikis and social networking, question & answer websites, microblogging, blogging
31. Tagging on the web
Social bookmarking, folksonomies
32. TagSEA: Tagging “waypoints”
in source code and gathering into “tours”
M.-A. Storey, J. Ryall, J. Singer, D. Myers, L.-T. Cheng, M. Muller, 2009.
How Software Developers Use Tagging to Support Reminding and Refinding. IEEE Transactions on Software
Engineering (TSE), 2009.
33. Tagging in Jazz
Studied introduction and adoption of tags by
several teams for work items
C. Treude and M.-A. Storey. Work Item Tagging: Communicating Concerns in Collaborative Software Development. In
IEEE Transactions on Software Engineering 38, 1 (January/February 2012). pp. 19-34.
34. Tagging in Jazz
Findings:
– Categorization (cross cutting concerns)
– Organization
– Finding and refinding
– Team work practices emerged
C. Treude and M.-A. Storey. Work Item Tagging: Communicating Concerns in Collaborative Software Development. In
IEEE Transactions on Software Engineering 38, 1 (January/February 2012). pp. 19-34.
40. Microblogging
Software engineers tweet actively (share) facts about
software engineering topics and technology
G. Bougie, J. Starke, M.-A. Storey and D. German. Towards Understanding Twitter Use in Software Engineering: Preliminary Findings,
Ongoing Challenges and Future Questions. In Proceedings of the 2nd International Workshop on Web 2.0 for Software Engineering. 2011.
41. Should microblogging be integrated in the IDE
for enterprises?
W. Reinhardt. Communication is the key – Support Durable Knowledge Sharing in Software Engineering by
Microblogging. SENSE 2009.
A. Guzzi, M. Pinzger, A. van Deursen. Combining
micro-blogging and IDE interactions to support developers in their quests. ICSM 2010.
42. Social media channels (diagram repeated): source code comments, reputation, tagging, wikis and social networking, question & answer websites, microblogging, blogging
44. Blogging (1)
“Our internal blog (only readable by those at the
company) is really a virtual "water cooler". We are
expected to blog there at least once a week (spelling and
grammar don't matter) just to keep others at the
company updated on what we are doing. If you can't
find something interesting to say at least once a week,
then you're not doing enough interesting work. Blogs are
everything from: I hit this really annoying bug, to I
completed this new awesome feature.”
(Ian Bull, Software Engineer, EclipseSource)
45. Blogging (2)
Determining requirements through blogs
[Park and Maurer, CHASE 2009]
How developers blog: high-level concept
discussion and requirements
[Pagano and Maalej, MSR 2011]
Blogs play a role in documenting APIs
[Treude and Parnin, Web2SE 2011]
46. Social media channels (diagram repeated): source code comments, reputation, tagging, wikis and social networking, microblogging, question & answer websites, blogging
47. Question and Answer
Websites
What role do Question and Answer websites play in
software engineering?
51. Over 92% of the questions on
Stackoverflow are answered, and for those
92% the median answer time is 11 minutes
L. Mamykina, B. Manoim, M. Mittal, G. Hripcsak, and B. Hartmann.
Design lessons from the fastest q&a site in the west. CHI 2011.
52. Stackoverflow
How-to questions prevalent, and used frequently
by novices
C. Treude, O. Barzilay and M.-A. Storey. How do Programmers Ask and Answer Questions on the Web?
NIER/ICSE 2011.
53. Linking Stackoverflow data with
API usage
C. Parnin, C. Treude, L. Grammel and M.-A. Storey.
Crowd Documentation: Exploring the Coverage and the Dynamics of API Discussions on Stack Overflow.
Under submission, May 25 2012, related blog (15,000 hits so far).
54. Stackoverflow as Crowd Documentation
Coverage of API documentation: 77% of the
Java API classes & 87% of Android API classes
Speed of coverage:
C. Parnin, C. Treude, L. Grammel and M.-A. Storey.
Crowd Documentation: Exploring the Coverage and the Dynamics of API Discussions on Stack Overflow.
Under submission, May 25 2012, related blog (15,000 hits so far).
59. Social Coding in GitHub
GitHub supports transparency
Management of profiles and their visibility
important for project success
Explicit self promotion not valued
L. Dabbish, H.C. Stuart, J. Tsay and J. Herbsleb. Social coding in GitHub:
transparency and collaboration in an open software repository. CSCW 2012.
61. Developer motivations
“As a software developer, you need to learn a lot. It's a
constant challenge to keep up with technologies. […] It's
the proof that you have that kind of mindset, if you do it
[learning, keeping up] anyway in your free time”
“[When] I look at repos around this topic [...], I may be
interested in seeing the coder footprint of people that
work in this area [...] their favorite languages, the
topics they write code about, what they work on”
L. Singer, F. F. Filho, B. Cleary, C. Treude, M.-A. Storey, K. Schneider.
Mutual Assessment in the Social Programmer Ecosystem: An Empirical Investigation of Developer Profile Aggreg
Under submission, June 2012.
62. Recruiter motivations
Social connections for finding candidates that are
passionate, learn quickly and with diverse skills
Mutual Assessment in the
Software Ecosystem
L. Singer, F. F. Filho, B. Cleary, C. Treude, M.-A. Storey, K. Schneider.
Mutual Assessment in the Social Programmer Ecosystem: An Empirical Investigation of Developer Profile Aggreg
Under submission, June 2012.
63. Social media channels (diagram repeated): source code comments, reputation, tagging, wikis and social networking, question & answer websites, microblogging/blogging, community portals
64. Wikis etc...
• Wikis useful for documentation, requirements
engineering, knowledge sharing
• Impact of social networking in software engineering
(Codebook, Github) - can also follow software artifacts!
• Crowdsourcing of coding (TopCoder) and testing
(e.g. Google’s A/B testing approach)
• End-user involvement in closed, open source and
mixed initiative projects
• Community portals in software communities
65. Making sense of the social media ecosystem
(the social era) in software engineering
– Source code comments
– Reputation
– Tagging
– Wikis, social networking, etc.
– Question & Answer websites
– Microblogging
– Blogging
69. Retrieves??
Programmer “rock stars”
Oral culture (talkbacks
on blogs)
End-user programmers
Portfolios
“On Twitter, I follow a few prominent software developers. For example,
Kelly Sommers from Canada, she’s constantly trying new things. I don’t
think she ever sleeps. So she’s a great source of inspiration.”
(From the Reputation study)
71. Reverses??
Geek culture
Reliance on search
Interruptions
Security holes
Spaghetti code
“Google as the most important member on
your programming team”, Brendan Cleary
73. Obsolesces??
Formal documentation
In-house expertise,
certain jobs
Need for co-location
Classroom education
Email lists
CVs
"It's always good to document a widget, but it's more important in
many cases to document a process [...]. It's the context of how
you use the widget that's much more important."
74. Social Media
Enhances: (distributed) community formation, awareness, transparency, knowledge curation, learning, reuse, reputation
Reverses: community fragmentation, informal processes, geek culture, reliance on search, security concerns, interruptions, advertisements
Retrieves: programming gurus, end users as developers, verbal discussions, portfolios, communities of practice
Obsolesces: in-house expertise/jobs, formal documentation, classroom education, CVs, email lists, need for co-location
75. 7 Burning Questions about Social
Media use in Software Engineering...
– Source code comments
– Reputation
– Tagging
– Wikis, social networking, etc.
– Question & Answer websites
– Microblogging
– Blogging
76. Q1: Towards the “Social
Programmer”?
• What makes a good developer?
– Ability to write good code… or…
– Ability to search for good code and to network?
Do social skills matter?
• Can you assess a programmer’s ability
independently of the larger community?
77. Q2: Classroom education still relevant?
Knowledgeable -> Knowledge-able
http://www.academiccommons.org/commons/essay/knowledgable-knowledge-able, by Michael Wesch
78. Q3: Gamification and marginalization?
Less than 13% of Wikipedia content is authored
by women, fewer than 9% of editors are women
Why does this matter?
Less female oriented content
Fewer opportunities to gain
expertise, build portfolios, reputation
“Define Gender Gap? Look Up Wikipedia’s Contributor List”, New York Times, January 2011.
http://blog.20sb.net/2011/10/changing-the-ratio-on-wikipedia.html
79. Q4: Information overload?
“obsessive web browsing can cause attention
spans to drop to as little as nine seconds—
equivalent to a goldfish”, Ted Selker, MIT 2002
80. Q5: Impact on design and documentation?
What are the risks of using social media for
requirements gathering/elicitation?
Does the use of social media lead to a
“laissez-faire” documentation approach?
81. Q6: Impact on software quality?
Does social media use lead to:
Spaghetti code and brittle integrations?
More (viral) bugs? Security concerns?
Undesirable clones?
More license violations?
Poor code ownership?
82. Q7: Impact on mining methods?
Can mining of social media lead to improved
predictions, detections and recommendations?
Challenge!! mining an ecosystem of media!
84. Mining community takeaways
The social era in software engineering begs us
to ask different kinds of questions
Social media ecosystems as an integral
component of software repositories (or vice
versa?)
Abundant and exciting mining opportunities!
Combine with other research methods
85. Software repositories
Goals
Goals first...
MSR 2012-...
“If we understand the revolutionary transformations caused by new
media, we can anticipate and control them; but if we continue in our
self-induced subliminal trance, we will be their slaves.” Marshall
McLuhan, 1974
87. IEEE Software special issue
Bridging Software Communities through Social
Networking
Papers due June 25th, 2012
http://www.computer.org/portal/web/computingnow/swcfp1
Editors:
Jan Bosch, Chalmers University of Technology, Sweden
Margaret-Anne Storey, University of Victoria, Canada
Andrew Begel, Microsoft Research, USA
89. Additional References
C. Treude and M.-A. Storey.
Effective Communication of Software Development Knowledge Through C
ESEC/FSE ’11.
Communities of practice: http://www.ewenger.com/theory/
See the following two links for other references on social media use in
software engineering:
M.-A. Storey, C. Treude, A. van Deursen and L.-T. Cheng.
The Impact of Social Media on Software Engineering Practices and Tools.
In FoSER ’10: Proceedings of the FSE/SDP workshop on Future of software
engineering research.
Christoph Treude’s Blog: http://www.ctreude.ca/
See also this year’s and last’s MSR proceedings for some new work on this
topic.
Editor's Notes
Main research focus: cognitive support for software engineers and knowledge engineers. Have done a lot of work in visualization, but lately started focusing on social media use and how it can be leveraged to enhance software engineering activities.
Research is now proceeding to uncover the ways in which mining these repositories can help to understand software development and software evolution, to support predictions about software development, and to exploit this knowledge concretely in planning future development.
I firmly believe that social media is causing a revolution in how software engineers work, and in the software that is being built today. We are definitely witnessing a paradigm shift, and the game itself is changing, not just the rules of the game. 1) work on problems that are of critical importance and of increasing urgency 2) (which is itself being constantly redefined) Before I delve into the topic of the keynote (the role of social media in software engineering), I first want to relate what I do to the research field of Mining Software Repositories. There are really three parts to my talk. First, I discuss what social media is and how we can evaluate it. Second, I review how social media is being used in software engineering today. This isn’t a comprehensive review, so I focus on the research that my collaborators and I have done. Finally, I look critically at the impact of social media on software engineering, and at how MSR research can play a role in emerging software practices and software ecosystems.
I’m not saying no MSR work has considered these aspects; some has, in particular in the past few years, but the emphasis has been on the product itself.
“ ...community that acts as a living curriculum for the apprentice”
In addition to broadening goals, we perhaps also need to rethink what is meant by a software repository (and whether that term is a bit limiting). Social media, as I hope to convince you in the remainder of this talk, is so much more than communication media, although it certainly overlaps with communication media in some ways.
Emphasize that many innovations used by software engineers were developed by software engineers.
Forums: for knowledge sharing Email lists: to coordinate work, e.g. peer review in open source projects VNC: for distributed same time development ICQ: for real time communication and community building
The use of these communication tools in conjunction with sophisticated IDEs, as well as the rise of software ecosystems and the need for distributed development of large, ultra-scale projects, has led to what Paul Dourish refers to as a space-to-place transformation. It is no longer sufficient to store versions and configurations of software, and to communicate about the shared development; rather, the tools have become a place where developers meet, hang out, learn from each other and work together more effectively.
An “architecture of participation” that supports crowdsourcing as well as a many-to-many broadcast mechanism [14]. Their design supports and promotes collaboration, often as a side effect of individual activities, and furthermore democratizes who participates in activities that were previously in the control of just a few stakeholders. Social media has played a huge role in this space-to-place transformation. A few years ago it was perhaps pooh-poohed a bit as a trend that would go away and wouldn’t seriously have an impact on software engineering, but now there is broad realization and acceptance that social media is changing how people socialize, play, learn and work together. The encyclopedia and the newspaper are prime examples of the change it can bring. Social media is, however, rather difficult to define; the best way to define it is by a set of principles. It isn’t one form of medium (as perhaps the TV was), but an ecosystem of channels that support social networking. It also has the features of Web 2.0, which are: (see slide). What is interesting about social media is that many of the features adopted and co-opted by developers were not developed to support software engineers; rather, they were innovated for other communities (think “the facebook”).
TRIBAL ERA -> PRINT ERA -> DIGITAL ERA -> INFORMATION ERA So most people are now in agreement that social media has taken the world by storm... for researchers, it is now up to us to make sense of that impact! But where to start in that regard? Marshall McLuhan, a Canadian who died in 1980, had uncanny premonitions of what was ahead as far back as the 1960s. He wrote some interesting books on understanding the laws of media, and he described how humans have moved from a tribal era to a print era to a digital era (e.g. TV) to an information era. I suspect that if he were alive today, he might say we are now living in a social era.
But it is best to let Marshall McLuhan tell you himself about his work and insights.... (I’m not sure what he would have thought about YouTube by the way!)
I put these quotes on the slide because the audio is quite bad on the YouTube video (the original clips are rather old). “The medium is the message. This is merely to say that the personal and social consequences of any medium - that is, of any extension of ourselves - result from the new scale that is introduced into our affairs by each extension of ourselves, or by any new technology.” It isn’t just the content that matters, but rather how it is delivered. http://wiki.csisdmz.ul.ie/wiki/Marshall_McLuhan_-_The_Medium_is_the_Message For the frame quote, mention that early studies of TV showed no real difference from print media because they used methods suited to investigating text; different questions and methods were needed for studying TV. We are living right now, I believe, on this frontier of change, and although it is exhilarating it is also terrifying!
In addition to some thought-provoking quotes, McLuhan also gave us a tetrad as a way of thinking about media. Specifically, it poses a set of four high-level questions we should ask when trying to understand the impact of media in a new domain or community of practice, or to understand a new form of media. Examples: Car enhances speed; cell phone enhances use of voice. Car reverses into gridlock traffic; cell phone reverses into being a leash. Car retrieves the notion of knights in armour; cell phone retrieves the use of cameras. Car makes obsolete the horse and buggy; cell phone makes obsolete the telephone booth. Taken from: http://www.collectionscanada.gc.ca/innis-mcluhan/030003-2000-e.html
So far, I have set the stage by telling you what social media is about and how to go about understanding it using McLuhan's tetrad. Next I will give a review of how social media has been playing a role in software engineering. This will be followed by some reflections on its use as well as concrete suggestions for future studies that may be of interest to MSR researchers.
As I mentioned, social media isn't just one thing -- but rather it is a set of channels that are often used together in some combination. This graph shows a selection of channels -- and in particular the outside ring shows some of the channels my collaborators and I have investigated and thus have some insights on their use in software engineering. My goal now is to go through these channels and share with you some highlights of our findings from the studies of these channels. The center piece captures channels we have not explored, but other researchers have and I'll mention those references briefly.
Before giving some highlights of the studies we've done, I first share an overview of the general research methods we use. We sometimes use mining, but basically we start with the question or goal of the study, and from that choose whichever methods are most suitable. As I'm a pragmatist, I try to use whichever method will work. I won’t go into details as that would be too boring, but the details are in all our papers. It is important to note that we choose our questions first, and then the methods that suit those questions.
I'm going to start off by sharing with you a study I did which started off for me the entire line of looking at social media. It started by looking at source code comments, which at first glance may not seem like a form of communication or social media, but they are. We found that they support articulation work and that developers talk to each other through them (although in a rather passive manner).
Do they add value, or are they junk? Quite controversial: to comment or not! Interested to see if additional tool support could help.
Fermat’s tool support failed in the McLuhan print era. Marginalia is the general term for notes, scribbles, and editorial comments made in the margin of a book; strictly speaking in the margin, but used to refer to more than that these days. It has been done in a similar way for thousands of years, in fact. If you ask someone what system they use, they probably can’t tell you! The area is not that well understood, but it is under investigation; there is a book on the topic by Jackson.
Here is a glimpse of why I find this interesting… If you search through Google Code Search for keywords such as “XXX” or “hack” or “This is Buggy”, you find thousands of hits and interesting comments by developers. For example….
We get similar results if we search through Krugle (actually the search API is easier to use on the web for Krugle). This is so prevalent that modern tools now have explicit tool support for navigating to special predefined keywords
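To make the keyword search concrete, here is a minimal sketch of how such task comments might be located in a piece of source text. It is only an illustration, not the tooling used in the study: the keyword list (TODO and FIXME are added as common conventions alongside the "XXX" and "hack" examples above), the function name find_task_comments, and the restriction to `//` and `#` line comments are all assumptions.

```python
import re

# Illustrative keyword list (an assumption): "XXX" and "HACK" come from the
# notes above; TODO and FIXME are common conventions many IDEs recognize.
TASK_KEYWORDS = ("TODO", "FIXME", "XXX", "HACK")

# Matches a keyword inside a line comment that starts with // or #.
TASK_PATTERN = re.compile(
    r"(?://|#)\s*.*?\b(" + "|".join(TASK_KEYWORDS) + r")\b[:\s]*(.*)",
    re.IGNORECASE,
)


def find_task_comments(source: str):
    """Return (line_number, keyword, message) for each task comment found."""
    hits = []
    for number, line in enumerate(source.splitlines(), start=1):
        match = TASK_PATTERN.search(line)
        if match:
            hits.append((number, match.group(1).upper(), match.group(2).strip()))
    return hits


if __name__ == "__main__":
    sample = "int x = 0; // TODO: remove this hack\nreturn x;\n# XXX this is buggy\n"
    for hit in find_task_comments(sample):
        print(hit)
```

A line-based regular expression like this will miss block comments and can be fooled by comment markers inside string literals; a real tool would use the language's own comment syntax.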
Knowledge management has always been a challenge in software engineering….
At the same time as we were examining source code comments, social tagging had really taken off, especially with sites such as CiteULike and Flickr. With my background in formal ontology modeling, it was interesting to me that a social, bottom-up and emergent mechanism such as this was working so well. I also realized that tagging could be useful in software development for programmers trying to document locations in the code.
While on the one hand we have scientific communities begging for more consensus and more formal methods of annotation, we have at the same time an increase in social computing where lightweight, informal tools are being used to annotate data, and in very successful ways (that shouldn’t be ignored by even the formally driven communities). How can these lightweight social mechanisms be leveraged by a community, or even just a team of developers? There is an altruistic aspect to this that was also intriguing: people don't tag for their own benefit.
So we set out to explore this approach by co-opting source code comments and their TODO annotations (something developers did already) but changed the mechanism to allow a developer to tag within the source code comments, we added tool support for managing and sharing the tags, and also added the ability to tag outside the code so that tags wouldn't litter the code unless the developer wished for them to be there. We also allowed the developer to tag any file in their repository, such as manifest files, documentation and even breakpoints in their debugger. We added this feature to Eclipse, called TagSEA, and in several studies (one over the course of two years), studied how developers used this feature. We found that the feature was useful to most developers and they used the tags for documenting concerns and for finding and refinding, and for documentation. We also introduced the phrase waypoints rather than bookmarks because although what we were doing was social bookmarking, really it was much more than that as we also stored other metadata which is not that common in social bookmarking systems such as when the tagging occurred and who did it. Finally, these waypoints could be gathered into Tours.
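The waypoint idea described above (a tag in a comment, plus metadata about who tagged and when, gathered into tours) can be sketched as a small data structure. This is a simplified illustration and not TagSEA's implementation: the `// @tag name author=... date=... : message` comment syntax, the Waypoint fields, and the notion of a tour as "all waypoints sharing a tag" are assumptions made for the example.

```python
import re
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical comment syntax for this sketch (not necessarily TagSEA's):
#   // @tag <tag-name> [author=<name>] [date=<YYYY-MM-DD>] : <message>
WAYPOINT_PATTERN = re.compile(
    r"//\s*@tag\s+(?P<tag>[\w.\-]+)"
    r"(?:\s+author=(?P<author>\S+))?"
    r"(?:\s+date=(?P<date>\d{4}-\d{2}-\d{2}))?"
    r"\s*:\s*(?P<message>.*)"
)


@dataclass(frozen=True)
class Waypoint:
    file: str
    line: int
    tag: str
    author: Optional[str]  # who tagged, if recorded
    date: Optional[str]    # when the tagging occurred, if recorded
    message: str


def extract_waypoints(file: str, source: str) -> List[Waypoint]:
    """Collect every tagged comment in one file as a Waypoint."""
    waypoints = []
    for number, line in enumerate(source.splitlines(), start=1):
        m = WAYPOINT_PATTERN.search(line)
        if m:
            waypoints.append(
                Waypoint(file, number, m["tag"], m["author"], m["date"],
                         m["message"].strip())
            )
    return waypoints


def build_tour(waypoints: List[Waypoint], tag: str) -> List[Waypoint]:
    """Simplified 'tour': the waypoints sharing a tag, in file order."""
    return [w for w in waypoints if w.tag == tag]
```

Keeping author and date on each waypoint is what distinguishes this from a plain bookmark list, which is the point made in the notes above.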
Allowed evaluation in the large. Although IBM was instrumental in allowing us to study the adoption (or non-adoption) of source code tagging in Eclipse, a big challenge for us as researchers was to study large-scale adoption, something which is of course fundamental to social media use! We were working with IBM on their Jazz tool, which is a collaborative IDE based on Eclipse. Jazz has collaboration built in from the ground up as a primary concern, and it has built-in support for managing work items (or bugs, as they are called in Bugzilla). Managing and navigating work items was problematic for them, and inspired by our tool and experiments with TagSEA, they added tagging to their work item feature. This gave us the chance to study how tags were used from the moment the feature was added, and to study the adoption and use of tags over time. The feature was quickly adopted, and we found that they used it for these items....
“Polish”: spelled like the country, but not meant that way. Cross-cutting concerns, as well as milestones in the development.
As I mentioned earlier we do these studies to inform tool design, perhaps with our collaborators IBM, or we develop prototypes ourselves and evaluate them. One such tool we developed, actually during the study, as a mechanism to solicit data from our informants during our interviews was to visualize how tags were used over the timeline of the project. Here we see again concerns coming and going such as when to polish or features being tagged such as svt. Milestone tags also appear and disappear with the relevant milestones. (could highlight these things in the figure).
Another tool we developed was a workitemexplorer -- we developed this following a different study of how dashboards and feeds are used by the Jazz community. Christoph Treude will present this on Friday at 10:45 at ICSE.
The next social media feature we looked at was microblogging.
This was for me somewhat inspired by how effective microblogging and blogging were in the courses I was teaching at UVic. It can be hard to look out at a room and see everyone on a computer, so rather than try to fight that, I engaged my students to tag and blog about the course content during class for course credit. This was extremely effective as a teaching tool, and students afterwards said that if they hadn't been doing that, they would have been on Facebook. Another advantage for me as a teacher was that when I was away and missed a class, I could virtually be there by watching Twitter. Indeed, students kept tweeting after class, and so I felt the course really had a heartbeat after the classes. The same thing happens at conferences, of course.
So, we conducted a small study to see how software engineers use Twitter. We analyzed a sample of tweets from three different software engineering communities: Eclipse, Linux and MXUnit. We analyzed about 600 tweets out of about 12,000.
e.g. Yammer is used by many companies already .... tweets can be kept private to a group or company.
http://www.slideshare.net/dennispagano/how-do-developers-blog-an-exploratory-study Pagano and Maalej looked at both blogs and commit messages, and found a relationship between blogging and committing behaviour: 42% discussed functional requirements and domain concepts, 38% discussed community news, 30% discussed APIs and project documentation. Source code is seldom discussed; higher-level concepts are. Most blog posts occur after corrective commits. 15% of blogs contain information already discussed in commit messages. The dependency between the two decreases over time. Bug fixes are frequently shared. By analyzing the Google results for API calls of the jQuery API, Treude and Parnin found that 87.9% of the API methods were covered by blogs, mainly featuring tutorials and personal experiences about those API methods.
Now we are coming to the last two channels that I will discuss in some detail and provide some glimpses into.
Paper on this at MSR yesterday morning. Looked at reward mechanisms and moderation systems.
Here we are seeing gamification at work -- which changes not just the rules of the game, but also the game itself. The site uses gamification concepts — “the use of game design elements in non-game contexts” [9] — to encourage and reward community participation. For example, users receive points for posting questions and providing answers, and win “badges” for specific services or contributions to the community.
I became interested in Stack Overflow after asking the students in a 4th-year software evolution course how they learn new knowledge when they join a team. The answer I received was that they were told in their co-op jobs not to ask questions of their colleagues unless they had checked Stack Overflow first!
It's a race to answer the easy to answer questions (developers want to answer to gain those reputation points, the gamification part of the site).
We pose the following five research questions: What kinds of questions are asked on Q&A websites for programmers? Which questions are answered and which ones remain unanswered? Who answers questions and why? How are the best answers selected? How does a Q&A website contribute to the body of software development knowledge? For the NIER paper, we focused on the first two questions. We created a script to extract questions along with all answers, tags and owners using the Stack Overflow API . We then analyzed quantitative properties of questions, answers and tags, and we applied qualitative codes to a sample of tags and questions. Our preliminary findings indicate that Stack Overflow is particularly effective at code reviews, for conceptual questions and for novices. The most common questions include how-to questions and questions about unexpected behaviors.
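As a rough illustration of the quantitative side of such an analysis, the sketch below computes the share of answered questions and the median time to a first answer from already-extracted records. It is not the script used for the NIER paper and makes no calls to the Stack Overflow API; the record layout (the field names asked_at and answer_times, with times in minutes) is a hypothetical one chosen for the example.

```python
from statistics import median


def answer_stats(questions):
    """Return (answered_fraction, median_minutes_to_first_answer).

    `questions` is an iterable of dicts with two hypothetical fields:
      asked_at      - time the question was posted, in minutes
      answer_times  - list of times (minutes) at which answers were posted
    """
    delays = []
    total = 0
    for question in questions:
        total += 1
        if question["answer_times"]:
            # Time to the earliest answer for this question.
            delays.append(min(question["answer_times"]) - question["asked_at"])
    answered_fraction = len(delays) / total if total else 0.0
    median_delay = median(delays) if delays else None
    return answered_fraction, median_delay


if __name__ == "__main__":
    sample = [
        {"asked_at": 0, "answer_times": [5, 30]},
        {"asked_at": 10, "answer_times": [21]},
        {"asked_at": 0, "answer_times": []},
        {"asked_at": 100, "answer_times": [103]},
    ]
    print(answer_stats(sample))
```

The median is used rather than the mean because a few questions that wait days for an answer would otherwise dominate the figure.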
Chris Parnin led this research in collaboration with others in my group.
By yesterday, the blog post had been read 13,500 times in just a few days, and it was retweeted by Jeff Atwood from Stack Overflow and appeared on reddit. This is the kind of research that provokes a reaction, so I think we are striking a chord there. GWT is smaller here because of its smaller user base.
In terms of automatically generating documentation, Chris has some concrete ideas on his blog for this paper. This is a tool, also developed by our research group, to visualize the coverage and saturation of Stack Overflow documentation for the different APIs. It shows a treemap visualization of the packages and classes in the API, and colors them relative to the amount of documentation that can be found on Stack Overflow. This visualization can give an idea of popularity, since classes that are documented frequently are also used frequently. More importantly, perhaps, the visualization can help show gaps in the crowd documentation, e.g. we noted that accessibility and Digital Rights Management have little documentation, and yet they are very important concerns.
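The idea behind the coverage view can be sketched in a few lines: count how many posts mention each API class, then report the covered fraction and the gaps. This is a rough sketch under stated assumptions, not the group's tool: matching is by exact class-name token, and the function names, class list, and post texts below are hypothetical placeholders.

```python
import re

def count_mentions(class_names, posts):
    """Return {class_name: number of posts mentioning it as a whole word}."""
    counts = {name: 0 for name in class_names}
    for text in posts:
        # Identifier-like tokens in the post; exact token match is an assumption.
        words = set(re.findall(r"[A-Za-z_]\w*", text))
        for name in class_names:
            if name in words:
                counts[name] += 1
    return counts

def coverage(counts):
    """Fraction of classes mentioned in at least one post."""
    if not counts:
        return 0.0
    covered = sum(1 for n in counts.values() if n > 0)
    return covered / len(counts)

# Placeholder class names and posts, for illustration only.
classes = ["Activity", "Intent", "AccessibilityService", "DrmManagerClient"]
posts = [
    "How do I start an Activity with an Intent?",
    "Activity lifecycle question",
    "Intent extras are null",
]

counts = count_mentions(classes, posts)
gaps = [name for name, n in counts.items() if n == 0]
print(coverage(counts))
print(gaps)
```

In a treemap, the per-class counts would drive the cell color, and the `gaps` list corresponds to the uncolored cells.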
Post today that Jazz.net has adopted a Q&A format over the forum in their community portal...
Menti
Also a paper on GitHub yesterday at MSR -- broad areas of data.
Elements of a Masterbranch profile: (a) the profile itself; (b) programming skills; (c) details for a project. What is interesting about these sites is that they are about the developer, as opposed to websites such as Ohloh, which are about the projects. Note 13,500 users on Masterbranch. 15,000 on Coderwall. We conducted a study with users of Coderwall and Masterbranch -- two such reputation sites that manage skills and experiences of programmers by connecting automatically to Stack Overflow. Research questions: why do different actors (developers, recruiters, companies) partake in these sites, how do they interact with one another, what is the impact of that participation on them, and what are its risks and challenges. We conducted a questionnaire and received 83 responses, 74 of them from Coderwall users. Most respondents were software engineers (68); 14 were team leaders. We also interviewed 26 people: 12 recruiters and 14 software developers.
Mention MSR paper on Stack Overflow that looked at rewards yesterday.
There has been other research, but of course I don't have time to go into all those details in one talk.
Ward Cunningham, a developer and software engineer, designed wikis. Wiki use in software engineering is quite mature. We did a study in our group on community portals.
If McLuhan were alive today, I think he would have called our current era the "social era". Marshall McLuhan on media ecology: http://en.wikipedia.org/wiki/Media_ecology
Brian Kernighan and the original C/Unix programmers are perhaps now more accessible rockstars through social media. Viral spread of knowledge -- it used to be physicists, but now many end-user experts are learning how to program and mash up solutions using a variety of services that are easy to use and combine. Note: portfolios are public. We used to have to use these before we had degrees in computer science.
Geek culture: increased gamification, but also quite an arrogant tone in some of these sites. Small team of developers, the internet was down for 1.5 days just before the end of a sprint... really nice story about how they had everything in house, complete with their repositories, test server and so on. And yet, reliance on search: Brendan Cleary's story of what happened to him.... I was managing a small development team of about 4 developers. We were working across the entire stack of a website that would eventually support hundreds of thousands of users: database, business layer, frontend and backend UI. About 3/4 of the way through a sprint, with launch day looming, our internet connection went down, and stayed down for 1.5 days. As PM I thought I had planned for this. I thought we would be OK: all of our code was hosted locally, I had resisted the temptation to put our bug tracking systems into the cloud, and we had local test servers where we could deploy test builds. Yes, email would be down, so I would have to be on the phone with the clients a little more than usual that day, but still not the end of the world, right? Wrong. What I hadn't realized, and what I couldn't have realized until our means to communicate was taken away, was that we weren't actually a team of 5 but rather a team of 6. That 6th team member didn't have a desk and didn't get a computer. In fact we treated them pretty badly, we never paid them, but still they played a pivotal role in allowing us to write code. Our 6th team member was Google, and what I realized when we were no longer able to talk to them was just how dependent we as developers had become on having on-demand access to the wealth of software debugging knowledge stored on the web through Google. In a few hours our productivity dropped drastically. We probably lost about 4 to 5 developer days because we had to manually debug issues that would have been solved in 5 minutes with a Google query.
I later tried to analyze why we were affected in this way. These were all very good programmers, they knew their stuff, and it wasn't as if they couldn't write code without referring to examples. But really it highlighted the changing nature of software development practice. It made me realize just how many 3rd party libraries, platforms, languages etc. were required to implement the solution we were developing. No one person could be an expert in the entire stack anymore; there were just too many components, each with its own little kinks and bugs that only show up when you try to get it to work with another random component. We had become dependent on Google to allow us to implement the kind of solution we needed to implement, and the scary thing was we really didn't have a choice.
Reputation study showed us that some recruiters at least are valuing broader expertise and ability to learn / network, as opposed to deep knowledge.
As a summary....
Is it really the social skills that will matter? By social skills I don’t mean that they are good at talking to people, but they are well connected in the networks and know how to find the right help and how to share?
Need to watch for what is lost or reversed through this paradigm shift! See also http://www.academiccommons.org/commons/essay/knowledgable-knowledge-able “ There is something in the air, and it is nothing less than the digital artifacts of over one billion people and computers networked together collectively producing over 2,000 gigabytes of new information per second. While most of our classrooms were built under the assumption that information is scarce and hard to find, nearly the entire body of human knowledge now flows through and around these rooms in one form or another, ready to be accessed by laptops, cellphones, and iPods. Classrooms built to re-enforce the top-down authoritative knowledge of the teacher are now enveloped by a cloud of ubiquitous digital information where knowledge is made, not found, and authority is continuously negotiated through discussion and participation.” by Michael Wesch
This is something we need to pay attention to as we are seeing a trend towards programmers gaining experience in open source communities as well as building their portfolios and reputations through these social mechanisms. http://blog.20sb.net/2011/10/changing-the-ratio-on-wikipedia.html http://blog.wikimedia.org/c/community/gender-gap-community/ http://www.nytimes.com/2011/01/31/business/media/31link.html
Gotta be having a negative effect.
Turning from process towards product now.... Lobbying risk with requirements? Important but infrequently used APIs may be missed (e.g. accessibility, or perhaps sometimes security concerns)
Brittle architectures because of goldfish effect. Viral bugs may also see viral fixes of course....
The medium is the message. It leads to new frameworks, so we need different methods.