This document proposes an approach for studying the impact of collaboration on software systems by mining development repositories. The approach has three parts:
I. Extracting communication data such as source code comments, emails, and issue discussions from version control systems, mailing lists, and issue tracking systems.
II. Studying the impact of collaboration on software quality by computing social metrics from the extracted communication data and measuring their relationship to post-release defects.
III. Studying the impact of collaboration on the development community by analyzing how code contributions are managed, including the feedback and reviews they receive, to understand how communication affects contributors, reviewers, and the software itself.
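
As an illustration of part II, the sketch below computes two simple social metrics per file and measures their rank correlation with post-release defect counts. Everything in it is an assumption made for illustration: the metric choices (message count and number of distinct participants), the record layout, the file and participant names, and the toy data are hypothetical and are not taken from any studied project. Spearman's rank correlation is used here as one plausible measure of the relationship; the approach itself does not prescribe a specific statistic.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical toy input: one (file, participant) pair per message in a
# discussion linked to that file. Real data would come from the extraction
# step (part I); these values are invented for illustration only.
messages = [
    ("parser.c", "alice"), ("parser.c", "bob"), ("parser.c", "carol"), ("parser.c", "alice"),
    ("net.c", "alice"), ("net.c", "bob"),
    ("ui.c", "dave"),
    ("db.c", "alice"), ("db.c", "bob"), ("db.c", "carol"), ("db.c", "dave"), ("db.c", "erin"),
]

# Hypothetical post-release defect counts per file (also invented).
post_release_defects = {"parser.c": 3, "net.c": 1, "ui.c": 0, "db.c": 5}


def social_metrics(records):
    """Return per-file message count and number of distinct participants."""
    participants = defaultdict(set)
    counts = defaultdict(int)
    for file_name, author in records:
        participants[file_name].add(author)
        counts[file_name] += 1
    return {
        f: {"messages": counts[f], "participants": len(participants[f])}
        for f in counts
    }


def ranks(values):
    """Assign 1-based ranks, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation; raises ValueError if either input is constant."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    if sx == 0 or sy == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman's rho: the Pearson correlation of the rank-transformed data."""
    return pearson(ranks(x), ranks(y))


metrics = social_metrics(messages)
files = sorted(metrics)
participant_counts = [metrics[f]["participants"] for f in files]
defect_counts = [post_release_defects[f] for f in files]
rho = spearman(participant_counts, defect_counts)
print(f"Spearman rho (participants vs. defects): {rho:.2f}")
```

A rank-based measure is chosen in this sketch because defect counts are typically skewed, so a monotonic relationship is easier to defend than a linear one. A correlation of this kind indicates association only; it does not by itself show that collaboration causes or prevents defects.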