The document provides guidance on conducting empirical research for a PhD study. It discusses defining research questions, choosing appropriate research methods like interviews and surveys, collecting and analyzing data, and iterating on the research design. The speaker emphasizes starting with exploratory qualitative methods like interviews to understand phenomena before employing explanatory quantitative methods such as surveys to evaluate solutions. Examples are provided for combining methods through a multi-phase study.
How Can Software Engineering Support AI? - Walid Maalej
Flipping the Coin: How can Software & Requirements Engineering Support AI?
During the last decade, the Software Engineering and Requirements Engineering communities have profited greatly from advances in Machine Learning and Natural Language Processing. Recommender systems, prediction models, and even bots are nowadays available to support many software and requirements engineering tasks, including quality assurance, documentation, and even code generation and completion.
This talk will focus on the opposite direction. I will discuss recent challenges faced by the Machine Learning / NLP / Data Science community and whether and how traditional as well as modern Software and Requirements Engineering can help solve some of them, in order to increase the applicability, acceptance, and reliability of Machine-Learning-based systems.
Walid Maalej is a professor of informatics and chair of applied software technology at the University of Hamburg, Germany. Currently he is also the Head of the Informatics Department and a member of the Board of Directors of the tech transfer institute HITeC e.V. His main research interests include human- and data-centered software engineering, requirements engineering, feedback systems, applied machine learning, as well as tech transfer.
How Does a Typical Tutorial for Mobile Development look like? - A research paper presented at the 2014 International Conference on Mining Software Repositories. Paper preprint available here: http://mobis.informatik.uni-hamburg.de/research/publications
How Do Users Like This Feature? A Fine Grained Sentiment Analysis of App Revi... - Walid Maalej
App stores allow users to submit feedback for downloaded apps in the form of star ratings and text reviews. Recent studies analyzed this feedback and found that it includes information useful for app developers, such as user requirements, ideas for improvements, user sentiments about specific features, and descriptions of experiences with these features. However, for many apps, the amount of reviews is too large to be processed manually, and their quality varies widely. Star ratings are given to the app as a whole, and developers have no means to analyze the feedback for single features. In this paper we propose an automated approach that helps developers filter, aggregate, and analyze user reviews. We use natural language processing techniques to identify fine-grained app features in the reviews. We then extract the user sentiments about the identified features and give them a general score across all reviews. Finally, we use topic modeling techniques to group fine-grained features into more meaningful high-level features. We evaluated our approach with 7 apps from the Apple App Store and Google Play Store and compared its results with a manual, peer-conducted analysis of the reviews. On average, our approach has a precision of 0.59 and a recall of 0.51. The extracted features were coherent and relevant to requirements evolution tasks. Our approach can help app developers systematically analyze user opinions about single features and filter irrelevant reviews.
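The pipeline sketched in this abstract (identify features in reviews, score sentiment per feature, aggregate across reviews) can be illustrated roughly as follows. This is a toy sketch, not the authors' implementation: the lexicon, the reviews, and the pairing of each review with a single pre-extracted feature are all invented assumptions.

```python
import re
from collections import defaultdict

# Tiny illustrative sentiment lexicon (an assumption; the paper uses NLP
# and topic modeling techniques, not this toy word list).
POSITIVE = {"great", "love", "useful", "fast"}
NEGATIVE = {"crashes", "slow", "broken", "hate"}

# Hypothetical (text, feature) pairs; in the paper, fine-grained features
# are extracted automatically from the review text itself.
reviews = [
    ("I love the offline mode, it is great", "offline mode"),
    ("Sync is slow and often crashes", "sync"),
    ("Sync is useful but slow", "sync"),
]

def sentiment(text):
    """Score a review: +1 per positive word, -1 per negative word."""
    words = re.findall(r"[a-z]+", text.lower())
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)

# Aggregate a general sentiment score per feature across all reviews.
scores = defaultdict(list)
for text, feature in reviews:
    scores[feature].append(sentiment(text))

for feature, vals in scores.items():
    print(feature, sum(vals) / len(vals))
```

Running this prints an average score per feature, mimicking at a very small scale the per-feature aggregation the approach performs over thousands of reviews.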
The method of exploratory testing has gained significant attention in industry and research in recent years. However, as with many “buzzword” technologies, the introduction and application of exploratory testing is not straightforward. Exploratory testing is not only black or white - scripted or exploratory - but also all shades of grey in between. Within the EASE industrial excellence center, we ran an industrial workshop on exploratory testing that helps provide an understanding of how to choose feasible levels of exploration. We will present the concept of levels of exploration in exploratory testing, the outcomes of the workshop, and relevant empirical research findings on exploratory testing.
Empirical Methods in Software Engineering - an Overview - alessio_ferrari
A first introductory lecture on empirical methods in software engineering. It includes:
1) Motivation for empirical software engineering studies
2) How to define research questions
3) Measures and data collection methods
4) Formulating theories in software engineering
5) Software engineering research strategies
Find the videos at: https://www.youtube.com/playlist?list=PLSKM4VZcJjV-P3fFJYMu2OhlTjEr9Bjl0
What does it take to have high impact in software engineering research? Andreas Zeller, a "high impact" SE researcher, shares his personal story and perspective.
Presentation of the IEEE TSE Journal-First paper at ICSE 2020
Abstract:
Developer satisfaction and work productivity are important considerations for software companies. Enhanced developer satisfaction may improve the attraction, retention and health of employees, while higher productivity should reduce costs and increase customer satisfaction through faster software improvements. Many researchers and companies assume that perceived productivity and job satisfaction are related and may be used as proxies for one another, but these claims are a current topic of debate. There are also many social and technical factors that may impact satisfaction and productivity, but which factors have the most impact is not clear, especially for specific development contexts. Through our research, we developed a theory articulating a bidirectional relationship between software developer job satisfaction and perceived productivity, and identified what additional social and technical factors, challenges and work context variables influence this relationship. The constructs and relationships in our theory were derived in part from related literature in software engineering and knowledge work, and we validated and extended these concepts through a rigorously designed survey instrument. We instantiate our theory with a large software company, which suggests a number of propositions about the relative impact of various factors and challenges on developer satisfaction and perceived productivity. Our survey instrument and analysis approach can be applied to other development settings, while our findings lead to concrete recommendations for practitioners and researchers.
Authors:
Margaret-Anne Storey, Tom Zimmermann, Chris Bird, Jacek Czerwonka, Brendan Murphy and Eirini Kalliamvakou
It Does What You Say, Not What You Mean: Lessons From A Decade of Program Repair - Claire Le Goues
In this talk we present lessons learned, good ideas, and thoughts on the future, with an eye toward informing junior researchers about the realities and opportunities of a long-running project. We highlight some notions from the original paper that stood the test of time, some that were not as prescient, and some that became more relevant as industrial practice advanced. We place the work in context, highlighting perceptions from software engineering and evolutionary computing, then and now, of how program repair could possibly work. We discuss the importance of measurable benchmarks and reproducible research in bringing scientists together and advancing the area. We give our thoughts on the role of quality requirements and properties in program repair. From testing to metrics to scalability to human factors to technology transfer, software repair touches many aspects of software engineering, and we hope a behind-the-scenes exploration of some of our struggles and successes may benefit researchers pursuing new projects.
Industry-Academia Communication In Empirical Software Engineering - Per Runeson
Researchers in software engineering must communicate with industry practitioners, both engineers and managers. Communication may be about collaboration buy-in, problem identification, empirical data collection, solution design, evaluation, and reporting. To gain mutual benefit from the collaboration, ensuring relevant research and improved industry practice, researchers and practitioners must be good at communicating. The basis for a researcher to be good at industry-academia communication is firstly to be “bi-lingual”: understanding and being able to translate between these “languages” is essential. Secondly, it is about being “bi-cultural”: understanding the incentives in industry and academia, respectively, is the basis for finding a balance between, e.g., rigor and relevance in the research. Time frames are another aspect that differs between the two cultures. Thirdly, the choice of communication channels is key to reaching the intended audience. A wide range of channels exists, from face-to-face meetings, via tweets and blogs, to academic journal papers and theses, each with its own audience and purposes. The keynote speech will explore the challenges of industry-academia communication, based on two decades of collaboration experiences, both successes and failures. It aims to support primarily the academic side of the communication, to help achieve industry impact through rigorous and relevant empirical software engineering research.
Theory Building in RE - The NaPiRE Initiative - Daniel Mendez
Talk I gave on the "Naming the Pain in Requirements Engineering" initiative (www.re-survey.org) at the Seminar on Forty Years of Requirements Engineering – Looking Forward and Looking Back (RE@40) in Kappel am Albis, Switzerland
Lecture on case study design and reporting in empirical software engineering. The lecture touches on the topics of units of analysis, data collection, data analysis, validity procedures, and collaboration with industry.
Why is TDD so hard for Data Engineering and Analytics Projects? - Phil Watt
This slide show describes the difficulties of implementing Test-Driven Development (TDD) in the context of analytics and data engineering, in both development and maintenance phases. It assumes that the objective of TDD is to reduce cycle time, improve developer productivity, and improve production quality. It identifies 7 challenges from the analytics literature and a further 10 from interviews (n=14) and survey respondents (n=20) selected from analytics leaders. A key emerging theme is that many of the challenges can be addressed through education and coaching, notably around data literacy for key stakeholders and executives.
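To make concrete what TDD can look like in a data context, here is a minimal, hypothetical sketch (not taken from the slides or the underlying thesis): the expected behaviour of a data-cleaning step is pinned down in a test that is written before, or alongside, the implementation.

```python
# Hypothetical test-first data engineering example: the cleaning rule
# (drop records with missing or impossible ages) is specified by a test.

def clean_ages(rows):
    """The unit under test: keep only records with a plausible age."""
    return [r for r in rows if r.get("age") is not None and 0 <= r["age"] <= 120]

def test_clean_ages():
    raw = [
        {"id": 1, "age": 34},
        {"id": 2, "age": None},   # missing value should be dropped
        {"id": 3, "age": 999},    # out-of-range value should be dropped
    ]
    cleaned = clean_ages(raw)
    assert [r["id"] for r in cleaned] == [1]

test_clean_ages()
print("test passed")
```

In practice such tests would run in a framework like pytest against representative data samples; the point is only that the expected data quality is made executable before the pipeline code exists.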
Why is Test Driven Development for Analytics or Data Projects so Hard? - Phil Watt
Preview of research results for my Master's thesis on Test-Driven Development in Analytics. Prepared for my Term 4 assignment, oral thesis presentation
Controlled experiments, Hypothesis Testing, Test Selection, Threats to Validity - alessio_ferrari
Complete lecture on controlled experiments in software engineering. It presents practical guidelines for conducting controlled experiments and describes the concepts of dependent, independent, and control variables, significance, and p-value. It also explains how to select the appropriate statistical test for a hypothesis, and gives examples of data for different typical tests.
Finally, it discusses threats to validity in controlled experiments and provides guidance on reporting.
Find the video lectures here: https://www.youtube.com/playlist?list=PLSKM4VZcJjV-P3fFJYMu2OhlTjEr9Bjl0
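As a concrete companion to the lecture's topics, the following sketch compares two groups of measurements with a nonparametric permutation test. The data and treatment scenario are invented for illustration; the lecture itself also covers parametric tests such as the t-test.

```python
import random

# Invented measurements: task completion times (minutes) for the two
# groups of a hypothetical controlled experiment (treatment = new tool).
control   = [30, 28, 35, 32, 31, 29, 33]
treatment = [24, 26, 23, 27, 25, 28, 22]

# Observed difference of means (the test statistic).
observed = sum(control) / len(control) - sum(treatment) / len(treatment)

# Permutation test: under the null hypothesis the group labels are
# exchangeable, so we shuffle them and count how often a difference at
# least as large as the observed one arises by chance.
random.seed(0)
pooled = control + treatment
count = 0
trials = 10_000
for _ in range(trials):
    random.shuffle(pooled)
    a, b = pooled[:len(control)], pooled[len(control):]
    if sum(a) / len(a) - sum(b) / len(b) >= observed:
        count += 1

p_value = count / trials
print(f"observed difference: {observed:.2f} min, p ~ {p_value:.4f}")
```

A small p-value (below the chosen significance level, commonly 0.05) would lead to rejecting the null hypothesis that the tool makes no difference; the lecture's discussion of threats to validity applies before drawing any causal conclusion.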
This presentation is from a lecture I gave within the "Software systems and services" immigration course at the Gran Sasso Science Institute, L'Aquila (Italy): http://cs.gssi.it/.
http://www.ivanomalavolta.com
Conventional software engineering processes are rather transactional and lack a common theory for the involvement of users and their communities. Users are regarded as pure consumers, who are, at most, able to report issues. In the age of easy knowledge access and social media, discounting the users of software might threaten its success. Potentially valuable experiences and volunteered resources get lost. Frustrated users might even meet in social communities to argue against the software and harm its reputation.
The goal of this research is to revolutionize the role of users, dissolving the boundaries to software engineers. We propose a novel framework for increasing software socialness, defined as the degree of user and community involvement in the software lifecycle. Our framework consists of a benchmark, a process, and a reference architecture. The benchmark includes metrics for assessing and monitoring software socialness. The process enables engineering teams to systematically gather and exploit user feedback in the software lifecycle. The context-aware reference architecture integrates social media into software systems and the engineering infrastructure. It observes users’ interactions while they use the software and proactively collects in situ feedback.
User Involvement in Software Evolution Practice: A Case Study - Dennis Pagano
User involvement in software engineering has been researched over the last three decades. However, existing studies concentrate mainly on early phases of user-centered design projects, while little is known about how professionals work with post-deployment end-user feedback. In this paper we report on an empirical case study that explores the current practice of user involvement during software evolution.
We found that user feedback contains important information for developers, helps to improve software quality and to identify missing features. In order to assess its relevance and potential impact, developers need to analyze the gathered feedback, which is mostly accomplished manually and consequently requires high effort. Overall, our results show the need for tool support to consolidate, structure, analyze, and track user feedback, particularly when feedback volume is high. Our findings call for a hypothesis-driven analysis of user feedback to establish the foundations for future user feedback tools.
Business Rules In Practice - An Empirical Study (IEEE RE'14 Paper) - Walid Maalej
Business rules represent constraints in a domain, which need to be taken into account either during the development or the usage of a system. Motivated by the knowledge reuse potential of developing systems within the same domain, we studied business rules in a large software company. We interviewed 11 experienced practitioners on how they understand, capture, and use business rules. We also studied the role of business rules in requirements engineering in the host organization. We found that practitioners have a very broad perception of this term, ranging from flows of business processes to directives for calling external system interfaces. We identified 27 types of rules, which are typically captured as free text in requirements documents and other project documentation. Practitioners stated the need to capture this tacit form of domain knowledge and to trace it to other artifacts, as it impacts all activities in a software engineering project. We distill our results into 17 findings and discuss the implications for researchers and practitioners.
Assisting Engineers in Switching Artifacts by using Task Semantic and Interac... - Walid Maalej
Recent empirical studies show that software engineers use 5 tools and 14 artifacts on average for a single task. As development work is frequently interrupted and several simultaneous tasks are performed in parallel, engineers need to switch many times between these tools and artifacts. A lot of time is wasted in repeatedly locating, reopening, or selecting the right artifacts needed next. To address this problem we introduce Switch!, a context-aware artifact recommendation and switching tool. Switch! assists engineers in switching artifacts based on the type of the development task and the interaction history.
Context-aware software engineering and maintenance: the FastFix approach - Walid Maalej
Context consists of all events which can be observed or interpreted. In knowledge work it includes the actions of the user, the reaction of the applications, and the artifacts concerned. In this talk, we introduce the FastFix approach to context-awareness in software engineering and maintenance. We show how context enables remote software maintenance, as well as a systematic involvement of end users in software evolution. We also discuss other applications of context including personal productivity management and knowledge sharing amongst developers. The main research challenges include the modeling, sensing, sessionization, aggregation, and comparison of context, as well as the protection of the user's privacy.
Agile2014 Report: As a Speaker and a Reporter of the latest Agile in the world - Rakuten Group, Inc.
This is a flash report of Agile2014 by Hiroyuki Ito.
This is a preliminary report on participating in Agile2014.
Agile2014
http://agile2014.agilealliance.org/
Please feel and enjoy atmosphere of the latest Agile :)
3 Steps to Create a Habit of User Research on Your Product Team - validately
Webinar slides for Sarah Doody's MasterClass on creating a habit of research on your product team.
Video presentation at:
https://youtu.be/EKjWOvLb8G8
In this free masterclass you'll learn:
- The 3 types of research you should be doing each quarter to gather critical insights to form your product decisions.
- How to build what people want and avoid the expensive "re-work" that often happens after you launch.
- How to tailor your research to your company's timelines and budgets.
- How to empower other members of your team to do more research.
- Free Trello Board: Copy Sarah's "Quarterly Research Toolkit" Trello Board to plan your team's research.
Sarah Doody is a user experience designer, consultant, and writer. She is based in New York, NY and works with clients worldwide.
Presented by Jess Orr
We will cover topics including:
A3 Thinking: A Quick Refresher
When to Use an A3 vs. Other Tools
How to Engage Others in the Process
Change Management 101
The Hardest Part: Sustaining the Gains
Hosted by KaiNexus
About the Presenter:
Jess Orr
Jess is a continuous improvement thinker and practitioner with 10+ years of experience in a variety of industries, including automotive at Toyota. She holds a bachelor's degree in mechanical engineering from Virginia Tech and two Six Sigma Black Belt certifications.
In her current role, Jess applies her passion for people and processes to empower her fellow employees to make impactful and sustainable improvements. You can connect with her on LinkedIn. Her website and blog can be found at www.yokotenlearning.com.
EffectiveUI's Ari Weissman (Lead Experience Architect) and Lys Maitland (Senior Experience Planner) spoke at Denver Startup Week 2016. Discussion description:
Test early, test often.
It’s a mantra that’s been proven successful time and again when it comes to innovation and design. So why aren’t you doing it? In the start-up world, when everything is moving so quickly, it can be easy to overlook or postpone collecting feedback from real people because of cost, time, or lack of preparation. Don’t let those things stop you. Valid data can be captured cheaply, quickly, and with half-finished products and strategies.
This talk will cover:
What is user testing and why is it important
How to plan for user testing
What are ways to make testing cheaper
What are ways to make testing quicker
How to test with different fidelities of concept and design
How to collect data more frequently
Opportunities for getting the whole team engaged
What to do with the insights/outcomes of research
Work descriptions are informal notes taken by developers to summarize work achieved in a particular session. Existing studies indicate that maintaining them is a distracting task, which costs a developer more than 30 min. a day. The goal of this research is to analyze the purposes of work descriptions, and find out if automated tools can assist developers in efficiently creating them. For this, we mine a large dataset of heterogeneous work descriptions from open source and commercial projects. We analyze the semantics of these documents and identify common information entities and granularity levels. Information on performed actions, concerned artifacts, references and new work, shows the work management purpose of work descriptions. Information on problems, rationale and experience shows their knowledge sharing purpose. We discuss how work description information, in particular information used for work management, can be generated by observing developers' interactions. Our findings have many implications for next generation software engineering tools.
Paper: Walid Maalej and Hans-Jörg Happel, Can Development Work Describe Itself? In Proceedings of the 7th IEEE Conference on Mining Software Repositories, IEEE CS, 2010.
Exploiting Artificial Intelligence for Empowering Researchers and Faculty, In...Dr. Vinod Kumar Kanvaria
Exploiting Artificial Intelligence for Empowering Researchers and Faculty,
International FDP on Fundamentals of Research in Social Sciences
at Integral University, Lucknow, 06.06.2024
By Dr. Vinod Kumar Kanvaria
The French Revolution, which began in 1789, was a period of radical social and political upheaval in France. It marked the decline of absolute monarchies, the rise of secular and democratic republics, and the eventual rise of Napoleon Bonaparte. This revolutionary period is crucial in understanding the transition from feudalism to modernity in Europe.
For more information, visit-www.vavaclasses.com
Read| The latest issue of The Challenger is here! We are thrilled to announce that our school paper has qualified for the NATIONAL SCHOOLS PRESS CONFERENCE (NSPC) 2024. Thank you for your unwavering support and trust. Dive into the stories that made us stand out!
Safalta Digital marketing institute in Noida, provide complete applications that encompass a huge range of virtual advertising and marketing additives, which includes search engine optimization, virtual communication advertising, pay-per-click on marketing, content material advertising, internet analytics, and greater. These university courses are designed for students who possess a comprehensive understanding of virtual marketing strategies and attributes.Safalta Digital Marketing Institute in Noida is a first choice for young individuals or students who are looking to start their careers in the field of digital advertising. The institute gives specialized courses designed and certification.
for beginners, providing thorough training in areas such as SEO, digital communication marketing, and PPC training in Noida. After finishing the program, students receive the certifications recognised by top different universitie, setting a strong foundation for a successful career in digital marketing.
Model Attribute Check Company Auto PropertyCeline George
In Odoo, the multi-company feature allows you to manage multiple companies within a single Odoo database instance. Each company can have its own configurations while still sharing common resources such as products, customers, and suppliers.
Macroeconomics- Movie Location
This will be used as part of your Personal Professional Portfolio once graded.
Objective:
Prepare a presentation or a paper using research, basic comparative analysis, data organization and application of economic information. You will make an informed assessment of an economic climate outside of the United States to accomplish an entertainment industry objective.
June 3, 2024 Anti-Semitism Letter Sent to MIT President Kornbluth and MIT Cor...Levi Shapiro
Letter from the Congress of the United States regarding Anti-Semitism sent June 3rd to MIT President Sally Kornbluth, MIT Corp Chair, Mark Gorenberg
Dear Dr. Kornbluth and Mr. Gorenberg,
The US House of Representatives is deeply concerned by ongoing and pervasive acts of antisemitic
harassment and intimidation at the Massachusetts Institute of Technology (MIT). Failing to act decisively to ensure a safe learning environment for all students would be a grave dereliction of your responsibilities as President of MIT and Chair of the MIT Corporation.
This Congress will not stand idly by and allow an environment hostile to Jewish students to persist. The House believes that your institution is in violation of Title VI of the Civil Rights Act, and the inability or
unwillingness to rectify this violation through action requires accountability.
Postsecondary education is a unique opportunity for students to learn and have their ideas and beliefs challenged. However, universities receiving hundreds of millions of federal funds annually have denied
students that opportunity and have been hijacked to become venues for the promotion of terrorism, antisemitic harassment and intimidation, unlawful encampments, and in some cases, assaults and riots.
The House of Representatives will not countenance the use of federal funds to indoctrinate students into hateful, antisemitic, anti-American supporters of terrorism. Investigations into campus antisemitism by the Committee on Education and the Workforce and the Committee on Ways and Means have been expanded into a Congress-wide probe across all relevant jurisdictions to address this national crisis. The undersigned Committees will conduct oversight into the use of federal funds at MIT and its learning environment under authorities granted to each Committee.
• The Committee on Education and the Workforce has been investigating your institution since December 7, 2023. The Committee has broad jurisdiction over postsecondary education, including its compliance with Title VI of the Civil Rights Act, campus safety concerns over disruptions to the learning environment, and the awarding of federal student aid under the Higher Education Act.
• The Committee on Oversight and Accountability is investigating the sources of funding and other support flowing to groups espousing pro-Hamas propaganda and engaged in antisemitic harassment and intimidation of students. The Committee on Oversight and Accountability is the principal oversight committee of the US House of Representatives and has broad authority to investigate “any matter” at “any time” under House Rule X.
• The Committee on Ways and Means has been investigating several universities since November 15, 2023, when the Committee held a hearing entitled From Ivory Towers to Dark Corners: Investigating the Nexus Between Antisemitism, Tax-Exempt Universities, and Terror Financing. The Committee followed the hearing with letters to those institutions on January 10, 202
Embracing GenAI - A Strategic ImperativePeter Windle
Artificial Intelligence (AI) technologies such as Generative AI, Image Generators and Large Language Models have had a dramatic impact on teaching, learning and assessment over the past 18 months. The most immediate threat AI posed was to Academic Integrity with Higher Education Institutes (HEIs) focusing their efforts on combating the use of GenAI in assessment. Guidelines were developed for staff and students, policies put in place too. Innovative educators have forged paths in the use of Generative AI for teaching, learning and assessments leading to pockets of transformation springing up across HEIs, often with little or no top-down guidance, support or direction.
This Gasta posits a strategic approach to integrating AI into HEIs to prepare staff, students and the curriculum for an evolving world and workplace. We will highlight the advantages of working with these technologies beyond the realm of teaching, learning and assessment by considering prompt engineering skills, industry impact, curriculum changes, and the need for staff upskilling. In contrast, not engaging strategically with Generative AI poses risks, including falling behind peers, missed opportunities and failing to ensure our graduates remain employable. The rapid evolution of AI technologies necessitates a proactive and strategic approach if we are to remain relevant.
1. OMG – I Need a Study for my PhD
Walid Maalej – Feb 2014 – Kiel – @maalejw
2. Summary
1. Convince you of the potentials of empirical research in software engineering
2. Introduce important terminology as an index to find more knowledge
3. Share my experience (best practices and pitfalls) and discuss it with you
3. Outline of my Talk
1. Motivation
2. Research Questions
3. !! Research Methods !!
4. [Data Collection and Analysis]
4. What is Empirical Research?
Systematic observation + data. The "new standard" in the SE community!
5. Other Research Approaches
- Engineering-driven: build it ("SuperDuper", "Super1", "Super2", "MustBe1", "MustBe2", "MustBe3", ...)
- Analytical: mathematical, formal (∧α, α ≥ √∞)
- Anecdotic
6. Goal of Empirical Studies
- Explore: understand phenomena and identify problems
- Evaluate: check and improve solutions, measure impact
7. Outline of my Talk
1. Motivation
2. Research Questions
3. Research Methods
4. [Data Collection and Analysis]
8. Define *What* You Want to Study!
- Which strategies (including steps and activities) do developers use in order to comprehend software?
- Which sources of information do developers use during program comprehension?
- Which tools do developers use when understanding programs, and in which way?
9. Iterate! Rephrase When You Are Done!
Start → Questions → Methods → Data Collection → Data Analysis → End
10. Be Concrete and Precise!
- Bad: [NO research questions!]
- Bad: How can we make software development more efficient?
  Good: What is the impact of information dispersion on development productivity?
- Bad: How do developers perceive tool integration?
  Good: How do developers assess tool integration as-is?
11. Try Not to Solve the "World Hunger Problem" in your PhD!
12. Common Types of Questions...
Type                 Example
What/Which           Which tools do developers use during a bug fixing task?
How                  How do developers proceed to fix a bug?
Why                  Why are agile methods popular in industry?
When                 When are code clones useful?
How much/How often   How frequently do developers need to know the steps to reproduce a bug?
13. Outline of my Talk
1. Motivation
2. Research Questions
3. Research Methods
4. [Data Collection and Analysis]
15. Example: Tool Integration Revisited (Maalej 2009)
Phase 1 – Exploratory, Qualitative:
1. Semi-structured face-to-face interviews with engineers
2. Content analysis of project artifacts
Phase 2 – Explanatory, Quantitative:
3. Online questionnaire with professionals
4. Field experiments with engineers
16. How it was Presented in the Paper
Research Questions: 1. As-is assessment, 2. Problems, 3. Practices, 4. Requirements, 5. Appropriateness
Phase 1 (Exploratory, Qualitative): Repeated Interviews, Content Analysis
Phase 2 (Explanatory, Quantitative): Field Experiment, Questionnaire
30. Perfect Your Questions!
1. Remove unclear questions!
2. Put the least important last!
3. Match questions with answers
4. Think about the outliers
31. Exclude Non-Serious Subjects!
- Filter incomplete answers?
- Use "check" questions
- Remove "noise answers"
- Random order of the questions and answers
- ...
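These screening rules can be sketched as a small filter over raw responses. The dict-based response format, the field names, and the thresholds below are illustrative assumptions, not from the talk.

```python
# Sketch: screening survey responses, assuming each response is a dict of
# answer fields plus one "check" question whose correct answer we know.
# All field names and thresholds here are hypothetical.

def screen_responses(responses, check_field, check_expected, min_answered):
    """Keep responses that pass the attention check and are complete enough."""
    kept = []
    for r in responses:
        answered = sum(1 for k, v in r.items()
                       if k != check_field and v not in (None, ""))
        if r.get(check_field) == check_expected and answered >= min_answered:
            kept.append(r)
    return kept
```

Randomizing question and answer order, as the slide suggests, would happen in the survey tool itself; this filter only handles the post-hoc cleanup.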
32. Motivate and Give Incentives
1. Share results (information and tools)
2. Raffle gifts
3. Offer dedicated analysis
4. Show the importance of your research
33. Use Likert or Semantic Scales for Flexibility!
[Example table "Problems encountered due to missing knowledge": per-task frequency counts on a Never/Seldom/Often/Usually scale, with the Often–Usually share and the mode per row – e.g. fixing a bug: 70.1% Often–Usually; understanding others' code (e.g. for review or documentation): 59.6%; reusing a component: 69.8%; implementing a feature: 59.0%. Example item: "When I am trying to understand other's code I need to know... what was the coder's intention as he wrote this."]
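A sketch of how answers to one such Likert item might be summarized into the figures the slide reports (mode and "Often–Usually" share). The four-level scale comes from the slide; the function and data are illustrative.

```python
from collections import Counter

# The deck's four-level frequency scale, ordered from lowest to highest.
SCALE = ["Never", "Seldom", "Often", "Usually"]

def summarize_item(answers):
    """Mode and 'Often-Usually' share for one Likert item."""
    counts = Counter(answers)
    n = len(answers)
    mode = max(SCALE, key=lambda lvl: counts[lvl])
    often_usually = counts["Often"] + counts["Usually"]
    return {"mode": mode,
            "often_usually": often_usually,
            "often_usually_pct": round(100.0 * often_usually / n, 1)}
```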
36. Focus on Quasi-Experimentation Instead of Representative Summaries!
[Survey demographics charts: (A) development experience (3-5 years, 6-10 years, >10 years), (B) size of employer (small 1-5 employees, medium 50-500, large >500), (C) types of projects (open source, closed source, both; private, public), (D) collaborators count incl. team (1-7, 8-15, 16-30, >30 people), alongside the "problems encountered due to missing knowledge" table shown on the previous slide.]
37. OBSERVATIONS are rather...
- Objective
- Quali-/quantitative
- Exploratory
- With users and >1 researchers!
38. Observe Less but in a Realistic Environment
How many subjects do we need?
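The question "how many subjects do we need?" is often answered with an a-priori power calculation. Below is a minimal sketch for comparing two group means using the standard normal-approximation formula; this is textbook statistics, not something the slides prescribe, and real studies may need the finite-sample t-based correction.

```python
import math
from statistics import NormalDist

def subjects_per_group(effect_size, alpha=0.05, power=0.8):
    """Subjects per group to detect a standardized effect (Cohen's d)
    in a two-sided comparison of two group means (normal approximation)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value
    z_beta = NormalDist().inv_cdf(power)           # power requirement
    n = 2 * ((z_alpha + z_beta) / effect_size) ** 2
    return math.ceil(n)
```

For a large effect (d = 0.8) this lands near 25 subjects per group; for a medium effect (d = 0.5) it is already above 60, which illustrates why the slide argues for observing fewer subjects in a realistic setting rather than chasing large samples.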
39. Use an Observation Template!
Excerpt from "On the Comprehension of Program Comprehension", Table II: observation protocol of participant P5 (observational study):

Daytime | Rel. time | Observation / quote                                                | Postponed questions
...     | ...       | ...                                                                | ...
10:19   | 00:27     | Read Jira ticket. Comment: "this sounds like the ticket            | What information
        |           | from yesterday"                                                    | considered?
10:20   | 00:28     | Refresh source code repository                                     |
10:24   | 00:32     | Publish code to local Tomcat                                       |
10:26   | 00:34     | Debug code in local Tomcat                                         | Why debugging?
10:28   | 00:36     | Open web application in browser and enter text into form fields    |
10:29   | 00:37     | Change configuration in XML file content.xml.                      | How known what to
        |           | Exclamation: "not this complicated xml file again"                 | change?
10:30   | 00:38     | Publish changes to local Tomcat                                    |
10:31   | 00:39     | Debug local Tomcat                                                 |
...     | ...       | ...                                                                | ...

"A single observation session lasted for 45 minutes, leaving another 45 minutes for the interview. We did not want to spend more than 90 minutes because concentration of both observed developer and observer decreases over time. In each session, one participant was observed and interviewed by one observer."

Prepare codes for observations!

"2.2.2. Online Survey. The survey focused on knowledge consumed and produced in software comprehension. Starting from the findings of several recent studies [Ko et al. 2007; Sillito et al. 2008; Fritz and Murphy 2010], we assumed that knowledge needs ..."
43. Talk About Your Observation (Peer Debriefing)!
- This helps to identify the relevant observations and to group observations
- Avoid talking to subjects during observation
47. What is Reliability?
- Reliability: measure correctly and reduce systematic errors. If redone, the results will be the same.
- Validity: measure the right thing and reduce the risk from assumptions. Results can be generalized to the population.
Why is Content Analysis Reliable?
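Reliability of content analysis is commonly checked by having two coders rate the same units and computing chance-corrected agreement. Below is a sketch of Cohen's kappa, one standard such measure; the slide poses the question but does not prescribe this particular statistic.

```python
from collections import Counter

def cohens_kappa(coder_a, coder_b):
    """Chance-corrected agreement between two coders rating the same items."""
    n = len(coder_a)
    observed = sum(a == b for a, b in zip(coder_a, coder_b)) / n
    freq_a, freq_b = Counter(coder_a), Counter(coder_b)
    # Expected agreement if both coders assigned labels independently
    # according to their own label frequencies.
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / n ** 2
    if expected == 1.0:  # both coders used a single identical label
        return 1.0
    return (observed - expected) / (1 - expected)
```

Reporting kappa per knowledge type (rather than raw percent agreement) makes the reliability claim robust against coders who agree mostly by chance.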
48. Develop a Coding Guide
- Describe the coding task
- Give clear definitions and how to interpret the data
- Give examples

Excerpt from the "API Knowledge Coding Guide, Version 7.2":

"You will be presented with documentation blocks extracted from API reference documentation (Javadocs and the like). For each block, you will be also presented with the name of its corresponding package/namespace, class, method, or field. Your task is to read each block carefully and evaluate whether the block contains knowledge of the different types described below. You will need to evaluate whether each block contains knowledge of each different type. Rate the knowledge type as true only if there is clear evidence that knowledge of that type is present in the block. If you hesitate about whether or not to rate a knowledge type as true, leave it as false.

Do not evaluate automatically generated information such as the declaration of an element (e.g. extends MyInterface), or generated links in 'specified by'. Only evaluate human created documentation in the block (see last section in page 5 for more details).

Read the following description very carefully. It explains how to rate each knowledge type for a given block.

Knowledge Types

Functionality and Behavior

Describes what the API does (or does not do) in terms of functionality or features. The block describes what happens when the API is used (a field value is set, or a method is called). This also includes specified behavior such as what an element does, given special input values (for example, null) or what may cause an exception to be raised.

Functionality and behavior knowledge can also be found in the description of parameters (e.g., what the element does in response to a specific input), return values (e.g., what the API element returns), and thrown exceptions.

- Detects stream close and notifies the watcher
- Obtains the SSL session of the underlying connection, if any. If this connection is open, and the underlying socket is an SSLSocket, the SSL session of that socket is obtained. This is a potentially blocking operation.

Only rate this type as true if the block contains information that actually adds to what is obvious given the complete signature of the API element associated with the block. If a description of functionality only repeats the name of the method or field, it does not contain this type of knowledge and you should rate it as false, and instead rate the knowledge type non-information as true. For example, this would be the case if the documentation for a method called getTitle was

- Returns the title.

Similarly for constructors, if the documentation simply states 'Constructs a new X', 'Instantiates a new object', or something similar, the value is false (with non-information coded as true). In some cases non-information will be phrased to look like a description of functionality, for example with sentences that start with verbs like 'gets', 'adds', 'determines', 'initializes'. Carefully read the name and signature of the API element and only assign a value of true for this knowledge type if the block adds something to the description of the element.

However, if any other details are provided, rate this type as true. For example:

- Creates a new MalformedChallengeException with a null detail message.

should get a value of true because of the additional information about the value of the message field.

Mentioning that a value can be obtained from a field, property, or getter method does not constitute a description of functionality, unless the API performs some additional functions when the value is accessed. For example, the block below does not represent a description of functionality. The non-information type for this block should be rated as true.

- [LoggerDescription.Verbosity Property] Gets the verbosity level for the logger.

Note IMPORTANT: Description of functionality is not limited to the functionality of the element associated with the block, but the API as a whole. However, if the block explains a sequence of method calls or creation of particular objects (e.g. events), code this as Control-flow. For example, if setting the value of a field results in some perceived behavior by the framework, this knowledge counts as functionality. If the block describes a resulting sequence of method calls or events fired, this is control flow. If the block contains both, then both should be coded as true."

[Maalej & Robillard 2013] [Pagano & Maalej 2013]
51. IEEE TRANSACTIONS ON SOFTWARE ENGINEERING, VOL. 39, NO. 9, SEPTEMBER 2013, p. 1264

Patterns of Knowledge in API Reference Documentation
Walid Maalej and Martin P. Robillard

Abstract—Reading reference documentation is an important part of programming with application programming interfaces (APIs). Reference documentation complements the API by providing information not obvious from the API syntax. To improve the quality of reference documentation and the efficiency with which the relevant information it contains can be accessed, we must first understand its content. We report on a study of the nature and organization of knowledge contained in the reference documentation of the hundreds of APIs provided as a part of two major technology platforms: Java SDK 6 and .NET 4.0. Our study involved the development of a taxonomy of knowledge types based on grounded methods and independent empirical validation. Seventeen trained coders used the taxonomy to rate a total of 5,574 randomly sampled documentation units to assess the knowledge they contain. Our results provide a comprehensive perspective on the patterns of knowledge in API documentation: observations about the types of knowledge it contains and how this knowledge is distributed throughout the documentation. The taxonomy and patterns of knowledge we present in this paper can be used to help practitioners evaluate the content of their API documentation, better organize their documentation, and limit the amount of low-value content. They also provide a vocabulary that can help structure and facilitate discussions about the content of APIs.

Index Terms—API documentation, software documentation, empirical study, content analysis, grounded method, data mining, pattern mining, Java, .NET

1 INTRODUCTION

Application programming interfaces (APIs) enable the reuse of libraries and frameworks in software development. In essence, an API is a contract between the component providing a functionality and the component using that functionality (the client). The syntactic information is, in all but the most trivial cases, insufficient to allow a developer to correctly use the API in a programming task. First, interfaces abstract complex behavior, knowledge of which may be necessary to understand a feature. Second, even if the behavior of a component could be completely specified by its interface, developers often need ancillary knowledge about that element: how it relates to domain terms, how to combine it with other elements, and so on [30]. This knowledge is generally provided by documentation, in particular by the API's reference documentation.

We define API reference documentation as a set of documents indexed by API element name, where each document specifically provides information about an element (class, method, etc.). For example, the API documentation of the Java Development Toolkit (JDK) is a set of web pages, one for each package or type in the API. Although many forms of API documentation exist, there is usually a clear distinction between reference documentation and other forms of documentation with a more pedagogical intent (e.g., tutorials, books, and FAQs).

Reference documentation is a necessary and significant part of a framework. For example, the reference documentation of the JDK 6 (SE and EE) totals over three million words, or six times the length of Tolstoy's epic novel War and Peace. Reference documentation also plays a crucial role in how developers learn and use an API, and developers can have high expectations about the information they should find therein [14], [30]. Empirical studies have described how developers have numerous and varied questions about the use of APIs (see Section 8). Efficient representation and access of knowledge in API reference documentation is therefore a likely factor for improving software development productivity.

Most technology platforms exposing APIs provide a documentation system with a uniform structure and look-and-feel for presenting and organizing the API documentation. For example, Java APIs are documented through Javadocs, documentation for Python modules can be generated with the pydoc utility, and Microsoft technologies, whose documentation is available through the MSDN website, follow the same look-and-feel. Unfortunately, no standard and few conventions exist regarding the content of reference documentation. For example, an early article explains the rationale behind Javadocs and gives a set of conventions for what should and should not be part of Javadocs [20]. In practice, however, these conventions are ...
56. In Software Engineering...
- Automation tools transform A -> B
- We often have A and B!
- We can use the tool with A and check whether the output is B
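The check "run the tool on A and compare its output with the known B" boils down to comparing the tool's predicted pairs against a gold set, typically reported as precision and recall. A minimal sketch, with the pair format chosen for illustration:

```python
def precision_recall(predicted, gold):
    """Compare a tool's output against known ground truth.
    Both arguments are collections of hashable items (e.g. (bug, file) links)."""
    predicted, gold = set(predicted), set(gold)
    tp = len(predicted & gold)  # correct predictions
    precision = tp / len(predicted) if predicted else 0.0
    recall = tp / len(gold) if gold else 0.0
    return precision, recall
```

This is exactly the "we have A and B" shortcut the slide describes: no new data collection is needed, only the existing A-to-B mapping as a benchmark.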
57. Sometimes: We Have the Data... What Can We Do With It!
Examples:
- Revision history >> bug prediction
- Bug data >> link to source code
- Interaction data >> ??
Mixing the exploration and evaluation task
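The "revision history >> bug prediction" example can be sketched as counting, per file, how often it was touched by a bug-fixing commit. The keyword heuristic for spotting fix commits is a common convention in mining studies, not something the slide defines:

```python
import re

# Heuristic: a commit is a bug fix if its message mentions fix/bug/defect.
FIX_PATTERN = re.compile(r"\b(fix(e[sd])?|bug|defect)\b", re.IGNORECASE)

def fix_counts(commits):
    """Count per file how often it appeared in a bug-fixing commit.
    `commits` is an iterable of (message, changed_files) pairs."""
    counts = {}
    for message, files in commits:
        if FIX_PATTERN.search(message):
            for f in files:
                counts[f] = counts.get(f, 0) + 1
    return counts
```

Files with high fix counts are candidate defect hotspots; this is the exploration step, which a later evaluation (e.g. predicting future fixes) would have to keep separate, as the slide's warning about mixing the two tasks implies.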
58. EXPERIMENT / USER STUDY are rather...
- Quantitative
- Evaluative
- Involve users
- In lab setting?
59. Build a Control and an Experimental Group
With and without tool (Aspirin). Problems?
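One distribution-free way to compare the experimental group (with tool) against the control group (without) is a permutation test on the mean difference: if the tool had no effect, relabeling subjects at random should produce differences as extreme as the observed one quite often. The measurements below are made up for illustration; this is one standard analysis, not the one the talk mandates.

```python
import random
from statistics import mean

def permutation_p_value(experimental, control, rounds=10_000, seed=1):
    """Two-sided permutation test on the difference of group means."""
    rng = random.Random(seed)
    observed = mean(experimental) - mean(control)
    pooled = list(experimental) + list(control)
    n = len(experimental)
    extreme = 0
    for _ in range(rounds):
        rng.shuffle(pooled)  # random reassignment to the two groups
        diff = mean(pooled[:n]) - mean(pooled[n:])
        if abs(diff) >= abs(observed):
            extreme += 1
    return extreme / rounds
```

This sidesteps normality assumptions, which matters with the small groups typical of lab studies; it does not, of course, fix the deeper problems the slide hints at (confounds, unfair baselines, learning effects).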
64. Outline of my Talk
1. Motivation
2. Research Questions
3. Research Methods
4. [Data Collection and Analysis]
65. Data Collection is Expensive
Probably the most painful part, expensive to redo!
- Try to use and reuse existing data before collecting new data!
- Reuse data from related work – ask the authors!
- Plenty of data in open source repositories!
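Reusing existing data often starts with what version control already records. A sketch that parses pre-formatted `git log` output into records; the `--pretty=format:"%H|%an|%s"` layout is one convenient choice assumed here, not prescribed by the talk.

```python
# Sketch: turn the output of
#   git log --pretty=format:"%H|%an|%s"
# into a list of commit records, instead of collecting new data by hand.

def parse_git_log(text):
    commits = []
    for line in text.strip().splitlines():
        # Split on the first two '|' only, so subjects may contain '|'.
        sha, author, subject = line.split("|", 2)
        commits.append({"sha": sha, "author": author, "subject": subject})
    return commits
```

From such records one can already answer exploratory questions (who commits, how often, about what) before deciding whether any new, expensive data collection is needed at all.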
67. Avoid Reporting too Many Numbers
Example removed for privacy reason!
- Use appendix
- Share data online
68. Use Simple Visualization!
Channels Used to Access Knowledge (diverging bar chart; mean on a scale from -2 = Never to 2 = Usually; count and share of "Often–Usually" answers):

Other People: 1170 (81.6%), mean 1.0
Project Artifacts:
- Issue and bug reports: 895 (63.1%), mean 0.4
- API description: 1076 (75.8%), mean 0.9
- Comments in source code: 990 (69.3%), mean 0.6
- Commit messages: 525 (38.1%), mean -0.4
Personal Artifacts:
- Personal notes (logbooks, diaries, post-its): 339 (24.1%), mean -0.9
- Personal e-mails: 906 (42.8%), mean -0.2
- Work item/task descriptions (to-dos): 710 (50.3%), mean 0.0
Knowledge Management Systems:
- Intranet: 437 (31.8%), mean -0.7
- Project or organization wiki: 532 (38.5%), mean -0.5
- Experience databases / groupware systems: 181 (13.4%), mean -1.0
Internet:
- Forums and mailing lists: 742 (52.7%), mean 0.1
- Web search engines (e.g. Google, Yahoo): 1170 (81.4%), mean 1.0
- Public documentation / web pages I know: 1081 (76.0%), mean 0.9
69. Discuss Findings and Limitations!
Excerpt (Empirical Software Engineering; Fig. 9: dependencies between blogs and commits in terms of time):

"... Based on this result, we calculated the average time period for each grade. Figure 9 shows the results. The strength of dependency between a commit message and a blog post decreases with an increasing time period between the commit and the post. To summarize, developers also use blogs to summarize their work. They are more likely to publish information about recent activities they have performed than about old activities.

6 Discussion

In this section we highlight three main findings. First, we discuss the importance of blogs as a project medium and blogging as a project function. Second, we discuss the purpose of blogging in open source software projects based on our results, differentiating between blogging committers and other stakeholders. Finally, we derive insights for future research, in particular how to integrate blogs into development environments and blogging into developers' workflows as well as how to dissolve boundaries between developers and other stakeholders.

6.1 Blogging is a Project Function

In all studied open source communities we observed regular and frequent blogging activities since several years and across many releases. This is not surprising, as blogs became one of the most popular media for sharing and accessing software engineering knowledge in the last years (Parnin and Treude 2011). While individual developers only blog occasionally, the community as a whole constantly shares information and produces an average of up to six blog posts per day. These posts are written equally by committers as well as other community members.

Unlike committers in large open source projects, which have been studied quite thoroughly (e.g. Mockus et al. 2002), other community members are less researched. This non-committing group includes not only actual users of the software, but also other stakeholders such as evangelists, community coordinators, companies' proxies, and managers. Evangelists might have created the project long time ago. They have large experience and special interests in the success of an open source project, and therefore advertise it and demonstrate its usefulness. Managers and coordinators might be hired by the community to plan releases or organize conferences. Crowston ...

... topics and their popularities. Developers as well as other stakeholders discuss about requirements, implementation, and community aspects. On the one hand, developers report about their recent development activities to communicate their project work to a broad audience, including users and other stakeholders. On the other hand, users and other stakeholders seem to have their blogging peak time shortly after new versions are released, reporting on their experiences with the new changes. Utilizing these experiences and the volunteered resources provides a huge benefit for software projects. We claim that communities should be created systematically and integrated in software systems utilizing social media such as blogs. In (Maalej and Pagano 2011) we envision a software framework that enables the development and maintenance of such social software.

7 Results Validity

7.1 External Validity

Although our study was neither designed to be generalizable nor representative for all developers and communities, we think that most of the results have a high degree of generalizability, in particular for large open source communities. At the design time of the study, we knew neither the entire population of software development blogs, nor of blogging developers. Therefore we were unable to study a random representative sample of blogs and bloggers. Instead, we aimed at a rather exploratory, hypothesis-generating study to better understand blogs, their usage, and role in the development project. The four studied communities should rather be interpreted as four cases than as one homogeneous dataset. However, the careful selection of these communities, their leading role in open source software, and the large number of their blogs and bloggers give confidence that many of the results apply for other comparable communities as well.

We think that our results are representative for each of the studied communities due to the following reasons.

- Our datasets include all community blogs from the last seven years.
- We conducted statistical tests to check the statistical significance of our results and exclude hazard factors.
- We got similar results using different analysis methods (e.g. descriptive statistics and topic analysis).
- In two of the studied communities (Eclipse and GNOME), we were able to contact three senior active members. Among them were both committers, who had contributed for around three to four years (92 to over 600 commits each), and evangelists, who had been involved at least three years in the community. While discussing the results in detail they confirmed the findings based on their experiences.

Nevertheless, there are three limitations which should be considered when interpreting the results. First, for Eclipse and GNOME we were unable to analyze the blogs of ..."