This document summarizes research on GitHub from academic publications indexed in Scopus:
- Over 1,000 scholarly publications per year mention GitHub in the title, keywords, or abstract; computer science, bioinformatics, and mathematics are the most common subject areas.
- The US, UK, China, Germany, Canada and France publish the most research on GitHub.
- Among publications with "GitHub" in the title, the earliest and most cited is from 2012 (Dabbish et al.) and examines transparency and collaboration on GitHub.
Abstract—Context: Interactions between individuals and their
participation in community activities are governed by how individuals
identify themselves with their peers. Although software
developers collaborate using online peer production sites, this
phenomenon has not been studied across online peer production
sites in software engineering. Knowledge of this may help tool
builders and researchers gain better insights about developers’
expectations for online peer production sites.
Objective: We want to investigate such behavior for developers
while they are learning and contributing on socially collaborative
environments, specifically code hosting sites and question/answer
sites. In this study, we investigate the following questions about
advocates, developers who can be identified as active learners
and well-rounded community contributors. Do advocates flock
together in a community? How do flocks of advocates migrate
within a community? Do these flocks of advocates migrate beyond
a single community?
Method: To understand such behavior, we identified 12,578
common advocates across a code hosting site - GitHub and a
question/answering site - Stack Overflow. These advocates were
involved in 1,549 projects on GitHub and were actively asking
114,569 questions and responding with 408,858 answers and
1,001,125 comments on Stack Overflow. We performed an in-depth
empirical analysis using social networks to find the flocks
of advocates and their migratory pattern on GitHub, Stack
Overflow, and across both communities.
Results: We found that 7.5% of the advocates create flocks
on GitHub and 8.7% on Stack Overflow. Further, these flocks
of advocates migrate on an average of 5 times on GitHub and
2 times on Stack Overflow. In particular, advocates in flocks of
size two migrate more frequently than larger flocks. However, this
migration behavior was only common within a single community.
Conclusions: Our findings indicate that advocates’ flocking and
migration behavior differs substantially from the ones found in
other social environments. This suggests a need to investigate the
factors that demotivate the flocking and migration behavior of
advocates and ways to enhance and integrate support for such
behavior in collaborative software tools.
GitHub; Stack Overflow; Social Network Analysis; Flocking;
Migration; Developers; Advocates.
https://sites.google.com/site/ijcsis/vol-18-no-6-jun-2020
Investigating Software Engineering Artifacts in DevOps Through the Lens of Bo... (Christoph Matthies)
Slides for the talk on "Investigating Software Engineering Artifacts in DevOps Through the Lens of Boundary Objects" at the International Conference on Evaluation and Assessment in Software Engineering (EASE) 2023.
https://conf.researchr.org/details/ease-2023/ease-2023-research/2/Investigating-Software-Engineering-Artifacts-in-DevOps-Through-the-Lens-of-Boundary-O
Christoph Matthies, Robert Heinrich, and Rebekka Wohlrab. 2023. "Investigating Software Engineering Artifacts in DevOps Through the Lens of Boundary Objects". In Proceedings of the 27th International Conference on Evaluation and Assessment in Software Engineering (EASE '23). Association for Computing Machinery, New York, NY, USA, 12–21. https://doi.org/10.1145/3593434.3593441
Collaboration between Software Developers and the Impact of Proximity (Dawn Foster)
Poster Presented at XXXVII Sunbelt Conference
of The International Network For Social Network Analysis (INSNA)
May 30th, 2017 – June 4th, 2017 Beijing, China
CodeCV - Mining Expertise of GitHub Users from Coding Activities - Online.pdf (Matthias Trapp)
Presentation of research paper "CodeCV: Mining Expertise of GitHub Users from Coding Activities" at the 22nd IEEE International Working Conference on Source Code Analysis and Manipulation in Limassol, Cyprus.
Presentation at the Big Software in the Run summer school: Diversity and inclusion in open source software communities. Topics include impact of gender and tenure diversity on productivity, diversity measurement, code of conduct, inclusion.
Operationalisation of Collaboration Sunbelt 2015 (Dawn Foster)
The operationalisation of collaboration: in search of a definition and its consequences on
analysis
Collaboration has been defined in numerous ways. Researchers interested in collaboration at the
individual or organizational level need to pay special attention to the adoption of a specific definition, as
this is likely to have major implications for the research design and outcomes. With respect to
collaboration within open source software projects, this presentation has two objectives. Firstly, this
presentation will investigate a wide variety of definitions of collaboration from the existing literature.
Secondly, the presentation will look at theoretically informed selection of a definition. Throughout the
presentation, specific emphasis will be put on the implications of adoption of several definitions of
collaboration for the application of Social Network Analysis to the study of open source software,
particularly considering data collection and analysis. Open source software is developed in the open
where anyone can view the source code and anyone with the knowledge to do so can contribute to the
project. Because people from around the world work on these projects together using online tools, it is
a relevant setting for studying collaboration. An interesting aspect of open source collaboration is that
private resources from individuals and organizations are used to develop software that is released as a
public good. Social Network Analysis can be used to understand the network relationships between the
individuals who develop this software. Given the interest in collaboration from researchers from different
backgrounds and disciplines, similar research is likely to produce considerations to stimulate further
thoughts about definitions of collaboration in several domains and research settings.
Network Relationships and Job Changes of Software Developers at Sunbelt 2016 (Dawn Foster)
This presentation looks at job changes of software developers within an open source software community using
relational predictors of job change activity to model the actions of the actors involved. Interactions with other actors
on mailing lists and in software contributions will be used as predictors.
Open source software is developed in the open where anyone can view the source code and anyone with the knowledge
to do so can contribute to the project. Because people from around the world work on these projects together using
online tools with publicly accessible interactions between people, it is a relevant setting for studying job changes
using Social Network Analysis to understand and model the network relationships between individuals both before
and after a job change.
Slides for:
"Software Citation in Theory and Practice," by Daniel S. Katz and Neil P. Chue Hong (published paper - https://doi.org/10.1007/978-3-319-96418-8_34; preprint - https://arxiv.org/abs/1807.08149), presented at International Congress on Mathematical Software (ICMS 2018)
Abstract. In most fields, computational models and data analysis have become a significant part of how research is performed, in addition to the more traditional theory and experiment. Mathematics is no exception to this trend. While the system of publication and credit for theory and experiment (journals and books, often monographs) has developed and has become an expected part of the culture, how research is shared and how candidates for hiring, promotion are evaluated, software (and data) do not have the same history. A group working as part of the FORCE11 community developed a set of principles for software citation that fit software into the journal citation system, allow software to be published and then cited, and there are now over 50,000 DOIs that have been issued for software. However, some challenges remain, including: promoting the idea of software citation to developers and users; collaborating with publishers to ensure that systems collect and retain required metadata; ensuring that the rest of the scholarly infrastructure, particularly indexing sites, include software; working with communities so that software efforts count; and understanding how best to cite software that has not been published.
In this presentation we explore how the CI/CD landscape on GitHub has evolved since the introduction of GitHub Actions. This presentation is based on several peer-reviewed articles published in 2022 and 2023.
Understanding everyday users’ perception of socio-technical issues through s... (Ahreum lee)
I gave a talk at ImagineXLab, Seoul, Korea.
In this presentation, I would like to share my recent work exploring sociotechnical issues through social media data.
1) /r/Assholedesign: Online conversation about ethical concerns (ACM DIS 2020 Honorable Mention Award)
2) /r/Digitalnomad: Current tensions in community-based spaces (ACM CHI 2019 LBW, CSCW 2019)
3) /r/Purdue: Everyday users’ perception of delivery robots on campus (ACM CSCW 2020 LBW)
Towards editorial transparency in computational journalism (Jennifer Stark)
This goes together with a research paper also uploaded here describing practical steps to transparency in computational journalism with two case studies.
3-D geospatial data for disaster management and development (Keiko Ono)
Japan is a high income country at an advanced stage of epidemiological transition. One of its remaining public health challenges is response to natural disasters. This presentation explores the potential of 3-D geospatial data in disaster response and management.
A narrative review of NLP applications to political science
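With the rapid development of artificial intelligence and machine learning, the range of "data" that can be used in such analyses is expanding. Natural language processing, in which artificial intelligence reads words spoken or written by people in order to translate and summarize them, and to carry out more advanced analysis such as finding features and patterns, is already in practical use in many fields. This presentation reviews previous research that applies natural language processing in political science and considers future possibilities. Keywords: artificial intelligence, natural language processing (NLP), text mining, political science, data science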
US presidential selection: the Electoral College challenged (again) (Keiko Ono)
Constitutional design for selecting the chief executive
Historical evolution since 1789
The Electoral College
How it works today
Implications and criticism
Alternatives
Reapportionment and post-2020 projections
How to process data from survey questions that allow multiple options to be selected, in a questionnaire run with Google Forms: transforming multiple response categories into a series of dummy variables using Google Forms + Excel.
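The transformation described above can be sketched in a few lines. This is a minimal illustration in Python rather than the Excel workflow the deck itself uses, and it assumes each multiple-choice answer arrives as a single cell with the selected options joined by a comma and a space. The function name `to_dummies` and the sample answers are hypothetical.

```python
def to_dummies(responses, sep=", "):
    """Turn multi-select answers into one 0/1 dummy variable per option.

    responses: list of strings, one per respondent, each holding the
    selected options joined by `sep` (assumed export format).
    Returns the sorted option names and one dict of 0/1 flags per respondent.
    """
    # Collect every distinct option that appears in any answer.
    options = sorted({opt for r in responses for opt in r.split(sep) if opt})
    rows = []
    for r in responses:
        chosen = set(r.split(sep)) if r else set()
        # 1 if the respondent ticked the option, 0 otherwise.
        rows.append({opt: int(opt in chosen) for opt in options})
    return options, rows


# Hypothetical survey answers; the empty string is a respondent who ticked nothing.
options, rows = to_dummies(["TV, Radio", "Radio", ""])
```

Option labels that themselves contain the separator would be split incorrectly, so this sketch only suits surveys whose option texts avoid it.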
Levelwise PageRank with Loop-Based Dead End Handling Strategy : SHORT REPORT ... (Subhajit Sahu)
Abstract — Levelwise PageRank is an alternative method of PageRank computation which decomposes the input graph into a directed acyclic block-graph of strongly connected components, and processes them in topological order, one level at a time. This enables calculation for ranks in a distributed fashion without per-iteration communication, unlike the standard method where all vertices are processed in each iteration. It however comes with a precondition of the absence of dead ends in the input graph. Here, the native non-distributed performance of Levelwise PageRank was compared against Monolithic PageRank on a CPU as well as a GPU. To ensure a fair comparison, Monolithic PageRank was also performed on a graph where vertices were split by components. Results indicate that Levelwise PageRank is about as fast as Monolithic PageRank on the CPU, but quite a bit slower on the GPU. Slowdown on the GPU is likely caused by a large submission of small workloads, and expected to be non-issue when the computation is performed on massive graphs.
Empowering the Data Analytics Ecosystem: A Laser Focus on Value
The data analytics ecosystem thrives when every component functions at its peak, unlocking the true potential of data. Here's a laser focus on key areas for an empowered ecosystem:
1. Democratize Access, Not Data:
Granular Access Controls: Provide users with self-service tools tailored to their specific needs, preventing data overload and misuse.
Data Catalogs: Implement robust data catalogs for easy discovery and understanding of available data sources.
2. Foster Collaboration with Clear Roles:
Data Mesh Architecture: Break down data silos by creating a distributed data ownership model with clear ownership and responsibilities.
Collaborative Workspaces: Utilize interactive platforms where data scientists, analysts, and domain experts can work seamlessly together.
3. Leverage Advanced Analytics Strategically:
AI-powered Automation: Automate repetitive tasks like data cleaning and feature engineering, freeing up data talent for higher-level analysis.
Right-Tool Selection: Strategically choose the most effective advanced analytics techniques (e.g., AI, ML) based on specific business problems.
4. Prioritize Data Quality with Automation:
Automated Data Validation: Implement automated data quality checks to identify and rectify errors at the source, minimizing downstream issues.
Data Lineage Tracking: Track the flow of data throughout the ecosystem, ensuring transparency and facilitating root cause analysis for errors.
5. Cultivate a Data-Driven Mindset:
Metrics-Driven Performance Management: Align KPIs and performance metrics with data-driven insights to ensure actionable decision making.
Data Storytelling Workshops: Equip stakeholders with the skills to translate complex data findings into compelling narratives that drive action.
Benefits of a Precise Ecosystem:
Sharpened Focus: Precise access and clear roles ensure everyone works with the most relevant data, maximizing efficiency.
Actionable Insights: Strategic analytics and automated quality checks lead to more reliable and actionable data insights.
Continuous Improvement: Data-driven performance management fosters a culture of learning and continuous improvement.
Sustainable Growth: Empowered by data, organizations can make informed decisions to drive sustainable growth and innovation.
By focusing on these precise actions, organizations can create an empowered data analytics ecosystem that delivers real value by driving data-driven decisions and maximizing the return on their data investment.
Techniques to optimize the pagerank algorithm usually fall in two categories. One is to try reducing the work per iteration, and the other is to try reducing the number of iterations. These goals are often at odds with one another. Skipping computation on vertices which have already converged has the potential to save iteration time. Skipping in-identical vertices, with the same in-links, helps reduce duplicate computations and thus could help reduce iteration time. Road networks often have chains which can be short-circuited before pagerank computation to improve performance. Final ranks of chain nodes can be easily calculated. This could reduce both the iteration time, and the number of iterations. If a graph has no dangling nodes, pagerank of each strongly connected component can be computed in topological order. This could help reduce the iteration time, no. of iterations, and also enable multi-iteration concurrency in pagerank computation. The combination of all of the above methods is the STICD algorithm. [sticd] For dynamic graphs, unchanged components whose ranks are unaffected can be skipped altogether.
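One of the ideas above, skipping vertices whose rank has already converged, can be illustrated with a small sketch. This is not the STICD algorithm and not the author's implementation. It is a plain power iteration with hypothetical names (`pagerank_skip_converged`, `skip_tol`, `patience`) that assumes a graph without dead ends in which every link target is also a key of `out_links`. A vertex is frozen only after several consecutive near-zero changes, because a single near-zero change can occur by coincidence long before convergence.

```python
def pagerank_skip_converged(out_links, damping=0.85, tol=1e-10,
                            skip_tol=1e-12, patience=3, max_iter=500):
    """Power-iteration PageRank that stops updating settled vertices.

    out_links maps each vertex to the list of vertices it links to.
    Assumes no dead ends (every vertex has at least one out-link).
    Returns the rank dict and the number of iterations performed.
    """
    vertices = list(out_links)
    n = len(vertices)
    # Build the reverse adjacency so each vertex can pull from its in-links.
    in_links = {v: [] for v in vertices}
    for u, targets in out_links.items():
        for v in targets:
            in_links[v].append(u)

    rank = {v: 1.0 / n for v in vertices}
    calm = {v: 0 for v in vertices}  # consecutive near-zero changes per vertex
    iterations = 0
    for iterations in range(1, max_iter + 1):
        new_rank = dict(rank)
        total_change = 0.0
        for v in vertices:
            if calm[v] >= patience:
                continue  # treated as converged: keep its rank, skip the work
            incoming = sum(rank[u] / len(out_links[u]) for u in in_links[v])
            new_rank[v] = (1.0 - damping) / n + damping * incoming
            change = abs(new_rank[v] - rank[v])
            total_change += change
            calm[v] = calm[v] + 1 if change < skip_tol else 0
        rank = new_rank
        if total_change < tol:
            break
    return rank, iterations


# Small hypothetical graph: a -> b, a -> c, b -> c, c -> a.
ranks, iters = pagerank_skip_converged({"a": ["b", "c"], "b": ["c"], "c": ["a"]})
```

Freezing a vertex trades a small amount of accuracy for less work per iteration, which is the tension between per-iteration cost and iteration count that the paragraph describes.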
Adjusting primitives for graph : SHORT REPORT / NOTES (Subhajit Sahu)
Graph algorithms, like PageRank Compressed Sparse Row (CSR) is an adjacency-list based graph representation that is
Multiply with different modes (map)
1. Performance of sequential execution based vs OpenMP based vector multiply.
2. Comparing various launch configs for CUDA based vector multiply.
Sum with different storage types (reduce)
1. Performance of vector element sum using float vs bfloat16 as the storage type.
Sum with different modes (reduce)
1. Performance of sequential execution based vs OpenMP based vector element sum.
2. Performance of memcpy vs in-place based CUDA based vector element sum.
3. Comparing various launch configs for CUDA based vector element sum (memcpy).
4. Comparing various launch configs for CUDA based vector element sum (in-place).
Sum with in-place strategies of CUDA mode (reduce)
1. Comparing various launch configs for CUDA based vector element sum (in-place).
2. In a nutshell…
• Academic research on GitHub began to
take off in 2012.
• 2017 and 2018 are each expected to reach
over 1,000 scholarly publications on GitHub
(in title, keywords, abstract).
• Computer science, bioinformatics, and
mathematics dominate.
• USA, UK, China, Germany, Canada and
France are the leaders.
3. “GitHub” in article title, keyword or abstract.
English language sources only
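The search described on this slide could be expressed as a Scopus advanced-search string. The sketch below only assembles such strings. The slides do not show the exact query, so the field codes used here (TITLE-ABS-KEY, TITLE, LANGUAGE) are an assumption about how the search was entered, and the helper name `scopus_query` is hypothetical.

```python
def scopus_query(term, fields="TITLE-ABS-KEY", language=None):
    """Assemble a Scopus advanced-search string for a quoted term.

    fields: Scopus field code to search in (assumed, not taken from the slides).
    language: optional language restriction, e.g. "english".
    """
    query = f'{fields}("{term}")'
    if language:
        query += f" AND LANGUAGE({language})"
    return query


# Title, abstract, or keyword search limited to English-language sources.
broad_query = scopus_query("GitHub", language="english")
# Title-only search, as used for the citation counts later in the slides.
title_query = scopus_query("GitHub", fields="TITLE")
```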
As of May 21, there were 501 in 2018.
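The partial-year count can be turned into a rough full-year estimate to check it against the 1,000-per-year figure. The sketch below assumes publications are indexed at a constant rate through the year, which is a simplification: indexing typically lags, so the real total may differ. The function name `project_annual_count` is hypothetical.

```python
from datetime import date


def project_annual_count(count_so_far, as_of):
    """Linearly extrapolate a year-to-date count to a full calendar year."""
    year_start = date(as_of.year, 1, 1)
    year_end = date(as_of.year, 12, 31)
    days_elapsed = (as_of - year_start).days + 1    # May 21, 2018 is day 141
    days_in_year = (year_end - year_start).days + 1  # 365 for 2018
    return count_so_far * days_in_year / days_elapsed


# 501 publications indexed by May 21, 2018 (figure from the slide).
projected_2018 = project_annual_count(501, date(2018, 5, 21))
print(round(projected_2018))  # about 1,297 under the constant-rate assumption
```

Under this assumption the 2018 total lands well above 1,000, consistent with the summary on the earlier slide.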
10. “GitHub” in title produces 193 results (as of
May 2018)
• Among these, the very first one is…(has
been cited 277 times)
Dabbish, L., Stuart, C., Tsay, J., & Herbsleb,
J. (2012). Social coding in GitHub:
Transparency and collaboration in an open
software repository. In Proceedings of the
ACM Conference on Computer Supported
Cooperative Work, CSCW (pp. 1277–1286).
https://doi.org/10.1145/2145204.2145396
11. Dabbish, L., Stuart, C., Tsay, J., & Herbsleb, J. (2012).
“Social applications on the web let users track and follow the
activities of a large number of others regardless of location or
affiliation. There is a potential for this transparency to radically
improve collaboration and learning in complex knowledge-
based activities. Based on a series of in-depth interviews with
central and peripheral GitHub users, we examined the value
of transparency for large-scale distributed collaborations and
communities of practice. We find that people make a
surprisingly rich set of social inferences from the networked
activity information in GitHub, such as inferring someone
else's technical goals and vision when they edit code, or
guessing which of several similar projects has the best chance
of thriving in the long term. Users combine these inferences
into effective strategies for coordinating work, advancing
technical skills and managing their reputation.”
12. “GitHub” in title produces 193 results.
Five most cited (as of May 2018)
• Dabbish, L., Stuart, C., Tsay, J., & Herbsleb, J. (2012).
• Kalliamvakou, E., Singer, L., Gousios, G., German, D. M., Blincoe, K., &
Damian, D. (2014). The promises and perils of mining GitHub. In 11th
Working Conference on Mining Software Repositories, MSR 2014 -
Proceedings (pp. 92–101). https://doi.org/10.1145/2597073.2597074
• Tsay, J., Dabbish, L., & Herbsleb, J. (2014). Influence of social and
technical factors for evaluating contribution in GitHub. In Proceedings -
International Conference on Software Engineering (pp. 356–366).
https://doi.org/10.1145/2568225.2568315
• Vasilescu, B., Filkov, V., & Serebrenik, A. (2013). StackOverflow and
GitHub: Associations between software development and crowdsourced
knowledge. In Proceedings -
SocialCom/PASSAT/BigData/EconCom/BioMedCom 2013 (pp. 188–195).
https://doi.org/10.1109/SocialCom.2013.35
• Gousios, G., & Spinellis, D. (2012). GHTorrent: Github’s data from a
firehose. In IEEE International Working Conference on Mining Software
Repositories (pp. 12–21). https://doi.org/10.1109/MSR.2012.6224294
13. “GitHub” in title. The latest publications (2018).
• Liao, Z., Jin, H., Li, Y., Zhao, B., Wu, J., & Shengzong, L. (2018). DevRank: Mining
influential developers in Github. In 2017 IEEE Global Communications Conference,
GLOBECOM 2017 - Proceedings (Vol. 2018–Janua, pp. 1–6).
https://doi.org/10.1109/GLOCOM.2017.8255005
• Treude, C., Leite, L., & Aniche, M. (2018). Unusual events in GitHub repositories. Journal
of Systems and Software, 142, 237–247. https://doi.org/10.1016/j.jss.2018.04.063
• Liao, Z., Dayu, H., Chen, Z., Fan, X., Zhang, Y., & Liu, S. (2018). Exploring the
Characteristics of Issue-related Behaviors in GitHub Using Visualization Techniques.
IEEE Access. https://doi.org/10.1109/ACCESS.2018.2810295
• Hu, Y., Wang, S., Ren, Y., & Choo, K.-K. R. (2018). User influence analysis for Github
developer social networks. Expert Systems with Applications, 108, 108–118.
https://doi.org/10.1016/j.eswa.2018.05.002
• Sun, X., Xu, W., Xia, X., Chen, X., & Li, B. (2018). Personalized project recommendation
on GitHub. Science China Information Sciences, 61(5).
https://doi.org/10.1007/s11432-017-9419-x
• Luo, Q., Moran, K., Zhang, L., & Poshyvanyk, D. (2018). How Do Static and Dynamic Test
Case Prioritization Techniques Perform on Modern Software Systems? An Extensive
Study on GitHub Projects. IEEE Transactions on Software Engineering.
https://doi.org/10.1109/TSE.2018.2822270
• Singh, N., & Singh, P. (2018). How Do Code Refactoring Activities Impact Software
Developers’ Sentiments? - An Empirical Investigation into GitHub Commits. In
Proceedings - Asia-Pacific Software Engineering Conference, APSEC (Vol. 2017–Decem,
pp. 648–653). https://doi.org/10.1109/APSEC.2017.79
• Wu, Y., Kropczynski, J., Prates, R., & Carroll, J. M. (2018). Understanding how GitHub
supports curation repositories. Future Internet, 10(3). https://doi.org/10.3390/fi10030029
• Lu, Y., Mao, X., Li, Z., Zhang, Y., Wang, T., & Yin, G. (2018). Internal quality assurance for
external contributions in GitHub: An empirical investigation. Journal of Software: Evolution
and Process, 30(4). https://doi.org/10.1002/smr.1918