Presentation at the IEEE International Conference on Mining Software Repositories (MSR 2023) by Natarajan Chidambaram (Software Engineering Lab, University of Mons, Belgium) of a dataset of bot and human activities extracted from GitHub
Recognising bot activity in collaborative software development - Tom Mens
Presentation by Natarajan Chidambaram during the International ICSE Workshop on Bots in Software Engineering (BotSE 2023) in Australia. Joint work with Mehdi Golzadeh, Tom Mens and Alexandre Decan of the Software Engineering Lab of the University of Mons, and with Eleni Constantinou.
This presentation was given at the International Seminar on Advanced Techniques & Tools for Software Evolution (SATTOSE) 2023. Link to accepted paper - https://ceur-ws.org/Vol-3483/paper3.pdf
Jisc is a UK organization that aims to advance digital technology in education and research. Their Learning Analytics project has three core parts: a learning analytics service, toolkit, and community. The service provides dashboards and tools to analyze student data from various sources to identify at-risk students and enable interventions. It follows an open architecture approach. The toolkit includes guidance on best practices like privacy and consent. The community aspect involves events, blogs, and mailing lists to bring people together around learning analytics.
This document summarizes Peter Baeck's presentation on digital social innovation (DSI) at SI Live in Lisbon on November 13, 2014. The presentation defined DSI, discussed examples in four technological areas, and shared lessons learned from mapping over 900 European organizations involved in DSI. The key findings were that most DSI projects are driven by new types of social innovation organizations, there is a skills gap around digital technologies in the social sector, and most activity is currently small-scale but rapidly evolving.
This document provides an overview of the MOVING project, which aims to build an innovative training platform to improve information literacy. The platform will provide access to data sources, search and visualization methods, and individually tailored training programs to enable open leadership innovation. A consortium led by CERTH-ITI is developing the platform over 3 years with H2020 funding. Two use cases of the platform involve assisting public administrators with business research and helping researchers manage and mine research information. The platform will include modules for data acquisition, processing, video analysis, lecture linking, and demonstrations.
Visual Information Analysis for Crisis and Natural Disasters Management and R... - Yiannis Kompatsiaris
Invited talk at the Ninth International Conference on Image Processing Theory, Tools and Applications IPTA 2019 (http://www.ipta-conference.com/ipta19/)
Crises and natural disasters are unwelcome but unavoidable features of modern society, affecting more communities than ever. Visual information analysis plays an important role in efficient emergency management before the event (e.g. risk modeling), during the event (response) and after the event (recovery). This talk will describe the potential role of visual information sources, including satellite images, surveillance and traffic cameras, social multimedia and aerial video, in applications such as floods, fires, and oil spills. Multimodal and fusion techniques combining satellite and social data will be presented, along with how deep neural networks can be applied in this domain. The talk will include demos and results from the relevant BeAware and EOPEN projects and from our participation in the 2018 Multimedia Satellite Task of the MediaEval Benchmarking Initiative.
There’s a lot of hype right now about blockchain, the technology that underpins the Bitcoin virtual currency, with speculation that it could transform just about every aspect of our lives. In this webinar I’ll consider possible blockchain applications in research and education, and do a little myth-busting about when and where it makes sense to use blockchain.
Blockchain in research and education - UKSG Webinar - September 2017 - Martin Hamilton
There’s a lot of hype right now about blockchain, the technology that underpins the Bitcoin virtual currency, with speculation that it could transform just about every aspect of our lives. In this talk for UKSG I consider possible blockchain applications in research and education, and do a little myth-busting about when and where it makes sense to use blockchain.
The Citizen Cyberlab is an EU project that builds platforms, tools and pilot projects to support citizen science. It involves seven partners developing three platforms, four pilot projects and three tools to enable volunteers to participate in scientific research through tasks such as data collection and analysis. The goal is to study and foster creativity and learning in citizen science.
Teaching Machine Learning with Physical Computing - July 2023 - Hal Speed
This document provides an overview of resources for teaching machine learning and artificial intelligence concepts to K-12 students. It discusses machine learning concepts and workflows. It then lists and briefly describes various hardware platforms, software tools, curricula, and online resources that can be used to teach machine learning, including platforms for visual programming languages like Scratch and Blockly.
The role of individuals and communities in IoT - Paola Negrin
Presented on May 10, 2016 by Luca Mari at the International conference “IoTnow Everything but hype” (Milan Disruptive week) in the session “Overview of IoT key issues, opportunities and threats”.
Abstract:
Unlike the industrial automation epitomised by the Computer Integrated Manufacturing of the ’70s and ’80s, IoT is a human-centric technology, in which the widespread adoption of open source and hardware tools lowers the barriers to entry and blurs the roles, leading toward scenarios of extreme customisation made by prosumers operating in informal, dynamic communities. The talk offers some reflections on this perspective, drawing on the data and experiences obtained in an ongoing European research project on the “Digital Do It Yourself” phenomenon.
Presentation from Yannick Legré (EGI Managing Director) at ICTFOOTPRINT.eu Hands on Workshop Event “Green ICT – in practice” (20th March 2018 - Amsterdam)
Winter of Code 3.0 is a three-month open source event with over 2300 contributors working on 60+ open source projects across 20 mentor organizations. The event timeline includes a student application period from December 10th to January 15th, a proposal writing period from January 15th to 20th, a community bonding period from January 25th to March 10th where participants work on their projects, and a closing ceremony on March 25th where winners will be announced.
The document provides an agenda and highlights from a final project review meeting for the OPTIMIS project. It summarizes dissemination activities over the 3 years of the project, including 105 scientific publications, 158 conference presentations, and collaborations with 8 other EU projects. It discusses the release of version 3.0 of the OPTIMIS toolkit and analytics on the project and toolkit websites, which had over 5,000 visits and 19,000 page views. Finally, it outlines joint exploitation activities to promote the toolkit through the OpenNebula ecosystem and individual exploitation highlights from project partners, including a spin-off company, academic adoption, and potential commercial adoption.
Cyber Security Challenge Belgium - welcome to our Belgian IT security community - Sebastien Deleersnyder
A presentation to last-year students in Belgium to provide an overview of the Cybersecurity community in Belgium.
It also explains the value of volunteering and how to get involved personally.
The presentation is somewhat biased towards OWASP, as I started the chapter in Belgium, but also contains pointers to ISACA, L-SEC, ISC2, BruCON and many more...
If I missed an organisation or event - let me know!
Oscon 2016: open source lessons from the todo group - Ben VanEvery
The document summarizes lessons learned from open source programs at several major tech companies presented at an event by the TODO Group. The TODO Group is a collaboration of companies who share practices for running successful open source programs. Several companies including Netflix, Microsoft, Capital One, Box, Sandisk, Google and Yahoo discussed how they scale their open source programs, build communities, and realize strategic benefits from their involvement in open source.
Technology is changing our lives but what about our homes?
This short interactive cinematic experience asks just how much does your living room know about you?
https://www.fact.co.uk/projects/the-living-room-of-the-future.aspx
This document discusses the Sci-GaIA project's approach to conjugating open science and open education through e-research hackfests. The project has developed an open science platform containing an open access repository and courseware system. It has also implemented several hackfests using a model where participants work to integrate use cases using open technologies over several days. These hackfests have helped promote open science, train champions, and develop systems like Ethiopia's open access repository and MOOC platform. The hackfest model provides an innovative way to foster open science and education through problem-solving and hands-on learning.
Blockchain in Digital Vienna - Technology of an innovative administration - Stadt Wien
The document discusses the City of Vienna's exploration of blockchain technology through pilot projects. It outlines two initial pilots: one to securely timestamp and archive changes to open government data using blockchain notarization, and another to implement a digital food voucher system for city employees using blockchain. The city aims to learn from these pilots, establish Vienna as a blockchain hub, and find more opportunities to optimize administrative processes with the technology.
This document discusses a project called Digital Social Innovation that has three objectives: defining and understanding digital social innovation's potential, crowdmapping organizations working in the field, and developing policy recommendations to better support it. The project will map over 1,000 organizations across Europe involved in digital social innovation through open knowledge, open networks, open data and open hardware. It will analyze the network connections and identify strong and weak networks. The findings will feed into recommendations for the European Commission to better support this area. The project website is digitalsocial.eu, which aims to be a long-term resource for the digital social innovation community.
This document contains notes on visualization and digital humanities topics. It discusses visualization failures due to lack of communication, funding, and training. It also lists tools for data analysis, interactive maps and timelines, 3D modeling of heritage sites, and creating remixable games. Open data initiatives and using free government data for new applications are mentioned. Digital humanities centers are surveyed and questions about the field are posed. A variety of digital tools for humanities research are listed.
How the rise of DevOps and containers is transforming IT service delivery - Donnie Berkholz
One of the fastest-growing trends in technology is containers, enabled by a modern approach to software development and deployment called DevOps. This talk will delve into the increasingly mainstream trend of DevOps, the Docker and containers ecosystem including current enterprise adoption, and how they combine to form a new style of software architecture dubbed microservices. We'll close by looking at real-world case studies at leading companies.
Keynote talk targeted to PhD students, during the BENEVOL 2023 research seminar (focused on software evolution) in Nijmegen, 27 November 2023, by Tom Mens (full professor in software engineering at University of Mons, Belgium). The keynote aims to provide tips, tricks and practical advice on how to become successful as a PhD student.
This document discusses the rise of GitHub Actions (GHA) as a dominant continuous integration (CI) service based on a longitudinal study of 91,810 GitHub repositories. The study analyzed the evolution and usage of seven popular CI services over nine years, focusing on their co-usage and migration patterns. The study provides statistical evidence that GHA became the most used CI service within 18 months of its introduction, coinciding with a decrease in Travis usage likely due to policy changes and migrations to GHA. Interviews with software practitioners revealed competition between services and reasons for co-using or migrating between alternatives.
Similar to A Dataset of Bot and Human Activities in GitHub
Nurturing the Software Ecosystems of the Future - Tom Mens
In January 2018, four Software Engineering research groups located in different Belgian Universities launched a five year research project to nurture the software ecosystems of the future. We assembled a diverse team of about a dozen researchers and embarked on an exciting journey leading to a rich and diverse suite of papers, tools and datasets. Halfway into the project the corona pandemic intervened, but despite several months of lockdown, we succeeded in increasing inter-university collaboration. In this paper we share our achievements so that the BENEVOL community may benefit from our experience.
How to program a robot in 30 minutes? - Tom Mens
How can you learn to program a robot in 30 minutes? Workshop organised by Tom Mens (in collaboration with Pierre Zielinski, Gauvain Devillez and Sebastien Bonte) during the Journées Math-Sciences of the Printemps des Sciences 2022 at the University of Mons.
On the rise and fall of CI services in GitHub - Tom Mens
Presentation of SANER 2022 conference article "On the rise and fall of CI services in GitHub" by Mehdi Golzadeh (co-authored with Alexandre Decan and Tom Mens).
On backporting practices in package dependency networks - Tom Mens
Presentation at the FOSDEM 2022 Composition and Dependency Management DevRoom of empirical research on backporting practices in package dependency networks, published in the IEEE Transactions on Software Engineering in 2021 (https://doi.org/10.1109/TSE.2021.3112204)
Joint work by Alexandre Decan, Tom Mens, Ahmed Zerouali and Coen De Roover as part of the Belgian Excellence of Science research project SECOASSIST (https://secoassist.github.io)
Comparing semantic versioning practices in Cargo, npm, Packagist and Rubygems - Tom Mens
Presentation by Tom Mens at PackagingCon 2021 on Wednesday 10 November 2021.
Abstract: Semantic versioning (semver) is a commonly accepted open source practice, used by many package management systems to signal whether new package releases introduce possibly backward incompatible changes. Maintainers depending on such packages can use this practice to reduce the risk of breaking changes in their own packages by specifying version constraints on their dependencies. Depending on the amount of control a package maintainer wishes to exert over her package dependencies, these constraints can range from very permissive to very restrictive. We empirically compared the evolution of semver compliance in four package management systems: Cargo, npm, Packagist and Rubygems. We discuss to what extent ecosystem-specific characteristics influence the degree of semver compliance, and we suggest developing tools that adopt the wisdom of the crowds to help package maintainers decide which type of version constraints they should impose on their dependencies.
We also studied to what extent the packages distributed by these package managers are still using a 0.y.z release, suggesting less stable and immature packages. We explore the effect of such "major zero" packages on semantic versioning adoption.
Our findings shed light on some important differences between package managers with respect to package versioning policies.
Our empirical results have been published in two peer-reviewed academic journals: the IEEE Transactions on Software Engineering (https://doi.org/10.1109/TSE.2019.2918315) and Elsevier Science of Computer Programming (https://doi.org/10.1016/j.scico.2021.102656).
Acknowledgments: Research conducted in the context of the SECOASSIST "Excellence of Science" Research Project.
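To make the contrast between permissive and restrictive version constraints in the abstract above concrete, here is a minimal Python sketch. It assumes plain three-part MAJOR.MINOR.PATCH versions and the caret and tilde semantics as documented for npm; other package managers may interpret these operators differently, and the function names are hypothetical, not taken from the presented research.

```python
from typing import Tuple

Version = Tuple[int, int, int]


def parse(version: str) -> Version:
    """Parse a plain 'MAJOR.MINOR.PATCH' string (no pre-release tags)."""
    major, minor, patch = (int(part) for part in version.split("."))
    return (major, minor, patch)


def caret_upper_bound(base: Version) -> Version:
    """Exclusive upper bound of ^base: keep the left-most non-zero part fixed."""
    major, minor, patch = base
    if major > 0:
        return (major + 1, 0, 0)
    if minor > 0:
        return (0, minor + 1, 0)
    return (0, 0, patch + 1)


def tilde_upper_bound(base: Version) -> Version:
    """Exclusive upper bound of ~base: only patch-level updates are allowed."""
    major, minor, _ = base
    return (major, minor + 1, 0)


def satisfies(version: str, constraint: str) -> bool:
    """Check a version against a caret (^), tilde (~) or exact constraint."""
    v = parse(version)
    if constraint.startswith("^"):
        base = parse(constraint[1:])
        return base <= v < caret_upper_bound(base)
    if constraint.startswith("~"):
        base = parse(constraint[1:])
        return base <= v < tilde_upper_bound(base)
    return v == parse(constraint)  # strict pinning: the most restrictive case


if __name__ == "__main__":
    for constraint in ("^1.2.3", "~1.2.3", "1.2.3"):
        print(constraint, satisfies("1.9.0", constraint))
```

Under these assumptions the caret constraint is the most permissive of the three, the tilde constraint accepts only patch updates, and the exact pin accepts a single release. Note that for 0.y.z versions the caret range narrows to a single minor series.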
Presentation by Tom Mens at FOSDEM21 (Free Open Source Developers Meeting, February 2021). Published in Science of Computer Programming, August 2021.
https://doi.org/10.1016/j.scico.2021.102656
Abstract: When developing open source software end-user applications or reusable software packages, developers depend on software packages distributed through package managers such as npm, Packagist, Cargo, RubyGems. In addition to this, empirical evidence has shown that these package managers adhere to a large extent to semantic versioning principles. Packages that are still in major version zero are considered unstable according to semantic versioning, as some developers consider such packages as immature, still being under initial development.
This presentation reports on large-scale empirical evidence on the use of dependencies towards 0.y.z versions in four different software package distributions: Cargo, npm, Packagist and RubyGems. We study to what extent packages get stuck in the zero version space, never crossing the psychological barrier of major version zero. We compare the effect of the policies and practices of package managers on this phenomenon. We do not reveal the results of our findings in this abstract yet, as it would spoil the fun of the presentation.
Evaluating a bot detection model on git commit messages - Tom Mens
Detecting the presence of bots in distributed software development activity is very important in order to prevent bias in socio-technical empirical studies. In previous work, we proposed a classification model to detect bots in GitHub repositories based on the pull request and issue comments of GitHub accounts. The current study generalises the approach to git contributors based on their commit messages. We train and evaluate the classification model on a large dataset of 6,922 git contributors. The original model based on pull request and issue comments obtained a precision of 0.77 on this dataset, whereas retraining the classification model on git commit messages increased the precision to 0.80. As a proof-of-concept, we implemented this model in BoDeGiC, an open source command-line tool to detect bots in git repositories.
Is my software ecosystem healthy? It depends! - Tom Mens
QUATIC 2020 keynote presentation by Tom Mens (University of Mons) on dependency-related health issues in software ecosystems and research advances to address such health issues. Part of the presented research has been conducted as part of the Belgian SECO-ASSIST Excellence of Science Research Project.
Bot or not? Detecting bots in GitHub pull request activity based on comment s... - Tom Mens
Presentation by Mehdi Golzadeh (Software Engineering Lab, University of Mons) of an article published at the 2nd International ICSE Workshop on Bots In Software Engineering (BotSE). See https://doi.org/10.1145/3387940.3391503
Abstract: Many empirical studies focus on socio-technical activity in social coding platforms such as GitHub, for example to study the onboarding, abandonment, productivity and collaboration among team members. Such studies face the difficulty that GitHub activity can also be generated automatically by bots of a different nature. It therefore becomes imperative to distinguish such bots from human users. We propose an automated approach to detect bots in GitHub pull request activity. Relying on the assumption that bots contain repetitive message patterns in their pull request comments, we analyse the similarity between multiple messages from the same GitHub identity, using a clustering method that combines the Jaccard and Levenshtein distance. We empirically evaluate our approach by analysing 20,090 comments of 250 users and 42 bots in 1,262 GitHub repositories. Our results show that the method is able to clearly separate bots from human users.
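As a rough illustration of the comment-similarity idea in the abstract above, the sketch below computes a token-level Jaccard distance and a character-level Levenshtein distance between two comments and averages them. The function names, the whitespace tokenisation and the equal-weight combination are assumptions made for illustration only; the combination and clustering method used in the paper may differ.

```python
def jaccard_distance(a: str, b: str) -> float:
    """1 - |A ∩ B| / |A ∪ B| over the sets of lowercase whitespace tokens."""
    set_a, set_b = set(a.lower().split()), set(b.lower().split())
    if not set_a and not set_b:
        return 0.0
    return 1.0 - len(set_a & set_b) / len(set_a | set_b)


def levenshtein_distance(a: str, b: str) -> int:
    """Dynamic-programming edit distance (insert, delete, substitute)."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def combined_distance(a: str, b: str) -> float:
    """Hypothetical combination: mean of Jaccard and length-normalised Levenshtein."""
    longest = max(len(a), len(b))
    normalised_lev = levenshtein_distance(a, b) / longest if longest else 0.0
    return (jaccard_distance(a, b) + normalised_lev) / 2
```

Under this sketch, comments that follow a repetitive template yield small pairwise distances, which is the property the approach relies on to separate bots from humans.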
How magic is zero? An Empirical Analysis of Initial Development Releases in S... (Tom Mens)
1. 0.y.z packages are highly prevalent, contributing to 90% of packages in some distributions even though documentation states they are for initial development.
2. It generally takes a few months for packages to reach ≥1.0.0 but 20% take over a year, suggesting packages get stuck in 0.y.z.
3. 0.y.z packages are updated slightly more frequently but the difference is negligible, and there is little practical difference in how 0.y.z and ≥1.0.0 packages are used.
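To make the 0.y.z notion in the findings above concrete, here is a minimal sketch that checks whether a version string lies in the zero version space and computes the share of such versions in a list. The regular expression, the function names and the sample versions are illustrative assumptions, not part of the study's tooling.

```python
import re
from typing import Iterable

# Leading "major.minor.patch" part of a version, with an optional "v" prefix.
SEMVER_PREFIX = re.compile(r"^v?(\d+)\.(\d+)\.(\d+)")


def is_initial_development(version: str) -> bool:
    """Return True when the major version component is 0 (a 0.y.z version)."""
    match = SEMVER_PREFIX.match(version.strip())
    if match is None:
        raise ValueError(f"not a major.minor.patch version: {version!r}")
    return int(match.group(1)) == 0


def share_of_zero_versions(versions: Iterable[str]) -> float:
    """Fraction of the given versions that are 0.y.z (0.0 for an empty input)."""
    versions = list(versions)
    if not versions:
        return 0.0
    return sum(is_initial_development(v) for v in versions) / len(versions)
```

For example, `share_of_zero_versions(["0.1.0", "0.2.3", "1.0.0", "2.1.0"])` evaluates to 0.5 on this made-up list.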
Comparing dependency issues across software package distributions (FOSDEM 2020) (Tom Mens)
This talk reports on our findings based on multiple empirical studies that we have conducted to understand different aspects of dependency management and their practical implications. This includes:
* the outdatedness of package dependencies, the transitive impact of such "technical lag", and its relation to the presence of bugs and security vulnerabilities.
* the impact of using either more permissive or more restrictive version constraints on dependencies.
* the virtues and limitations of being compliant to semantic versioning, a common policy to inform dependents whether new releases of software packages introduce possibly backward incompatible changes.
* the impact of specific characteristics, policies and tools used by the packaging ecosystem and its supporting community on all of the above.
The content of the talk is primarily based on the following peer-reviewed scientific articles:
* What do package dependencies tell us about semantic versioning? Alexandre Decan, Tom Mens. IEEE Transactions on Software Engineering, 2019. https://doi.org/10.1109/TSE.2019.2918315
* An empirical comparison of dependency network evolution in seven software packaging ecosystems. Alexandre Decan, Tom Mens, Philippe Grosjean. Empirical Software Engineering 24(1):381-416, 2019. https://doi.org/10.1007/s10664-017-9589-y
* A formal framework for measuring technical lag in component repositories and its application to npm. Ahmed Zerouali, Tom Mens, Jesus Gonzalez‐Barahona, Alexandre Decan, Eleni Constantinou, Gregorio Robles. Journal of Software: Evolution and Process 31(8), 2019. https://doi.org/10.1002/smr.2157
* On the Impact of Security Vulnerabilities in the npm Package Dependency Network. Alexandre Decan, Tom Mens, Eleni Constantinou. International Conference on Mining Software Repositories, 2018. https://doi.org/10.1145/3196398.3196401
* On the Evolution of Technical Lag in the npm Package Dependency Network. Alexandre Decan, Tom Mens, Eleni Constantinou. International Conference on Software Maintenance and Evolution, 2018. https://doi.org/10.1109/ICSME.2018.00050
Measuring Technical Lag in Software Deployments (CHAOSScon 2020) (Tom Mens)
Presentation at CHAOSSCon Europe 2020 about the generic technical lag software measurement framework. Technical lag measures the increasing difference between deployed software components and the ideal upstream software components.
For more information, see https://doi.org/10.1002/smr.2157
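One simple way to read "difference between deployed and ideal upstream components" is as a time difference. The sketch below measures, in days, how far a deployed release lags behind the most recent release. Treating the newest release as the ideal, as well as the function name and the release dates used in the usage note, are assumptions made for illustration; the framework itself is more general.

```python
from datetime import date
from typing import Mapping


def time_lag_days(releases: Mapping[str, date], deployed: str) -> int:
    """Days between the deployed release and the most recent release.

    Assumption: the "ideal" version is simply the latest release by date.
    """
    ideal_date = max(releases.values())
    return (ideal_date - releases[deployed]).days
```

With hypothetical releases 1.0.0 (2019-01-01), 1.1.0 (2019-03-01) and 2.0.0 (2019-06-01), a deployment of 1.1.0 has a time lag of 92 days, and a deployment of 2.0.0 has a lag of 0.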
This presentation reports on the research results achieved in the context of the interuniversity interdisciplinary research project SECOHealth "Vers une méthodologie et analyse socio-technique interdisciplinaire de la santé des écosystèmes logiciels" co-financed by FRS-FNRS Belgium and FRQ (FRSC - FRNT, Québec) with principal investigators Tom Mens (UMONS), Bram Adams (Polytechnique Montréal) and Josianne Marsan (Université Laval).
Introduction to the research seminar on empirical analysis of open source software ecosystems, organised by the SECO-ASSIST "excellence of science" research project, on September 4th, 2019 at the University of Mons, Belgium. With invited presentations by Alexander Serebrenik, Jesus Gonzalez-Barahona, Dario Di Nucci and Henrique Nucci. The seminar concludes with the public PhD defense of Ahmed Zerouali (supervised by Tom Mens) on the topic of "A Measurement Framework for Analyzing Technical Lag in Open-Source Software Ecosystems"
Empirically Analysing the Socio-Technical Health of Software Package Managers (Tom Mens)
Invited presentation at Concordia University (Montreal, Canada) by Eleni Constantinou and Tom Mens on recent research about the socio-technical health issues in software package management ecosystems.
Abstract: The large majority of today’s software is relying on open source software components. Such components are typically distributed through package managers for a wide variety of programming languages, and developed and maintained through online distributed software development services like GitHub. Software component repositories are perceived as software ecosystems that constitute complex and evolving socio-technical software dependency networks. Because of their complexity and evolution, these ecosystems tend to suffer from a wide variety of software health issues that can be either technical or social in nature. Examples of such issues include the ecosystem fragility due to exponential growth and transitive dependencies; the abundance of outdated, unmaintained or obsolete software components; the prolonged presence of unfixed bugs and security vulnerabilities; the abandonment or high turnover of key contributors, suboptimal collaboration between contributors, and many more. This presentation will report on our past and ongoing empirical research that studies such health factors within and across different software packaging ecosystems (such as npm, RubyGems, Cargo, CRAN, CPAN). We provide empirical evidence of some of the health problems, compare their presence across different ecosystems, and suggest ways to reduce their potential impact by providing concrete guidelines and tools. The presented research is being conducted by researchers of the Software Engineering Lab at the University of Mons in the context of two ongoing projects SECOHealth and SECO-ASSIST, aiming to analyse and improve the health of software ecosystems.
ConPan: Analysing Packages Installed in Docker Containers (Tom Mens)
ConPan is a tool that analyzes software packages installed in Docker containers to identify outdated and vulnerable packages. It combines information about outdatedness and known security vulnerabilities. ConPan works by scanning Docker images and comparing package information to vulnerability databases. The goal is to help identify security risks from outdated and vulnerable packages in container images to improve container security.
On the Relation between Outdated Docker Containers, Severity Vulnerabilities,... (Tom Mens)
Presentation by Tom Mens of the SANER 2019 paper that received the best paper award. The topic concerns Docker containers, and in particular the relation between outdated packages, technical lag, security vulnerabilities and bugs.
Unlocking the mysteries of reproduction: Exploring fecundity and gonadosomati... (AbdullaAlAsif1)
The pygmy halfbeak, Dermogenys colletei, is known for its viviparous nature and presents an intriguing case of relatively low fecundity, raising questions about potential compensatory reproductive strategies employed by this species. Our study delves into the examination of fecundity and the Gonadosomatic Index (GSI) in the pygmy halfbeak, D. colletei (Meisner, 2001), an intriguing viviparous fish indigenous to Sarawak, Borneo. We hypothesize that the pygmy halfbeak may exhibit unique reproductive adaptations to offset its low fecundity, thus enhancing its survival and fitness. To address this, we conducted a comprehensive study utilizing 28 mature female specimens of D. colletei, carefully measuring fecundity and GSI to shed light on the reproductive adaptations of this species. Our findings reveal that D. colletei indeed exhibits low fecundity, with a mean of 16.76 ± 2.01, and a mean GSI of 12.83 ± 1.27, providing crucial insights into the reproductive mechanisms at play in this species. These results underscore the existence of unique reproductive strategies in D. colletei, enabling its adaptation and persistence in Borneo's diverse aquatic ecosystems, and call for further ecological research to elucidate these mechanisms. This study leads to a better understanding of viviparous fish in Borneo and contributes to the broader field of aquatic ecology, enhancing our knowledge of species adaptations to unique ecological challenges.
Thematic appreciation test is a psychological assessment tool used to measure an individual's appreciation and understanding of specific themes or topics. This test helps to evaluate an individual's ability to connect different ideas and concepts within a given theme, as well as their overall comprehension and interpretation skills. The results of the test can provide valuable insights into an individual's cognitive abilities, creativity, and critical thinking skills.
ESR spectroscopy in liquid food and beverages.pptx (PRIYANKA PATEL)
With an increasing population, people need to rely on packaged foodstuffs. Packaging of food materials requires the preservation of food. There are various methods for treating food to preserve it, and irradiation is one of them. It is the most common and the most harmless method for food preservation as it does not alter the necessary micronutrients of food materials. Although irradiated food does not cause any harm to human health, quality assessment of the food is still required to provide consumers with necessary information about it. ESR spectroscopy is the most sophisticated way to investigate the quality of the food and the free radicals induced during its processing. The ESR spin trapping technique is useful for the detection of highly unstable radicals in the food. The antioxidant capability of liquid food and beverages is mainly assessed by the spin trapping technique.
When I was asked to give a companion lecture in support of ‘The Philosophy of Science’ (https://shorturl.at/4pUXz) I decided not to walk through the detail of the many methodologies in order of use. Instead, I chose to employ a long standing, and ongoing, scientific development as an exemplar. And so, I chose the ever evolving story of Thermodynamics as a scientific investigation at its best.
Conducted over a period of >200 years, Thermodynamics R&D, and application, benefitted from the highest levels of professionalism, collaboration, and technical thoroughness. New layers of application, methodology, and practice were made possible by the progressive advance of technology. In turn, this has seen measurement and modelling accuracy continually improved at a micro and macro level.
Perhaps most importantly, Thermodynamics rapidly became a primary tool in the advance of applied science/engineering/technology, spanning micro-tech, to aerospace and cosmology. I can think of no better a story to illustrate the breadth of scientific methodologies and applications at their best.
This presentation explores a brief idea about the structural and functional attributes of nucleotides, the structure and function of genetic materials along with the impact of UV rays and pH upon them.
Comparing Evolved Extractive Text Summary Scores of Bidirectional Encoder Rep... (University of Maribor)
Slides from:
11th International Conference on Electrical, Electronics and Computer Engineering (IcETRAN), Niš, 3-6 June 2024
Track: Artificial Intelligence
https://www.etran.rs/2024/en/home-english/
The debris of the ‘last major merger’ is dynamically young (Sérgio Sacani)
The Milky Way’s (MW) inner stellar halo contains an [Fe/H]-rich component with highly eccentric orbits, often referred to as the
‘last major merger.’ Hypotheses for the origin of this component include Gaia-Sausage/Enceladus (GSE), where the progenitor
collided with the MW proto-disc 8–11 Gyr ago, and the Virgo Radial Merger (VRM), where the progenitor collided with the
MW disc within the last 3 Gyr. These two scenarios make different predictions about observable structure in local phase space,
because the morphology of debris depends on how long it has had to phase mix. The recently identified phase-space folds in Gaia
DR3 have positive caustic velocities, making them fundamentally different than the phase-mixed chevrons found in simulations
at late times. Roughly 20 per cent of the stars in the prograde local stellar halo are associated with the observed caustics. Based
on a simple phase-mixing model, the observed number of caustics are consistent with a merger that occurred 1–2 Gyr ago.
We also compare the observed phase-space distribution to FIRE-2 Latte simulations of GSE-like mergers, using a quantitative
measurement of phase mixing (2D causticality). The observed local phase-space distribution best matches the simulated data
1–2 Gyr after collision, and certainly not later than 3 Gyr. This is further evidence that the progenitor of the ‘last major merger’
did not collide with the MW proto-disc at early times, as is thought for the GSE, but instead collided with the MW disc within
the last few Gyr, consistent with the body of work surrounding the VRM.
BREEDING METHODS FOR DISEASE RESISTANCE.pptx (RASHMI M G)
Plant breeding for disease resistance is a strategy to reduce crop losses caused by disease. Plants have an innate immune system that allows them to recognize pathogens and provide resistance. However, breeding for long-lasting resistance often involves combining multiple resistance genes.
The use of Nauplii and metanauplii artemia in aquaculture (brine shrimp).pptx (MAGOTI ERNEST)
Although Artemia has been known to man for centuries, its use as a food for the culture of larval organisms apparently began only in the 1930s, when several investigators found that it made an excellent food for newly hatched fish larvae (Litvinenko et al., 2023). As aquaculture developed in the 1960s and ‘70s, the use of Artemia also became more widespread, due both to its convenience and to its nutritional value for larval organisms (Arenas-Pardo et al., 2024). The fact that Artemia dormant cysts can be stored for long periods in cans, and then used as an off-the-shelf food requiring only 24 h of incubation makes them the most convenient, least labor-intensive, live food available for aquaculture (Sorgeloos & Roubach, 2021). The nutritional value of Artemia, especially for marine organisms, is not constant, but varies both geographically and temporally. During the last decade, however, both the causes of Artemia nutritional variability and methods to improve poor-quality Artemia have been identified (Loufi et al., 2024).
Brine shrimp (Artemia spp.) are used in marine aquaculture worldwide. Annually, more than 2,000 metric tons of dry cysts are used for cultivation of fish, crustacean, and shellfish larva. Brine shrimp are important to aquaculture because newly hatched brine shrimp nauplii (larvae) provide a food source for many fish fry (Mozanzadeh et al., 2021). Culture and harvesting of brine shrimp eggs represents another aspect of the aquaculture industry. Nauplii and metanauplii of Artemia, commonly known as brine shrimp, play a crucial role in aquaculture due to their nutritional value and suitability as live feed for many aquatic species, particularly in larval stages (Sorgeloos & Roubach, 2021).
Nucleophilic Addition of carbonyl compounds.pptx (SSR02)
Nucleophilic addition is the most important reaction of carbonyls. Not just aldehydes and ketones, but also carboxylic acid derivatives in general.
Carbonyls undergo addition reactions with a large range of nucleophiles.
Comparing the relative basicity of the nucleophile and the product is extremely helpful in determining how reversible the addition reaction is. Reactions with Grignards and hydrides are irreversible. Reactions with weak bases like halides and carboxylates generally don’t happen.
Electronic effects (inductive effects, electron donation) have a large impact on reactivity.
Large groups adjacent to the carbonyl will slow the rate of reaction.
Neutral nucleophiles can also add to carbonyls, although their additions are generally slower and more reversible. Acid catalysis is sometimes employed to increase the rate of addition.
Deep Behavioral Phenotyping in Systems Neuroscience for Functional Atlasing a... (Ana Luísa Pinho)
Functional Magnetic Resonance Imaging (fMRI) provides means to characterize brain activations in response to behavior. However, cognitive neuroscience has been limited to group-level effects referring to the performance of specific tasks. To obtain the functional profile of elementary cognitive mechanisms, the combination of brain responses to many tasks is required. Yet, to date, both structural atlases and parcellation-based activations do not fully account for cognitive function and still present several limitations. Further, they do not adapt overall to individual characteristics. In this talk, I will give an account of deep-behavioral phenotyping strategies, namely data-driven methods in large task-fMRI datasets, to optimize functional brain-data collection and improve inference of effects-of-interest related to mental processes. Key to this approach is the employment of fast multi-functional paradigms rich on features that can be well parametrized and, consequently, facilitate the creation of psycho-physiological constructs to be modelled with imaging data. Particular emphasis will be given to music stimuli when studying high-order cognitive mechanisms, due to their ecological nature and quality to enable complex behavior compounded by discrete entities. I will also discuss how deep-behavioral phenotyping and individualized models applied to neuroimaging data can better account for the subject-specific organization of domain-general cognitive systems in the human brain. Finally, the accumulation of functional brain signatures brings the possibility to clarify relationships among tasks and create a univocal link between brain systems and mental functions through: (1) the development of ontologies proposing an organization of cognitive processes; and (2) brain-network taxonomies describing functional specialization. 
To this end, tools to improve commensurability in cognitive science are necessary, such as public repositories, ontology-based platforms and automated meta-analysis tools. I will thus discuss some brain-atlasing resources currently under development, and their applicability in cognitive as well as clinical neuroscience.
Phenomics assisted breeding in crop improvement (IshaGoswami9)
The population is increasing and will reach about 9 billion by 2050. Together with climate change, this makes it difficult to meet the food requirements of such a large population. Facing the challenges presented by resource shortages, climate
change, and increasing global population, crop yield and quality need to be improved in a sustainable way over the coming decades. Genetic improvement by breeding is the best way to increase crop productivity. With the rapid progression of functional
genomics, an increasing number of crop genomes have been sequenced and dozens of genes influencing key agronomic traits have been identified. However, current genome sequence information has not been adequately exploited for understanding
the complex characteristics of multiple genes, owing to a lack of crop phenotypic data. Efficient, automatic, and accurate technologies and platforms that can capture phenotypic data that can
be linked to genomics information for crop improvement at all growth stages have become as important as genotyping. Thus,
high-throughput phenotyping has become the major bottleneck restricting crop breeding. Plant phenomics has been defined as the high-throughput, accurate acquisition and analysis of multi-dimensional phenotypes
during crop growing stages at the organism level, including the cell, tissue, organ, individual plant, plot, and field levels. With the rapid development of novel sensors, imaging technology,
and analysis methods, numerous infrastructure platforms have been developed for phenotyping.
EWOCS-I: The catalog of X-ray sources in Westerlund 1 from the Extended Weste... (Sérgio Sacani)
Context. With a mass exceeding several $10^4\,M_\odot$ and a rich and dense population of massive stars, supermassive young star clusters
represent the most massive star-forming environment that is dominated by the feedback from massive stars and gravitational interactions
among stars.
Aims. In this paper we present the Extended Westerlund 1 and 2 Open Clusters Survey (EWOCS) project, which aims to investigate
the influence of the starburst environment on the formation of stars and planets, and on the evolution of both low and high mass stars.
The primary targets of this project are Westerlund 1 and 2, the closest supermassive star clusters to the Sun.
Methods. The project is based primarily on recent observations conducted with the Chandra and JWST observatories. Specifically,
the Chandra survey of Westerlund 1 consists of 36 new ACIS-I observations, nearly co-pointed, for a total exposure time of 1 Msec.
Additionally, we included 8 archival Chandra/ACIS-S observations. This paper presents the resulting catalog of X-ray sources within
and around Westerlund 1. Sources were detected by combining various existing methods, and photon extraction and source validation
were carried out using the ACIS-Extract software.
Results. The EWOCS X-ray catalog comprises 5963 validated sources out of the 9420 initially provided to ACIS-Extract, reaching a
photon flux threshold of approximately $2 \times 10^{-8}$ photons cm$^{-2}$ s$^{-1}$. The X-ray sources exhibit a highly concentrated spatial distribution,
with 1075 sources located within the central 1 arcmin. We have successfully detected X-ray emissions from 126 out of the 166 known
massive stars of the cluster, and we have collected over 71 000 photons from the magnetar CXO J164710.20-455217.
1. A Dataset of Bot and Human Activities in GitHub
Natarajan Chidambaram, Alexandre Decan, Tom Mens
Software Engineering Lab, University of Mons, Belgium
Supported by Service public de Wallonie – Recherche under grant n°2010235 “ARIAC BY DIGITALWALLONIA4.AI” and
Fonds de la Recherche Scientifique – FNRS under grant numbers F.4515.23, O.0157.18F-RG43 and T.0149.22
SECO-ASSIST
https://zenodo.org/record/7740521
3. GitHub Events API
can retrieve the latest 300 events
in the last 90 days
[Figure: mapping of GitHub event types to activities. IssuesEvent (opened, closed, reopened) corresponds to Opening issue, Closing issue and Reopening issue; an IssueCommentEvent (created) appears alongside the closed and reopened cases; CreateEvent (repository, branch, tag) corresponds to Creating repository, Creating branch and Creating tag.]
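The event-to-activity mapping on this slide can be sketched as a small lookup. The sketch assumes the event dictionaries follow the GitHub Events API shape (`type`, `payload.action`, `payload.ref_type`); the function and constant names are hypothetical, and the sketch deliberately ignores cases where several events together form one activity, such as a comment accompanying the closing of an issue.

```python
from typing import Optional

# Activity names as they appear on the slide.
ISSUE_ACTIONS = {
    "opened": "Opening issue",
    "closed": "Closing issue",
    "reopened": "Reopening issue",
}
CREATE_REF_TYPES = {
    "repository": "Creating repository",
    "branch": "Creating branch",
    "tag": "Creating tag",
}


def event_to_activity(event: dict) -> Optional[str]:
    """Map a single GitHub event to an activity name, or None if not covered here."""
    event_type = event.get("type")
    payload = event.get("payload", {})
    if event_type == "IssuesEvent":
        return ISSUE_ACTIONS.get(payload.get("action"))
    if event_type == "CreateEvent":
        return CREATE_REF_TYPES.get(payload.get("ref_type"))
    return None
```

For example, an event with type `CreateEvent` and `ref_type` equal to `tag` maps to "Creating tag", while event types not listed in the lookup return `None`.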
4.               # contributors   # activities
Bot dataset      385              649,755
Human dataset    616              184,056
total            1,001            833,811
• 834K activities obtained from 1M+ events
• 24 activity types
• 1K contributors
• 105 days (25 Nov 2022 to 9 Mar 2023)
{
  "date": "2022-11-26T14:13:19+00:00",
  "activity": "Commenting issue",
  "contributor": "kubevirt-bot",
  "repository": "kubevirt/kubevirt",
  "comment": {
    "length": 255,
    "GH_node": "IC_kwDOBJIk985PKH4s"
  },
  "issue": {
    "id": 8294,
    "title": "SRIOV VF interface not found in VM",
    "created_at": "2022-08-13T11:10:06+00:00",
    "status": "open",
    "closed_at": null,
    "resolved": false,
    "GH_node": "I_kwDOBJIk985Pvz5k"
  },
  "conversation": {
    "comments": 9
  }
}
JSON format
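A record in this format can be read with Python's standard `json` module. The sketch below uses a trimmed copy of the example record, written with valid separators between fields, and parses the ISO 8601 timestamp; the variable names are illustrative assumptions.

```python
import json
from datetime import datetime

# Trimmed copy of the example record (only some fields kept).
record_text = """
{
  "date": "2022-11-26T14:13:19+00:00",
  "activity": "Commenting issue",
  "contributor": "kubevirt-bot",
  "repository": "kubevirt/kubevirt",
  "issue": {"id": 8294, "status": "open", "closed_at": null, "resolved": false},
  "conversation": {"comments": 9}
}
"""

record = json.loads(record_text)

# Timestamps carry an explicit UTC offset, so fromisoformat returns an aware datetime.
timestamp = datetime.fromisoformat(record["date"])

# A compact (contributor, activity, year) view of the record.
summary = (record["contributor"], record["activity"], timestamp.year)
```

JSON `null` and `false` become Python `None` and `False`, so an open issue has `record["issue"]["closed_at"] is None`.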
5. Usefulness of the Dataset
• Analyse most frequent activities
• Find frequent patterns in activities
• Find behavioural differences between bots and humans
• Forecast future contributor activities
• Detect which tasks are performed by which bots
• Classify contributors based on activities
• Develop a new bot detection technique
6. Some Distinguishing Features
Dispersion of activity types across repositories
Time to shift between repositories
Number of activity types
Variation in activity frequency
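Features of this kind could be derived from a contributor's activity records along the following lines. This is only a sketch: the exact feature definitions are not given on the slide, so the three measures below (distinct activity types, distinct repositories, and the population standard deviation of per-type counts as a stand-in for variation in activity frequency) and all names are assumptions.

```python
from collections import Counter
from statistics import pstdev
from typing import Dict, List


def activity_features(activities: List[dict]) -> Dict[str, float]:
    """Compute simple per-contributor features from records with
    "activity" and "repository" keys (as in the dataset's JSON format)."""
    type_counts = Counter(a["activity"] for a in activities)
    repositories = {a["repository"] for a in activities}
    counts = list(type_counts.values())
    return {
        "n_activity_types": len(type_counts),
        "n_repositories": len(repositories),
        # Assumed proxy for "variation in activity frequency".
        "frequency_stdev": pstdev(counts) if counts else 0.0,
    }
```

On a made-up sample with two "Commenting issue" records in one repository and one "Opening issue" record in another, this yields 2 activity types, 2 repositories and a standard deviation of 0.5.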
7. Dataset Construction Process
[Figure: flowchart in three stages.]
Curating contributors: combine all contributors from Golzadeh et al., Abdellatif et al., Wang et al. and Chidambaram et al.; remove no longer existing contributors; the result is the set of contributors.
Querying events (every 6 hours): get contributor events from the GitHub event stream; missed event? Yes: drop contributor events; no: keep contributor events.
Generating activities: identify contributor activities; human activity? Yes: anonymise activities, giving human activities; no: bot activities.