Build systems orchestrate how human-readable source code is translated into executable programs. In a software project, source code changes can induce changes in the build system (i.e., build co-changes). Due to the complexity of build systems, it is difficult for developers to identify when build co-changes are necessary. Predicting build co-changes works well when there is sufficient training data to build a model. In practice, however, new projects have only a limited number of changes. Using training data from other projects to predict build co-changes in a new project can improve prediction performance. We refer to this problem as cross-project build co-change prediction.
In this paper, we propose CroBuild, a novel cross-project build co-change prediction approach that iteratively learns new classifiers. CroBuild constructs an ensemble of classifiers by iteratively building classifiers and assigning each a weight according to its prediction error rate. Given that only a small proportion of code changes are build co-changing, we also propose an imbalance-aware approach that, in each iteration, learns a threshold boundary between code changes that are build co-changing and those that are not. To examine the benefits of CroBuild, we perform experiments on 4 large datasets (Mozilla, Eclipse-core, Lucene, and Jazz) comprising a total of 50,884 changes. Across the 4 datasets, CroBuild achieves an average F1-score of 0.408. We also compare CroBuild with other approaches: a basic model, AdaBoost proposed by Freund et al., and TrAdaBoost proposed by Dai et al. On average across the 4 datasets, CroBuild yields an improvement in F1-score of 41.54%, 36.63%, and 36.97% over the basic model, AdaBoost, and TrAdaBoost, respectively.
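The weighting scheme described above follows the familiar boosting recipe: each round's classifier receives a vote based on its weighted error rate, and misclassified changes gain weight before the next round. A minimal sketch of that recipe over one-feature threshold stumps (this is not the authors' implementation; CroBuild's imbalance-aware threshold learning and cross-project transfer are omitted, and all names are illustrative):

```python
import math

def train_ensemble(X, y, rounds=5):
    """Train a weighted ensemble of one-feature threshold stumps.

    Each round fits the stump that minimises weighted error, then
    receives a vote weight derived from its error rate, echoing the
    AdaBoost-style weighting that CroBuild builds on."""
    n = len(X)
    w = [1.0 / n] * n            # instance weights
    ensemble = []                # (feature, threshold, vote weight)
    for _ in range(rounds):
        best = None
        for f in range(len(X[0])):
            for t in sorted({row[f] for row in X}):
                preds = [1 if row[f] >= t else 0 for row in X]
                err = sum(wi for wi, p, yi in zip(w, preds, y) if p != yi)
                if best is None or err < best[0]:
                    best = (err, f, t, preds)
        err, f, t, preds = best
        err = min(max(err, 1e-6), 1 - 1e-6)          # avoid log(0)
        alpha = 0.5 * math.log((1 - err) / err)      # weight from error rate
        ensemble.append((f, t, alpha))
        # re-weight instances: misclassified changes gain weight
        w = [wi * math.exp(alpha if p != yi else -alpha)
             for wi, p, yi in zip(w, preds, y)]
        s = sum(w)
        w = [wi / s for wi in w]
    return ensemble

def predict(ensemble, row):
    """Weighted vote of all stumps; positive score means build co-change."""
    score = sum(a if row[f] >= t else -a for f, t, a in ensemble)
    return 1 if score >= 0 else 0
```

Replacing the fixed decision threshold of 0 with a learned boundary is where an imbalance-aware variant would differ.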
Tracing Software Build Processes to Uncover License Compliance Inconsistencies - Shane McIntosh
Open Source Software (OSS) components form the basis for many software systems. While the use of OSS components accelerates development, client systems must comply with the license terms of the OSS components that they use. Failure to do so exposes client system distributors to possible litigation from copyright holders. Yet despite the importance of license compliance, tool support for license compliance assessment is lacking. In this paper, we propose an approach to extract and analyze the Concrete Build Dependency Graph (CBDG) of a software system by tracing system calls that occur at build-time. Through a case study of seven open source systems, we show that the extracted CBDGs: (1) accurately classify sources as included in or excluded from deliverables with 88%-100% precision and 98%-100% recall, and (2) can uncover license compliance inconsistencies in real software systems - two of which prompted code fixes in the CUPS and FFmpeg systems.
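The core extraction step can be illustrated with a toy trace parser: files written by a build process depend on the files that process read. This is a simplified sketch, not the paper's tooling; it assumes a made-up trace format of `<pid> <syscall>("<path>", ...)` lines, whereas the real approach traces actual build-time system calls (e.g., via strace):

```python
import re
from collections import defaultdict

# Assumed trace line format: '<pid> <syscall>("<path>", ...)'.
TRACE_LINE = re.compile(r'(\d+)\s+(read|write)\("([^"]+)"')

def build_cbdg(trace_lines):
    """Build a concrete build dependency graph: each written file
    depends on every file the same process read."""
    reads, writes = defaultdict(set), defaultdict(set)
    for line in trace_lines:
        m = TRACE_LINE.match(line)
        if not m:
            continue
        pid, call, path = m.groups()
        (reads if call == "read" else writes)[pid].add(path)
    graph = {}  # deliverable/object -> files it was derived from
    for pid, outs in writes.items():
        for out in outs:
            graph.setdefault(out, set()).update(reads[pid] - {out})
    return graph

def sources_of(graph, deliverable):
    """Transitively collect the sources a deliverable includes --
    the basis for the included/excluded classification."""
    seen, stack = set(), [deliverable]
    while stack:
        node = stack.pop()
        for src in graph.get(node, ()):
            if src not in seen:
                seen.add(src)
                stack.append(src)
    return seen
```

Annotating each node in `graph` with its license would then expose deliverables that transitively include incompatibly licensed sources.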
The document discusses build system maintenance. It notes that build code is complex and requires 12% of a developer's time on average. Build bugs can affect end users. Four dimensions of build maintenance are discussed: size, evolution, coupling, and people involvement. Studies found that build churn is greater than source code churn, and that build maintenance is often dispersed across many team members rather than concentrated in a small team. Tool support is needed to help with build maintenance tasks.
Mining Co-Change Information to Understand when Build Changes are Necessary - Shane McIntosh
As a software project ages, its source code is modified to add new features, restructure existing ones, and fix defects. These source code changes often induce changes in the build system, i.e., the system that specifies how source code is translated into deliverables. However, since developers are often not familiar with the complex and occasionally archaic technologies used to specify build systems, they may not be able to identify when their source code changes require accompanying build system changes. This can cause build breakages that slow development progress and impact other developers, testers, or even users. In this paper, we mine the source and test code changes that required accompanying build changes in order to better understand this co-change relationship. We build random forest classifiers using language-agnostic and language-specific code change characteristics to explain when code-accompanying build changes are necessary based on historical trends. Case studies of the Mozilla C++ system, the Lucene and Eclipse open source Java systems, and the IBM Jazz proprietary Java system indicate that our classifiers can accurately explain when build co-changes are necessary with an AUC of 0.60-0.88. Unsurprisingly, our highly accurate C++ classifiers (AUC of 0.88) derive much of their explanatory power from indicators of structural change (e.g., was a new source file added?). On the other hand, our Java classifiers are less accurate (AUC of 0.60-0.78) because roughly 75% of Java build co-changes do not coincide with changes to the structure of a system, but rather are instigated by concerns related to release engineering, quality assurance, and general build maintenance.
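The classifiers in this setting consume per-change feature vectors. A hedged sketch of the kind of language-agnostic and language-specific change characteristics involved (feature names here are illustrative, not the paper's; the resulting dicts could feed any off-the-shelf random forest implementation, such as scikit-learn's RandomForestClassifier):

```python
# Sketch of change features like those the classifiers use.
def change_features(changed_files):
    """changed_files: list of (path, status) pairs,
    with status in {"A": added, "M": modified, "D": deleted}."""
    added = [p for p, s in changed_files if s == "A"]
    return {
        # language-agnostic indicators
        "files_touched": len(changed_files),
        "files_added": len(added),
        "files_deleted": sum(1 for _, s in changed_files if s == "D"),
        # language-specific structural indicators
        # (e.g., "was a new source file added?")
        "new_cpp_source": any(p.endswith((".cpp", ".cc", ".h")) for p in added),
        "new_java_source": any(p.endswith(".java") for p in added),
    }
```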
Will My Patch Make It? And How Fast? - Yujuan Jiang (Ptidej Team)
The document discusses the Linux kernel patch submission and integration process. It begins by outlining three research questions: how many patches are merged, what type of patch is more likely to be merged, and what type of patch is accepted faster. It then describes the setup of a case study to analyze patch data, including information on patch emails, commits, and metrics. Charts are presented showing the percentage of accepted vs rejected patches each year and that reviewing time has sped up while integration time has slowed down. Decision trees are discussed for classifying patch outcomes.
The document discusses the evolution of ANT build systems over time. It examines two research questions: (1) Do source code and ANT build systems co-evolve? The analysis found that growth and reduction periods in build system size often coincide with those in the source code, suggesting they co-evolve. (2) Does perceived build complexity evolve? Build length and depth were found to increase over time, indicating growing complexity. Other findings include correlations between build size and complexity metrics. Threats to validity and conclusions are also presented.
This document summarizes the results of a large-scale empirical study on the relationship between build systems and build maintenance activity. The study analyzed over 800,000 open source projects to compare how different build technologies (e.g. Make, Autotools, Maven) affect build churn, source code coupling, and authorship over time. The key findings are that framework-based build systems tend to have higher build churn, tighter source code coupling, and decreasing authorship as projects migrate to more advanced technologies.
This document discusses Unicode and its importance for character sets like Arabic, Sinhala, Tamil, and Chinese. It covers Unicode encoding and font faces, explaining that Unicode fonts allow all languages to be displayed together while non-Unicode fonts only support one language. The document also notes some implications of using Unicode, such as improved internationalization and interoperability across systems.
SOCIAL METRICS INCLUDED IN PREDICTION MODELS ON SOFTWARE ENGINEERING: A MAPPING STUDY - Igor Wiese
This document discusses a mapping study on the use of social metrics in software engineering prediction models. It aims to identify which social metrics have been used and whether they had a positive effect. The study found that social metrics were often classified under other dimensions and that terminology was inconsistent. It identified papers reporting on various social metrics and grouped them into categories and sub-categories. The results showed that most papers reported a positive effect of social metrics in prediction models, while some reported negative or neutral effects. The conclusions note that more research is needed on social metrics in different contexts and with larger datasets.
Collecting and Leveraging a Benchmark of Build System Clones to Aid in Quality Assessments - Shane McIntosh
Build systems specify how sources are transformed into deliverables, and hence must be carefully maintained to ensure that deliverables are assembled correctly. Similar to source code, build systems tend to grow in complexity unless specifications are refactored. This paper describes how clone detection can aid in quality assessments that determine if and where build refactoring effort should be applied. We gauge cloning rates in build systems by collecting and analyzing a benchmark comprising 3,872 build systems. Analysis of the benchmark reveals that: (1) build systems tend to have higher cloning rates than other software artifacts, (2) recent build technologies tend to be more prone to cloning, especially of configuration details like API dependencies, than older technologies, and (3) build systems that have fewer clones achieve higher levels of reuse via mechanisms not offered by build technologies. Our findings aided in refactoring a large industrial build system containing 1.1 million lines.
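Clone detection over build specifications can be approximated with a sliding-window line index: identical windows of lines appearing in more than one build file are clone candidates. A crude textual sketch under assumed inputs (not the clone detection tooling used in the paper):

```python
from collections import defaultdict

def clone_groups(build_files, window=3):
    """build_files: dict mapping file name -> list of lines.
    Returns windows of `window` consecutive (whitespace-stripped)
    lines that occur in more than one build file."""
    index = defaultdict(set)
    for name, lines in build_files.items():
        stripped = [ln.strip() for ln in lines if ln.strip()]
        for k in range(len(stripped) - window + 1):
            index[tuple(stripped[k:k + window])].add(name)
    # keep only chunks shared across files, i.e., cross-file clones
    return {chunk: files for chunk, files in index.items() if len(files) > 1}
```

Duplicated configuration details, such as repeated dependency declarations, would surface as shared chunks across many files.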
Identifying Hotspots in Software Build Processes - Shane McIntosh
The document discusses identifying hotspots, or files that frequently change and are costly to rebuild, in software build processes. It presents an approach that constructs a build dependency graph from the build system, analyzes the graph to determine file change frequency and rebuild costs, and detects hotspots using a quadrant plot that highlights files that change often and have high rebuild costs. Case studies on two open source projects found a small number of hotspot files accounted for a majority of rebuild time. Focusing refactoring on hotspots could significantly improve build performance.
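The quadrant idea reduces to two medians: a file is a hotspot when it sits above the median on both change frequency and rebuild cost, i.e., in the upper-right quadrant of the plot. A small sketch, with the input format assumed:

```python
from statistics import median

def find_hotspots(files):
    """files: dict mapping path -> (change_count, rebuild_cost_seconds).
    Hotspots are files above the median on BOTH axes, i.e., the
    upper-right quadrant of the change-frequency/rebuild-cost plot."""
    freqs = [c for c, _ in files.values()]
    costs = [r for _, r in files.values()]
    f_med, c_med = median(freqs), median(costs)
    return sorted(p for p, (c, r) in files.items()
                  if c > f_med and r > c_med)
```

In practice the change counts come from version control history and the rebuild costs from timing a traversal of the build dependency graph.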
Orchestrating Change: An Artistic Representation of Software Evolution - Shane McIntosh
Several visualization tools have been proposed to highlight interesting software evolution phenomena. These tools help practitioners to navigate large and complex software systems, and also support researchers in studying software evolution. However, little work has explored the use of sound in the context of software evolution. In this paper, we propose the use of musical interpretation to support exploration of software evolution data. In order to generate music inspired by software evolution, we use parameter-based sonification, i.e., a mapping of dataset characteristics to sound. Our approach yields musical scores that can be played synthetically or by a symphony orchestra. In designing our approach, we address three challenges: (1) the generated music must be aesthetically pleasing, (2) the generated music must accurately reflect the changes that have occurred, and (3) a small group of musicians must be able to impersonate a large development team. We assess the feasibility of our approach using historical data from Eclipse, which yields promising results.
Tracing Software Build Processes to Uncover License Compliance Inconsistencies - Shane McIntosh
This document discusses using build systems to uncover license compliance inconsistencies. It describes how build systems can be traced to construct dependency graphs and annotate components with license information. An empirical study found that the approach accurately discovered inconsistencies with 88%-100% precision and 98%-100% recall. The technique prompted code changes in two systems within a few days to resolve the license issues it uncovered.
The document discusses the module system of Standard ML. It begins with an introduction to modularity and why it is important for writing large programs. It then provides an overview of Standard ML, including its history, features like higher-order functions and polymorphism. The document focuses on how Standard ML implements modularity using modules. It explains how modules allow hiding implementation details while still allowing controlled sharing between modules. It provides examples of defining modules for stacks and lists to demonstrate abstract types.
AN EMPIRICAL STUDY OF THE USE OF COMMUNICATION TO CHARACTERIZE THE OCCURRENCE OF CHANGE DEPENDENCIES - Igor Wiese
An empirical study analyzed communication among developers and code history metrics to characterize change dependencies in the Ruby on Rails project. The results showed that machine learning models can characterize strong and weak dependencies with high precision using these metrics. The most relevant metrics were comment density, developer centrality, and metrics that measure the structure of the communication network.
Identifying Hotspots in the PostgreSQL Build Process - Shane McIntosh
Software developers rely on a fast and correct build system to compile their source code changes and produce modified deliverables for testing and deployment. The scale and complexity of the PostgreSQL build process makes build performance an important topic to discuss and address.
In this talk, we will introduce a new build performance analysis technique that identifies "build hotspots", i.e., files that are slow to rebuild (by analyzing a build dependency graph), yet change often (by analyzing version control history). We will discuss the identified hotspots in the 9.2.4 release of PostgreSQL.
USING STRUCTURAL HOLES METRICS FROM COMMUNICATION NETWORKS TO PREDICT CHANGE DEPENDENCIES - Igor Wiese
This document examines using structural hole metrics (SHM) from communication networks to predict change dependencies between software artifacts. It finds that SHM can predict change dependencies with an area under the curve over 0.7. Constraint and hierarchy SHM were most important for one project, while commits and updates were most important when including process metrics. The study provides initial evidence that SHM obtained from communication networks can predict change dependencies as suggested by Conway's Law. Future work could explore additional projects, metrics, and comparisons to other software aspects.
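Burt's network constraint, one of the structural hole metrics mentioned, can be computed directly from an adjacency structure: a developer whose contacts are all connected to each other (few structural holes) has high constraint, while a broker bridging otherwise disconnected contacts has low constraint. A self-contained sketch for unweighted, undirected communication networks (a simplification of the general weighted formula):

```python
def proportions(adj, i):
    """p_ij: share of i's ties invested in neighbour j
    (uniform for an unweighted, undirected network)."""
    nbrs = adj[i]
    return {j: 1.0 / len(nbrs) for j in nbrs}

def constraint(adj, i):
    """Burt's network constraint for node i:
    sum over neighbours j of (p_ij + sum_q p_iq * p_qj)^2.
    High values mean i's contacts are redundant (few structural holes)."""
    p_i = proportions(adj, i)
    total = 0.0
    for j in adj[i]:
        indirect = sum(p_i[q] * proportions(adj, q).get(j, 0.0)
                       for q in adj[i] if q != j)
        total += (p_i[j] + indirect) ** 2
    return total
```

Computed over a network of who discusses changes with whom, such scores become per-artifact or per-developer features for the dependency prediction models.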
The Impact of Code Review Coverage and Participation on Software Quality - Shane McIntosh
Software code review, i.e., the practice of having third-party team members critique changes to a software system, is a well-established best practice in both open source and proprietary software domains. Prior work has shown that the formal code inspections of the past tend to improve the quality of software delivered by students and small teams. However, the formal code inspection process mandates strict review criteria (e.g., in-person meetings and reviewer checklists) to ensure a base level of review quality, while the modern, lightweight code reviewing process does not. Although recent work explores the modern code review process qualitatively, little research quantitatively explores the relationship between properties of the modern code review process and software quality. Hence, in this paper, we study the relationship between software quality and: (1) code review coverage, i.e., the proportion of changes that have been code reviewed, and (2) code review participation, i.e., the degree of reviewer involvement in the code review process. Through a case study of the Qt, VTK, and ITK projects, we find that both code review coverage and participation share a significant link with software quality. Low code review coverage and participation are estimated to produce components with up to two and five additional post-release defects respectively. Our results empirically confirm the intuition that poorly reviewed code has a negative impact on software quality in large systems using modern reviewing tools.
SOCIAL METRICS INCLUDED IN PREDICTION MODELS ON SOFTWARE ENGINEERING: A MAPPI...Igor Wiese
This document discusses a mapping study on the use of social metrics in software engineering prediction models. It aims to identify which social metrics have been used and whether they had a positive effect. The study found that social metrics were often classified under other dimensions and there was inconsistent terminology. It identified papers reporting on various social metrics and grouped them into categories and sub-categories. The results showed that most papers reported social metrics had a positive effect in prediction models, while some reported negative or neutral effects. The conclusions note more research is needed on social metrics in different contexts and using larger datasets.
Collecting and Leveraging a Benchmark of Build System Clones to Aid in Qualit...Shane McIntosh
Build systems specify how sources are transformed into deliverables, and hence must be carefully maintained to ensure that deliverables are assembled correctly. Similar to source code, build systems tend to grow in complexity unless specifications are refactored. This paper describes how clone detection can aid in quality assessments that determine if and where build refactoring effort should be applied. We gauge cloning rates in build systems by collecting and analyzing a benchmark comprising 3,872 build systems. Analysis of the benchmark reveals that: (1) build systems tend to have higher cloning rates than other software artifacts, (2) recent build technologies tend to be more prone to cloning, especially of configuration details like API dependencies, than older technologies, and (3) build systems that have fewer clones achieve higher levels of reuse via mechanisms not offered by build technologies. Our findings aided in refactoring a large industrial build system containing 1.1 million lines.
Identifying Hotspots in Software Build ProcessesShane McIntosh
The document discusses identifying hotspots, or files that frequently change and are costly to rebuild, in software build processes. It presents an approach that constructs a build dependency graph from the build system, analyzes the graph to determine file change frequency and rebuild costs, and detects hotspots using a quadrant plot that highlights files that change often and have high rebuild costs. Case studies on two open source projects found a small number of hotspot files accounted for a majority of rebuild time. Focusing refactoring on hotspots could significantly improve build performance.
Orchestrating Change: An Artistic Representation of Software EvolutionShane McIntosh
Several visualization tools have been proposed to highlight interesting software evolution phenomena. These tools help practitioners to navigate large and complex software systems, and also support researchers in studying software evolution. However, little work has explored the use of sound in the context of software evolution. In this paper, we propose the use of musical interpretation to support exploration of software evolution data. In order to generate music inspired by software evolution, we use parameter-based sonification, i.e., a mapping of dataset characteristics to sound. Our approach yields musical scores that can be played synthetically or by a symphony orchestra. In designing our approach, we address three challenges: (1) the generated music must be aesthetically pleasing, (2) the generated music must accurately reflect the changes that have occurred, and (3) a small group of musicians must be able to impersonate a large development team. We assess the feasibility of our approach using historical data from Eclipse, which yields promising results.
Tracing Software Build Processes to Uncover License Compliance Inconsistencie...Shane McIntosh
This document discusses using build systems to uncover license compliance inconsistencies. It describes how build systems can be traced to construct dependency graphs and annotate components with license information. An empirical study found the approach accurately discovered inconsistencies with 88-100% precision and 98-100% recall. The technique prompted code changes in two systems within a few days to resolve license issues uncovered.
The document discusses the module system of Standard ML. It begins with an introduction to modularity and why it is important for writing large programs. It then provides an overview of Standard ML, including its history, features like higher-order functions and polymorphism. The document focuses on how Standard ML implements modularity using modules. It explains how modules allow hiding implementation details while still allowing controlled sharing between modules. It provides examples of defining modules for stacks and lists to demonstrate abstract types.
UM ESTUDO EMPÍRICO DO USO DA COMUNICAÇÃO PARA CARACTERIZAR A OCORRÊNCIA DE DE...Igor Wiese
[1] Um estudo empírico analisou a comunicação entre desenvolvedores e métricas de histórico de código para caracterizar dependências de mudança no projeto Ruby on Rails. [2] Os resultados mostraram que modelos de aprendizado de máquina podem caracterizar dependências fortes e fracas com alta precisão usando essas métricas. [3] As métricas mais relevantes foram densidade de comentários, centralidade dos desenvolvedores e métricas que medem a estrutura da rede de comunicação.
Identifying Hotspots in the PostgreSQL Build Process (Shane McIntosh)
Software developers rely on a fast and correct build system to compile their source code changes and produce modified deliverables for testing and deployment. The scale and complexity of the PostgreSQL build process make build performance an important topic to discuss and address.
In this talk, we will introduce a new build performance analysis technique that identifies "build hotspots", i.e., files that are slow to rebuild (by analyzing a build dependency graph), yet change often (by analyzing version control history). We will discuss the identified hotspots in the 9.2.4 release of PostgreSQL.
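The hotspot idea described above can be sketched as a simple filter over per-file data; the thresholds, field layout, and file names below are illustrative assumptions:

```python
# Rough sketch of the hotspot criterion: a file is a hotspot if it is both
# costly to rebuild (from the build dependency graph) and frequently
# changed (from version control history).

def find_hotspots(files, cost_threshold, change_threshold):
    """files: mapping name -> (rebuild_cost_seconds, change_count).
    Returns names whose rebuild cost AND change frequency both exceed
    their thresholds."""
    return sorted(
        name for name, (cost, changes) in files.items()
        if cost >= cost_threshold and changes >= change_threshold
    )

hotspots = find_hotspots(
    {"parser.c": (120, 40),   # slow to rebuild AND changes often
     "util.c": (2, 55),       # changes often but rebuilds fast
     "core.h": (300, 1)},     # slow but stable
    cost_threshold=60, change_threshold=10,
)
```

Only files that fail on both dimensions at once are worth refactoring for build speed, which is why neither metric alone suffices.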
USING STRUCTURAL HOLES METRICS FROM COMMUNICATION NETWORKS TO PREDICT CHANGE DEPENDENCIES (Igor Wiese)
This document examines using structural hole metrics (SHM) from communication networks to predict change dependencies between software artifacts. It finds that SHM can predict change dependencies with an area under the curve over 0.7. Constraint and hierarchy SHM were most important for one project, while commits and updates were most important when including process metrics. The study provides initial evidence that SHM obtained from communication networks can predict change dependencies as suggested by Conway's Law. Future work could explore additional projects, metrics, and comparisons to other software aspects.
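As a toy illustration of one SHM, a minimal computation of Burt's constraint over a small communication network might look like this. The four-person network and names are invented, and real studies would typically use a library such as networkx:

```python
# Burt's constraint for an unweighted graph: a node that bridges
# otherwise-disconnected contacts (a "structural hole") is less
# constrained than a node whose contacts all depend on each other.

def constraint(adj, i):
    """adj maps each node to the set of its neighbours."""
    nbrs = adj[i]
    p = 1 / len(nbrs)  # proportion of i's ties invested in each contact
    total = 0.0
    for j in nbrs:
        # indirect investment in j via shared contacts q
        indirect = sum(p * (1 / len(adj[q])) for q in nbrs
                       if q != j and j in adj[q])
        total += (p + indirect) ** 2
    return total

# alice talks to everyone; dave talks only to alice
adj = {
    "alice": {"bob", "carol", "dave"},
    "bob": {"alice", "carol"},
    "carol": {"alice", "bob"},
    "dave": {"alice"},
}
```

Here alice spans a structural hole (she is dave's only bridge), so her constraint is lower than dave's, whose single tie makes him maximally constrained.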
The Impact of Code Review Coverage and Participation on Software Quality (Shane McIntosh)
Software code review, i.e., the practice of having third-party team members critique changes to a software system, is a well-established best practice in both open source and proprietary software domains. Prior work has shown that the formal code inspections of the past tend to improve the quality of software delivered by students and small teams. However, the formal code inspection process mandates strict review criteria (e.g., in-person meetings and reviewer checklists) to ensure a base level of review quality, while the modern, lightweight code reviewing process does not. Although recent work explores the modern code review process qualitatively, little research quantitatively explores the relationship between properties of the modern code review process and software quality. Hence, in this paper, we study the relationship between software quality and: (1) code review coverage, i.e., the proportion of changes that have been code reviewed, and (2) code review participation, i.e., the degree of reviewer involvement in the code review process. Through a case study of the Qt, VTK, and ITK projects, we find that both code review coverage and participation share a significant link with software quality. Low code review coverage and participation are estimated to produce components with up to two and five additional post-release defects respectively. Our results empirically confirm the intuition that poorly reviewed code has a negative impact on software quality in large systems using modern reviewing tools.
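The two review metrics can be sketched directly from their definitions. Treating participation as the mean number of reviewers per reviewed change is a simplifying assumption of this sketch; the paper measures reviewer involvement in richer ways:

```python
# Illustrative computation of code review coverage and participation.
# The change records below are invented examples.

def review_coverage(changes):
    """Proportion of changes that received at least one code review."""
    reviewed = sum(1 for c in changes if c["reviewers"])
    return reviewed / len(changes)

def review_participation(changes):
    """Mean number of reviewers per reviewed change (simplified proxy
    for the degree of reviewer involvement)."""
    reviewed = [c for c in changes if c["reviewers"]]
    return sum(len(c["reviewers"]) for c in reviewed) / len(reviewed)

changes = [
    {"id": 1, "reviewers": ["ana", "bo"]},
    {"id": 2, "reviewers": []},        # integrated without review
    {"id": 3, "reviewers": ["ana"]},
]
```

Components whose changes score low on either metric are the ones the study links to additional post-release defects.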
Editor's Notes
Cars have features like AC and power windows... Software has features... Linux -> CPU freq scaling, pluggable module support. You need tools to build a car, and you also need tools... compilers and linkers...
Devs constantly have to rebuild artifacts to test changes... the build system incrementally updates builds.
Firefox 3.0 was built and delivered incorrectly: for users in a networked environment, the address/search bar was broken due to an incorrect version of the SQLite library being linked during the build process. The fix was delivered 4 months late in a service pack (3.0.1).
Based on survey results...
Studied 10 projects of different:
- programming language (C, C++, Java)
- domain (web communications, compilers, RDBMS, IDE...)
- development methodology (open, proprietary)
We find that the build system accounts for 9% of the files in a project (median).
- Small!
1) Churn of the build system is close to that of the source code (normalized).
2) Prior research => high churn may produce bugs!
- measures strength of implication...
Counterintuitive and does not match prior survey results (<12%).
- Mozilla is scary high
- Eclipse-core and Jazz are considerably lower
- Use of PDE build
- devs may not commit all changes in one revision
- Jazz source devs are often responsible for build dev
- Git and Linux are less so
- Jazz distributes build work...
Build maintenance is a nuisance for developers; they need tools to help them cope. E.g., Jazz has tools and lowered build coupling to 4%.
Build maintenance is a nuisance for developers! They need tools to help them cope!