This document summarizes a technique for automatically segmenting source code identifiers into meaningful words. It presents a search-based approach inspired by how developers compose identifiers from dictionary terms and word transformations. The approach uses a dictionary of terms, measures the distance between an identifier and dictionary words using Dynamic Time Warping, and applies word transformation rules. An evaluation on two systems found that it outperformed a simple CamelCase splitter, correctly splitting over 90% of identifiers. Future work will expand the evaluation and enhance the heuristics for term selection and transformations.
This document describes genomics and proteomics. Genomics uses hybridization to identify protein-coding genes, genes that interact with one another, and similarities between genomes. Proteomics studies the full set of proteins in a cell. Both sciences use microarrays to study gene expression under different conditions, such as in tumor cells.
Webinar - Microlearning: Getting Started - Raptivity
Raptivity conducted a webinar on microlearning on 20 January. The webinar offered a holistic view of the microlearning landscape: typical characteristics, deployment, and popular delivery methods. Raptivity thought leader Todd Kasenberg (Principal, Guiding Star Communications and Consulting) shared objective and practical insights on microlearning development.
This document covers blogs, SlideShare, and Delicious. It explains that a blog is a website where one or more authors publish content periodically in chronological order. It then mentions common uses of blogs, such as personal diaries, teaching, and marketing. It also describes SlideShare as a site for sharing presentations and documents, and Delicious as a service for socially organizing and sharing bookmarks via tags.
Here's the PowerPoint from my presentation at the Cornell Small Farms "Telling Better Stories" workshop. For more information, please visit http://FarmMarketingSolutions.com
The document discusses the benefits of exercise for mental health. Regular physical activity can help reduce anxiety and depression and improve mood and cognitive functioning. Exercise boosts blood flow and levels of neurotransmitters and endorphins which elevate and stabilize mood.
See what TESCO manufactures for Electric Utilities. Our products have been known for reliability and ruggedness for over 100 years. This reputation continues today, making TESCO the preferred supplier for utilities across North America.
This document discusses an approach to automatically segment source code identifiers into words by:
1) Modeling how developers compose identifiers using terms from a dictionary and transformation rules;
2) Aligning unknown strings to dictionary words using a modified Dynamic Time Warping technique;
3) Applying a hill climbing algorithm using random word transformations to split identifiers.
The approach was evaluated on two systems and achieved a 95% correct splitting rate, outperforming a basic camel case splitter. Future work aims to improve heuristics and contextualize the search within code structures.
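A much-simplified sketch of this pipeline in Python: plain camel-case tokenization plus approximate dictionary matching stand in for the paper's Dynamic Time Warping and hill-climbing search, and the dictionary and similarity threshold are invented for illustration.

```python
import re
from difflib import SequenceMatcher

# Hypothetical mini-dictionary; the real approach uses large term dictionaries.
DICTIONARY = {"pointer", "counter", "user", "name", "get", "set", "id"}

def camel_case_split(identifier):
    """Baseline splitter: break on separators, then on case changes."""
    words = []
    for part in re.split(r"[_\W]+", identifier):
        words.extend(
            re.findall(r"[A-Z]+(?=[A-Z][a-z])|[A-Z]?[a-z]+|[A-Z]+|\d+", part))
    return [w.lower() for w in words if w]

def nearest_term(word, dictionary):
    """Map a (possibly abbreviated) word to the closest dictionary term."""
    best = max(dictionary, key=lambda t: SequenceMatcher(None, word, t).ratio())
    return best if SequenceMatcher(None, word, best).ratio() >= 0.5 else word

def split_identifier(identifier):
    return [nearest_term(w, DICTIONARY) for w in camel_case_split(identifier)]

print(split_identifier("usrCntr"))  # abbreviations map back to full terms
```

Unlike a pure camel-case splitter, the dictionary-matching step recovers full terms from abbreviations such as "usr" and "cntr", which is the gap the search-based approach targets.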
This paper proposes a heuristic-based approach to identify concepts in execution traces of software systems. The approach analyzes traces generated through dynamic analysis and groups method calls that are sequentially invoked and conceptually cohesive. It applies several steps: system instrumentation to collect traces, trace pruning and compression, analyzing method source code for cohesion/coupling, and using a genetic algorithm to identify concepts. An empirical study on two systems found the algorithm converges with 72-84% overlap across runs. Identified concepts matched the manual oracle with high precision, though some features were more difficult to distinguish. Inspection of segments revealed they sometimes only captured parts of features due to complexity.
This document presents a heuristic-based approach to identify concepts in execution traces. The approach analyzes execution traces and groups method calls that are sequentially invoked together and conceptually cohesive. It proceeds in five steps: system instrumentation to collect traces, trace pruning, trace compression, textual analysis of method code, and search-based concept identification using genetic algorithms. An empirical study on two systems found the approach produces stable results across runs and identifies concepts with high precision, though it has limitations for features affected by multi-threading or GUI events.
TRIS is a fast and accurate approach for splitting and expanding source code identifiers. It treats identifier splitting as an optimization problem and uses a two-phase strategy: 1) building a dictionary of terms and transformation rules, and 2) applying these to split identifiers. A case study found TRIS outperformed other approaches like Camel Case splitting, Samurai, and TIDIER in accuracy on two software projects. TRIS also had higher accuracy than Samurai and GenTest on a large dataset of identifier split correctness.
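The two-phase idea can be illustrated with a toy version: a couple of hand-picked transformation rules (vowel deletion and prefix truncation) generate abbreviated forms of dictionary terms, and the cheapest matching expansion wins. The rules, costs, and dictionary here are illustrative, not TRIS's actual ones.

```python
# Two illustrative transformation rules; each abbreviated form carries the
# cost of producing it, and expansion minimizes total cost.
def variants(term):
    """Yield (abbreviation, cost) pairs derivable from one dictionary term."""
    yield term, 0                                    # exact match is free
    no_vowels = term[0] + "".join(c for c in term[1:] if c not in "aeiou")
    yield no_vowels, 1                               # vowel-deletion rule
    for k in range(2, len(term)):
        yield term[:k], 2                            # prefix-truncation rule

def expand(token, dictionary):
    """Return the cheapest dictionary expansion of an abbreviated token."""
    best, best_cost = token, float("inf")
    for term in dictionary:
        for abbr, cost in variants(term):
            if abbr == token and cost < best_cost:
                best, best_cost = term, cost
    return best

print(expand("pntr", {"pointer", "printer", "point"}))  # -> pointer
```

Framing expansion as cost minimization is what lets the approach rank competing candidate splits instead of committing to the first match.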
20 issues of porting C++ code on the 64-bit platform - Andrey Karpov
This document surveys program errors that occur when porting C++ code from 32-bit to 64-bit platforms. It gives examples of incorrect code and ways to correct it, and lists code analysis methods and tools that can diagnose the errors discussed.
This article contains various examples of 64-bit errors. However, we have learned of many more examples and error types since we started writing it, and they could not all be included. Please see the article "A Collection of Examples of 64-bit Errors in Real Programs", which covers the defects in 64-bit programs we know of most thoroughly. We also recommend the course "Lessons on development of 64-bit C/C++ applications", which describes the methodology of creating correct 64-bit code and searching for all types of defects using the Viva64 code analyzer.
This document presents a domain analysis method called DECOR to specify design defects and generate detection algorithms. It involves performing a domain analysis to develop a vocabulary and taxonomy of design defects. Design defects are then specified using a domain-specific language called SADSL. These specifications are used to generate detection algorithms, which are then validated by applying them to source code to detect classes with design defects. The goal is to explicitly specify design defects to improve detection over existing tools.
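The generate-detection-from-specification idea can be sketched as a declarative rule evaluated by a generic engine. The metric names, thresholds, and classes below are hypothetical, not DECOR's actual specifications.

```python
# A defect is specified declaratively as metric thresholds; a generic engine
# flags classes matching every threshold. Metric names/values are invented.
RULES = {
    "GodClass": {"methods": (">=", 20), "cohesion": ("<", 0.2)},
}

OPS = {">=": lambda a, b: a >= b, "<": lambda a, b: a < b}

def detect(classes, rules):
    """Return {defect name: [classes whose metrics satisfy the rule]}."""
    return {
        defect: [
            name for name, metrics in classes.items()
            if all(OPS[op](metrics[m], v) for m, (op, v) in spec.items())
        ]
        for defect, spec in rules.items()
    }

classes = {
    "Facade": {"methods": 45, "cohesion": 0.1},
    "Point":  {"methods": 4,  "cohesion": 0.9},
}
print(detect(classes, RULES))  # flags only "Facade"
```

Keeping the defect definition as data rather than code is what makes the specifications explicit and the detection algorithms generable, which is the core of the approach.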
130817 latifa guerrouj - context-aware source code vocabulary normalization... - Ptidej Team
This document summarizes the contributions of Latifa Guerrouj's PhD thesis on context-aware source code vocabulary normalization. The thesis introduced two context-aware approaches for vocabulary normalization: TIDIER and TRIS. TIDIER is inspired by speech recognition and uses context-aware dictionaries and hill climbing to normalize identifiers. Experiments showed TIDIER outperformed previous approaches and correctly mapped 48% of abbreviations. TRIS treats normalization as an optimization problem to minimize a cost function. Experiments found TRIS had higher accuracy than state-of-the-art approaches like CamelCase and Samurai, with a medium to large effect size on C code.
The document discusses the history and current state of software engineering and its application to IoT systems. It notes that 50 years after the earliest software projects, issues still include cost overruns, property damage, risks to life and death, and challenges ensuring quality. For IoT, fragmentation across hardware, software, APIs and standards poses significant problems. The document proposes that research into IoT software engineering could help address these issues through approaches like developing software to run across diverse IoT platforms, and automatically miniaturizing software through techniques like multi-objective optimization to suit different IoT device capabilities.
1) Issue trackers are often used to track more than just bugs, including features, enhancements, and refactoring work.
2) A manual analysis found that nearly half of issues labeled as "bugs" in issue trackers were actually not bugs.
3) Relying on issue tracker labels alone can introduce significant errors into datasets used for tasks like bug prediction and severity estimation. More work is needed to clean noisy and unreliable data.
The document discusses how to derive dependency structures for legacy J2EE applications. It proposes analyzing all application tiers together using a language-independent model and parsing various artifacts. Configuration files and limited data flow analysis are used to understand dependencies. Container dependencies are handled by studying technology specifications and codifying dependency rules that are applied when certain code patterns are detected in applications. This allows completing an application's dependency graph.
The document discusses the state of practices of service identification in the industry for migrating legacy systems to service-oriented architectures (SOA). It finds that while service identification is seen as important, it remains primarily a manual process focused on identifying coarse-grained business services from source code and business processes. Wrapping and clustering functionalities are common techniques. Fully automating service identification is still challenging due to the need to understand complex legacy system dependencies. The document recommends service identification be business-driven and follow proven methodologies.
This document discusses techniques for testing advanced driver assistance systems (ADAS) through physics-based simulation. It faces challenges due to the large, complex, and multidimensional test input space as well as the computational expense of simulation. The document proposes using a genetic algorithm guided by decision trees to more efficiently search for critical test cases. Classification trees are built to partition the input space into homogeneous regions in order to better guide the selection and generation of test inputs toward more critical areas.
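A minimal genetic-algorithm sketch of this kind of search-based test generation, with a toy fitness function standing in for the expensive physics-based simulation (the decision-tree guidance is omitted, and all parameters are illustrative):

```python
import random

random.seed(0)  # reproducible run

def criticality(x):
    """Toy surrogate fitness: scenarios near (5, 5) are 'critical'.
    A real setup would run the simulator here instead."""
    return -((x[0] - 5) ** 2 + (x[1] - 5) ** 2)

def evolve(pop_size=20, generations=40, lo=0.0, hi=10.0):
    pop = [[random.uniform(lo, hi), random.uniform(lo, hi)]
           for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=criticality, reverse=True)
        parents = pop[: pop_size // 2]            # elitist truncation selection
        children = []
        for _ in range(pop_size - len(parents)):
            a, b = random.sample(parents, 2)      # crossover: average parents
            children.append([(a[i] + b[i]) / 2 + random.gauss(0, 0.3)
                             for i in range(2)])  # gaussian mutation
        pop = parents + children
    return max(pop, key=criticality)

best = evolve()
print(best)  # drifts toward the critical region around (5, 5)
```

The decision trees described in the document would replace the uniform initial sampling and mutation with sampling biased toward input-space regions the trees classify as critical, cutting down expensive simulator calls.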
The document reports on the findings of a survey of 45 industrial practitioners on their experiences with legacy-to-SOA migrations. The key findings include: 1) Practitioners migrate legacy systems implemented in Cobol and Java to reduce maintenance costs and improve flexibility/interoperability; 2) Identifying services is an important step but is mostly manual and business-driven; 3) The most used techniques are functionality clustering and wrapping; 4) Desired service qualities are reusability, granularity, and loose coupling; 5) Identified services prioritize domain-specific over technical services; 6) RESTful services are the most targeted technology.
The document investigates the impact of linguistic anti-patterns (LAs) on program comprehension. It defines LAs as bad naming, documentation, and implementation practices. A study was conducted involving 92 students assessing programs with and without LAs. The study found that LAs negatively impact understandability by increasing time and reducing correctness. Certain LAs like A2, B4, and D1 had a stronger negative effect than others like E1. The study also found that providing knowledge about LAs can help mitigate their impact by making programs easier and faster to comprehend.
The document discusses research on identifying and analyzing the impact of patterns on the quality of multi-language systems. The objectives are to collect and categorize sets of programming languages used together, detect patterns in multi-language programs to track bugs and provide best practices, and study how patterns impact quality. The contributions will be a catalog of multi-language patterns and defects, a detection tool, and analysis of patterns' effects on quality attributes. Current work includes reviewing literature on language combinations and patterns to provide recommendations for high-quality multi-language development.
This document discusses research on change impact analysis in multi-language systems. It begins by outlining recommendations for best practices when using JNI, such as passing primitive types, minimizing calls between native and Java code, and properly handling strings. It then describes a qualitative analysis of JNI usage that identified common practices and issues. Finally, it proposes future work to survey developers on applying recommendations to facilitate change impact analysis in multi-language systems.
The document summarizes a recommendation system that suggests software processes for video game projects based on similarities to past projects. The system analyzes over 100 postmortems from previous games to build a database of development processes and project contexts. It uses principal component analysis to identify similar past projects and recommends a process by combining elements from similar projects' processes. The system was evaluated both quantitatively based on correctness and coverage metrics and qualitatively through surveys and a case study with a developer team.
Will IoT trigger the next software crisis? - Ptidej Team
This document discusses how the rise of the Internet of Things (IoT) could trigger a new software crisis due to issues like fragmentation, complexity, and lack of standards. It provides a brief history of software engineering challenges over the past 50 years such as cost overruns, safety issues, and prioritizing productivity over quality. The document then examines how these same problems are emerging in the IoT context today. It argues that IoT software engineering practices need to address issues like device software, cloud/app development, and privacy in order to avoid a major crisis.
This document discusses theories related to software design patterns. It notes that while design patterns are commonly used, there is a need for more research on how they impact software quality. The document proposes several areas for developing theories, including systematically categorizing existing patterns based on underlying principles, combining principles to identify new patterns, and developing theories of patterns from developer behavior and for building software systems. Formalizing patterns and identifying their relationships could help teaching and understanding of patterns.
Laleh M. Eshkevari defended her Ph.D dissertation on developing techniques for the automatic detection and classification of identifier renamings in software projects. Her dissertation outlined a taxonomy of renamings, described approaches for renaming detection based on line mapping, entity mapping and data flow analysis, and discussed methods for classifying renamings based on their form and semantic changes. Evaluation of the approaches on several open source projects showed high precision and recall for renaming detection and identified trends in how renamings are used in practice.
1) The document analyzes the co-occurrence of code smells like anti-patterns and clones in software systems and their impact on fault-proneness.
2) It finds that over 50% of classes with anti-patterns also have clones, and 59-78% of classes with clones also participate in anti-patterns.
3) Classes with both anti-patterns and clones are significantly more fault-prone than other classes, with the risk of faults being at least 7 times higher in one system studied.
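The reported risk comparison boils down to a ratio of fault rates between the two groups of classes; a back-of-the-envelope sketch with made-up counts (not the study's data):

```python
# Fault rate per group and the relative risk between groups.
def fault_rate(faulty, total):
    return faulty / total

def risk_ratio(group_a, group_b):
    """(faulty_a / total_a) / (faulty_b / total_b)."""
    return fault_rate(*group_a) / fault_rate(*group_b)

smelly_and_cloned = (35, 50)   # (faulty, total): classes with smells + clones
others = (40, 400)             # (faulty, total): remaining classes
print(risk_ratio(smelly_and_cloned, others))  # roughly 7x higher fault rate
```

With these illustrative counts the first group's fault rate is 0.70 against 0.10 for the rest, a roughly seven-fold risk of the kind the study reports.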
Trustrace is an approach that uses software repository links like SVN commits to improve the trust in automatically recovered traceability links between requirements and code. It calculates an initial trust value for links based on IR techniques like VSM, and then reweights the links based on additional information from the software repository. An evaluation on two case studies found Trustrace improved precision over VSM alone and showed no significant difference in recall, supporting the hypothesis that Trustrace can improve link recovery accuracy over IR-only approaches.
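The IR backbone plus trust-based reweighting can be sketched as follows. The TF-IDF weighting, the trust factor, and the tiny corpus are illustrative simplifications of Trustrace's actual model, not its real formulas.

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """TF-IDF vector per document: {name: {term: weight}}."""
    df = Counter(t for d in docs.values() for t in set(d.split()))
    n = len(docs)
    return {name: {t: c * math.log(n / df[t])
                   for t, c in Counter(d.split()).items()}
            for name, d in docs.items()}

def cosine(u, v):
    dot = sum(w * v[t] for t, w in u.items() if t in v)
    nu = math.sqrt(sum(w * w for w in u.values()))
    nv = math.sqrt(sum(w * w for w in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

def trustrace_scores(req_text, sources, commit_mentions, alpha=0.5):
    """VSM similarity boosted by repository evidence (commit mention counts)."""
    vecs = tfidf_vectors({"req": req_text, **sources})
    return {name: cosine(vecs["req"], vecs[name])
                  * (1 + alpha * commit_mentions.get(name, 0))
            for name in sources}

scores = trustrace_scores(
    "user login password",
    {"Auth.java": "login password check", "Draw.java": "render shape pixel"},
    {"Auth.java": 1})  # Auth.java changed in a commit citing the requirement
print(scores)
```

The key move is the second factor: a candidate link supported by repository evidence gains trust, while purely lexical matches keep only their IR score.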
The document presents a taxonomy called ProMeTA for classifying program metamodels used in program reverse engineering. ProMeTA defines characteristics such as target language, abstraction level, meta-language, and more to classify popular metamodels like AST, KDM, FAMIX. The taxonomy aims to provide a comprehensive guide for researchers and practitioners to select, design, and communicate metamodels. The paper also analyzes existing metamodels according to the ProMeTA taxonomy and identifies gaps to guide future metamodel development.
This document describes a controlled, multiple case study of software evolution and defects from industrial projects. It details the data sources used, including source code repositories, issue tracking databases, and interviews. Metrics such as code smells, size, effort, and defects were collected. Programming skills of developers were also measured. Code smell detection tools and custom scripts to analyze code changes were used to extract metrics on a variety of code issues and evolution over time. The data is available online for further analysis.
The document describes a study on detecting linguistic (anti)patterns in RESTful APIs. It presents an approach called DOLAR (Detection Of Linguistic Antipatterns in REST) that analyzes REST API URIs and detects antipatterns using heuristics-based algorithms. Experiments were conducted on 309 methods from 15 public REST APIs to test DOLAR's accuracy, the extensibility of the underlying SOFA framework, and the performance of detection algorithms. The results showed that 42% of methods exhibited contextualized resource names (a pattern) while 14% had contextless resource names (an antipattern), with detection taking under a second on average.
1. Recognizing Words from Source Code
Identifiers using Speech Recognition
Techniques
CSMR 2010, Madrid
Nioosha Madani, Latifa Guerrouj, Massimiliano Di Penta,
Yann-Gaël Guéhéneuc, and Giuliano Antoniol
2. Content
Problem Statement
Aligning Strings and Words
Meta-heuristic Inspired Approach
Technologies
Case Study – Research Questions
Case Study – Results
Conclusion and Future Work
3. Problem Statement
The Challenge
A few years after deployment, documentation may no longer exist.
If it does exist, it is almost surely outdated.
Customers want to change the system, add new functionality, or fix a defect.
The only available source of information is the code:
Identifiers;
Comments.
4. Problem Statement
Identifier Semantics
Researchers agree that identifier semantics are important:
They help program comprehension;
They suggest clues.
Composed identifiers:
Camel Case: MyLocalAccount, User_Address
Contraction based: pntrctr, usrAdrss, imagEdge
Good and possibly known to the developers: hmmm, ixoth, pqrstuvwxyz
5. Problem Statement
Words, Terms, Soft and Hard Words
Term: any substring in a compound identifier.
Word: an entry in a dictionary (e.g., the English dictionary).
Hard words: terms composing an identifier that reflect domain concepts and are clearly demarked: baseAddress, user_file
Soft words: terms different from dictionary words and not clearly demarked (e.g., abbreviations, contractions): userarea, ptrcntr, userGid
6. Problem Statement
Current Practices
Camel Case-based approaches plus greedy algorithms, e.g., Lawrie et al. 2006, 2007.
Samurai by Enslen et al., 2009:
Lexicon plus a greedy algorithm;
If a contraction is used somewhere in the code, it is likely used in the same context as the original term;
Frequency tables of contractions and terms to split composed identifiers.
Limitations: abbreviations are not treated, and there is no quantification of how close the match is to the unknown string.
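For reference, the Camel Case baseline discussed above can be sketched in a few lines. This is a minimal illustration, not the exact splitter used in the study; it splits on underscores and case transitions, and, as noted, leaves contractions untouched:

```python
import re

def camel_case_split(identifier: str) -> list[str]:
    """Split an identifier on underscores and lowercase-to-uppercase
    transitions, as a plain Camel Case splitter would.

    MyLocalAccount -> ['My', 'Local', 'Account']
    User_Address   -> ['User', 'Address']
    Contractions such as 'pntrctr' come back as a single unsplit term,
    which is exactly the limitation discussed above.
    """
    terms = []
    for hard_word in identifier.split('_'):
        # Uppercase runs, capitalized words, lowercase runs, and digits
        # each become a separate term (parseXMLFile -> parse, XML, File).
        terms.extend(re.findall(r'[A-Z]+(?![a-z])|[A-Z][a-z]*|[a-z]+|\d+',
                                hard_word))
    return terms
```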
7. Problem Statement
Our Approach in Essence
Developers compose identifiers:
Using terms and words reflecting domain concepts and the developer's experience and knowledge.
Developers generate contractions via a finite set of transformation rules:
Drop all vowels, drop prefix, drop suffix, etc.
We mimic the developers' identifier generation process with:
Dictionaries capturing terms and words;
A search-based technique to split exactly any unknown string;
A distance computed using Dynamic Time Warping (DTW), as in continuous speech recognition [H. Ney, 1984].
8. Aligning Strings and Words
Modified H. Ney DTW
[Figure: DTW cost matrix aligning the characters of the identifier to split, "pntrctrusr" (columns p n t r c t r u s r), against a dictionary of 3 words; the minimal cumulative-cost path through the matrix yields the segmentation of the identifier.]
Identifier to split: pntrctrusr
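The cumulative costs in such a matrix can be reproduced with a small dynamic program. The sketch below is a simplification: it uses a plain edit distance between a dictionary word and a candidate term, standing in for the modified Ney DTW (which additionally lets word matches start and end anywhere in the identifier):

```python
def alignment_cost(word: str, term: str) -> int:
    """Cumulative cost of aligning a dictionary word with a term taken
    from an identifier (unit cost for insert/delete/substitute).

    alignment_cost('pntr', 'pointer') == 3  # the three dropped vowels
    """
    m, n = len(word), len(term)
    # cost[i][j]: cheapest alignment of word[:i] with term[:j]
    cost = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        cost[i][0] = i
    for j in range(n + 1):
        cost[0][j] = j
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            subst = 0 if word[i - 1] == term[j - 1] else 1
            cost[i][j] = min(cost[i - 1][j] + 1,        # skip a word char
                             cost[i][j - 1] + 1,        # skip a term char
                             cost[i - 1][j - 1] + subst)
    return cost[m][n]
```

A zero cost means the term is an exact dictionary word; a small non-zero cost flags a likely contraction worth expanding.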
9. Meta-heuristic Inspired Approach
Word Transformation Rules
Constraint: the string must remain at least 3 characters long.
Drop all vowels: pointer → pntr
Drop a random vowel: user → usr
Drop a random character: pntr → ptr
Drop a suffix (ing, tion, ed, ment, able): available → avail
Drop the last m characters: rectangle → rect
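The five rules can be sketched directly. This is a minimal illustration: the rule-selection and retry policy are assumptions, since the deck only lists the rules and the 3-character constraint:

```python
import random

VOWELS = set('aeiou')
SUFFIXES = ('tion', 'ment', 'able', 'ing', 'ed')

def drop_all_vowels(word, rng):
    return ''.join(c for c in word if c not in VOWELS)   # pointer -> pntr

def drop_random_vowel(word, rng):
    positions = [i for i, c in enumerate(word) if c in VOWELS]
    if not positions:
        return word
    i = rng.choice(positions)                            # user -> usr
    return word[:i] + word[i + 1:]

def drop_random_char(word, rng):
    i = rng.randrange(len(word))                         # pntr -> ptr
    return word[:i] + word[i + 1:]

def drop_suffix(word, rng):
    for suffix in SUFFIXES:                              # available -> avail
        if word.endswith(suffix):
            return word[:-len(suffix)]
    return word

def drop_last_m_chars(word, rng):
    if len(word) <= 3:
        return word
    m = rng.randrange(1, len(word) - 2)                  # rectangle -> rect
    return word[:-m]

RULES = [drop_all_vowels, drop_random_vowel, drop_random_char,
         drop_suffix, drop_last_m_chars]

def transform(word, rng):
    """Apply one randomly chosen rule, enforcing the constraint that
    the result stays at least 3 characters long."""
    for _ in range(20):                                  # bounded retries
        candidate = rng.choice(RULES)(word, rng)
        if len(candidate) >= 3 and candidate != word:
            return candidate
    return word
```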
10. Meta-heuristic Inspired Approach – Technologies
Overall Splitting (Hill Climbing) Procedure
1. Match the identifier against the current dictionary using DTW (best matching).
2. If the distance is zero: success, the split is found.
3. Otherwise, randomly select a word with a minimal non-zero distance.
4. Apply a random transformation to the chosen word and add the transformed word to a temporary dictionary.
5. Recompute the best DTW match: if the distance is reduced, keep the transformed word; otherwise, discard it from the temporary dictionary and, if other transformations remain to apply, try another one.
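Put together, the hill-climbing procedure can be sketched as follows. This is a simplified reconstruction, not the authors' implementation: segmentation cost uses a plain edit distance instead of the modified Ney DTW, and only the drop-all-vowels rule is wired in:

```python
import random

def edit_distance(a: str, b: str) -> int:
    # Plain Levenshtein distance, standing in for the modified Ney DTW.
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1, curr[j - 1] + 1,
                            prev[j - 1] + (ca != cb)))
        prev = curr
    return prev[-1]

def split_cost(identifier, words):
    """Best segmentation of the identifier: each segment is charged its
    distance to the closest known word. Returns (cost, chosen words)."""
    n = len(identifier)
    cost = [0] + [float('inf')] * n
    back = [None] * (n + 1)
    for end in range(1, n + 1):
        for start in range(max(0, end - 10), end):
            seg = identifier[start:end]
            word = min(words, key=lambda w: edit_distance(seg, w))
            d = cost[start] + edit_distance(seg, word)
            if d < cost[end]:
                cost[end], back[end] = d, (start, word)
    chosen, end = [], n
    while end:
        start, word = back[end]
        chosen.append(word)
        end = start
    return cost[n], chosen[::-1]

def hill_climb(identifier, dictionary, rng, max_iters=200):
    """Hill climbing over a growing dictionary of contracted words;
    each contraction remembers the dictionary word it came from."""
    origin = {w: w for w in dictionary}          # contraction -> original
    best_cost, best_split = split_cost(identifier, origin)
    for _ in range(max_iters):
        if best_cost == 0:                       # exact split found
            break
        w = rng.choice(list(origin))
        t = ''.join(c for c in w if c not in 'aeiou')  # drop-all-vowels
        if len(t) < 3 or t in origin:            # 3-char constraint
            continue
        trial = dict(origin)
        trial[t] = origin[w]
        c, s = split_cost(identifier, trial)
        if c < best_cost:                        # keep only improving moves
            origin, best_cost, best_split = trial, c, s
    return [origin[w] for w in best_split]
```

With the hypothetical dictionary ['pointer', 'counter'], `hill_climb('pntrcntr', ...)` contracts both words, reaches a zero-distance match, and maps the split back to the original words.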
11. Case Study – Research Questions
RQ1: What is the percentage of identifiers correctly split by the proposed approach?
RQ2: How does the proposed approach perform compared with the Camel Case splitter?
RQ3: What percentage of identifiers containing word abbreviations is the approach able to map to dictionary words?
12. Case Study – Results
JHotDraw (Java): 16 KLOC, 155 files, 2,348 identifiers (longer than 2 chars), 957 manually segmented identifiers.
Lynx (C): 174 KLOC, 247 files, 12,194 identifiers (longer than 2 chars), 3,085 manually segmented identifiers.
13. Case Study – Results
RQ1 – Percentage of Correct Classifications

Systems   | Split Ids | Single iteration | Multiple iterations | Errors
JHotDraw  | 957       | 891 (93%)        | 920 (95%)           | 37
Lynx      | 3,085     | 2,169 (70%)      | 2,901 (94%)         | 271

Typical cases where the approach failed: afaik, ihmo, foobar, fsize, …
14. Case Study – Results
RQ2 – Camel Case Split

Systems   | Split Ids | Correct Split | Errors
JHotDraw  | 957       | 874 (91%)     | 83
Lynx      | 3,085     | 561 (18%)     | 2,524

Statistical comparison (Fisher's exact test) with our approach:
Null hypothesis (H0): the proportions of correct splittings obtained by the two approaches are not significantly different.
• JHotDraw: Odds Ratio = 1.3, p-value = 0.1
• Lynx: Odds Ratio = 60, p-value < 0.001
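As a small sanity check, the JHotDraw odds ratio can be reproduced from the reported counts (assuming the comparison used the single-iteration results from RQ1, which is what matches the reported 1.3):

```python
# Contingency table for JHotDraw (counts from the RQ1/RQ2 slides):
# proposed approach, single iteration: 891 correct of 957
# Camel Case splitter:                 874 correct of 957
a, b = 891, 957 - 891   # proposed: correct, errors
c, d = 874, 957 - 874   # Camel Case: correct, errors
odds_ratio = (a * d) / (b * c)
print(round(odds_ratio, 1))  # -> 1.3, as reported on the slide
```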
15. Case Study – Results
RQ3 – Percentage of Correctly Split Identifiers

Systems   | Split Ids | Correct Split | Errors
JHotDraw  | 957       | 920 (95%)     | 37
Lynx      | 3,085     | 2,901 (94%)   | 271

The novel identifier splitting approach performs better than the Camel Case splitter.
16. Case Study – Results
Multiple Possible Splits – Successes

borddec       → bord decimal | bord decision
anchorlen     → anchor length | anchor lender
drawrect      → draw rectangle
drawroundrect → draw round rectangle
fillrect      → fill rectangle
javadrawapp   → java draw apply | java draw append
netapp        → net apply | net append
newlen        → new length | new lender
nothingapp    → nothing apply | nothing application
addcolumninfo → add column information | add column inform
addlbl        → add label
casecomp      → case compare | case complete

Max of 10,000 iterations
17. Case Study – Results
Multiple Possible Splits – Failures

serialversionuid            → serial version did
selectionzordered           → selection ordered
removefrfigurerequestremove → remove figure request remove
jhotdraw                    → hot draw
getvadjustable              → get bad just able
fimagewidth                 → him age width
fimageheight                → him age height
writeref                    → write red

Max of 10,000 iterations
DTW does not account for context, syntax, or semantics.
18. Case Study – Results
Discussion – Challenges
How can we expand fwrite or pdraw?
How can we avoid expanding FileLen into File Lender rather than File Length?
How can we recognize that ImagEdit has a correct split at distance 1 and not 0?
How can we expand/split pqrstuvwxyz?
19. Case Study – Results
Threats to Validity
External validity:
We analyzed only two systems;
However, they come from different domains and use different programming languages.
Construct validity: errors may be present in the oracle!
We detected a 1% error rate in the first oracle release;
We did our best to guess the programmers' intentions, but we cannot exclude errors.
Reliability validity: a replication package is available.
Internal validity: subjectivity and bias in building the oracle:
The same researcher built both oracles;
The oracles were validated by two other researchers;
The oracles are large enough that a few percent of errors would not change the conclusions.
20. Conclusion and Future Work
Conclusion
We presented a search-based approach to automatically segment source code identifiers.
The novel approach is inspired by developer behavior when composing identifiers.
The approach uses a dictionary, a distance computed via DTW, and a set of word transformations.
Results on JHotDraw and Lynx show the superiority of the approach over a simple Camel Case splitter.
21. Conclusion and Future Work
Future Work
We plan to:
Expand the evaluation to other systems.
Introduce enhanced heuristics for term selection and word transformations.
Contextualize our search by coupling our algorithm with the approach of Enslen et al. [ELK, 2009] (restrict the search to the words used in the same method, class, or package).
23. References
[ELK, 2009] E. Enslen, E. Hill, L. Pollock, and K. Vijay-Shanker, "Mining source code to automatically split identifiers for software analysis," in Proc. of the International Working Conference on Mining Software Repositories (MSR), 2009, pp. 71–80.
[H. Ney, 1984] H. Ney, "The use of a one-stage dynamic programming algorithm for connected word recognition," IEEE Transactions on Acoustics, Speech and Signal Processing, vol. 32, no. 2, pp. 263–271, Apr. 1984.
D. Lawrie, C. Morrell, H. Feild, and D. Binkley, "Effective identifier names for comprehension and memory," Innovations in Systems and Software Engineering, vol. 3, no. 4, pp. 303–318, 2007.
D. Lawrie, C. Morrell, H. Feild, and D. Binkley, "What's in a name? A study of identifiers," in Proc. of the International Conference on Program Comprehension (ICPC), 2006, pp. 3–12.
24. Overall Splitting (Hill Climbing) Procedure
[Figure: flowchart of the splitting procedure: the identifier is matched against the dictionary via DTW; a zero distance means success; otherwise, words from the ranked word list are transformed and saved into a new dictionary when the match improves, or discarded otherwise.]