This PhD thesis proposes a method for involving end-users in domain-specific language (DSL) development. The method combines agile and model-driven development approaches. It includes stages for analysis, design, and validation. In the analysis stage, end-users provide requirements through user stories, usage scenarios, and a domain model. The design stage specifies syntax and semantics based on these requirements. Validation tests the DSL with end-users. The goal is to guide DSL development throughout the lifecycle while gathering domain experts' knowledge and feedback.
ICGSE 2020: On the Detection of Community Smells Using Genetic Programming-bas... (Ali Ouni)
This document presents a study on detecting community smells in software projects using a genetic programming-based ensemble classifier chain (GP-ECC) approach. Community smells refer to poor social and organizational practices that can negatively impact a project. The study aims to automatically infer detection rules for different community smell types from examples. It formulates the detection problem as a multi-label learning task and uses GP to evolve detection rules. The approach is evaluated on 103 open source projects and shows 87% precision and 91% recall. Further analysis indicates low variability in detecting different smell types and identifies influential features. The study concludes GP-ECC is promising for community smell detection but more evaluation on diverse projects is needed.
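The classifier-chain formulation above can be illustrated with a toy sketch: each label's classifier sees the original features plus the predictions already made for earlier labels in the chain. The threshold classifiers, feature indices, and smell names below are hypothetical stand-ins, not the GP-evolved rules from the study.

```python
# Illustrative sketch of a classifier chain for multi-label smell detection.
# The classifiers and thresholds are invented for illustration.

def make_threshold_classifier(feature_index, threshold):
    """Return a trivial classifier: label = 1 if the feature exceeds the threshold."""
    def classify(features):
        return 1 if features[feature_index] > threshold else 0
    return classify

def classifier_chain(classifiers, features):
    """Predict labels in sequence; each prediction is appended to the
    feature vector seen by the next classifier in the chain."""
    extended = list(features)
    labels = []
    for clf in classifiers:
        label = clf(extended)
        labels.append(label)
        extended.append(label)  # later classifiers see earlier predictions
    return labels

# Example: two community-metric features, three chained smell detectors;
# the third classifier reads the first classifier's prediction (index 2).
chain = [
    make_threshold_classifier(0, 0.5),
    make_threshold_classifier(1, 10),
    make_threshold_classifier(2, 0),
]
print(classifier_chain(chain, [0.8, 4]))  # -> [1, 0, 1]
```

The chaining is what lets the ensemble exploit correlations between smell types, which a set of independent per-label classifiers cannot do.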
The document presents an approach called Convolutional Analysis of code Metrics Evolution (CAME) that uses a convolutional neural network to detect anti-patterns by analyzing the historical evolution of source code metrics at the class level. An evaluation on 7 open-source systems shows that considering longer histories of metrics improves detection performance and that CAME outperforms other machine learning and anti-pattern detection techniques in terms of precision, recall, and F-measure.
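The core intuition (convolving over a metric's history rather than inspecting a single snapshot) can be sketched in a few lines. The difference kernel and LOC values below are illustrative assumptions, not CAME's learned filters.

```python
# Minimal sketch of the idea behind CAME: slide a 1-D convolution kernel
# over the history of one code metric (e.g. LOC per release) so that
# trends, not just snapshots, drive detection.

def conv1d(series, kernel):
    """Valid-mode 1-D convolution (cross-correlation, as in most CNN
    libraries) of a metric history with a small kernel."""
    k = len(kernel)
    return [sum(series[i + j] * kernel[j] for j in range(k))
            for i in range(len(series) - k + 1)]

# A difference kernel highlights growth between consecutive releases;
# a class whose size keeps climbing yields large positive responses.
loc_history = [120, 125, 180, 400, 950]
print(conv1d(loc_history, [-1, 1]))  # -> [5, 55, 220, 550]
```

In the actual approach the kernels are learned from labeled examples rather than fixed by hand; this sketch only shows why a longer history gives the convolution more signal to work with.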
This document summarizes the findings of a tertiary review on code smells and refactoring. It analyzed 40 secondary studies on the topic. Key findings include: 1) Common refactoring topics investigated were opportunities, techniques, and tools, while common smell topics were detection, descriptions, and support tools. 2) Popular smell detection tools were CCFinder and JDeodorant, while there were fewer refactoring tools. 3) The relationship between smells and refactoring is complex, as not all smells require refactoring and refactoring can sometimes negatively impact quality. The review identified implications for practitioners, researchers, and educators, as well as open issues in the field.
A Multi-Objective Refactoring Approach to Introduce Design Patterns and Fix A... (Ali Ouni)
This document presents a multi-objective search-based software engineering approach to recommend refactorings that introduce design patterns, fix anti-patterns, and improve software quality attributes. The approach uses NSGA-II genetic algorithm to evolve refactoring solutions. An empirical evaluation on 4 open-source Java systems shows the approach outperforms existing techniques in fixing anti-patterns and introducing design patterns while improving quality. Future work includes expanding the types of patterns addressed and developing interactive refactoring support.
This document presents Trustrace, an approach that uses software repository links to improve the trust in automatically recovered traceability links. Trustrace calculates trust values for traceability links based on their similarity scores from information retrieval techniques as well as evidence from other sources like version control commit logs. An empirical study on two systems found that Trustrace improved precision and recall over vector space models and reduced an expert's effort to validate links by up to 50%. The results also tended to improve when using larger version control commit logs.
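As a rough illustration of the idea (not Trustrace's actual formula), an IR similarity score for a candidate traceability link can be reweighted by corroborating evidence from commit logs. The combination rule and numbers below are assumptions made for the sketch.

```python
# Hedged sketch of the Trustrace idea: start from an IR similarity score
# for a candidate requirement-to-code link, then boost it by how often
# the linked artifacts co-occur in other sources (e.g. commits touching
# both artifacts). The formula is a simple illustration only.

def trust_score(ir_similarity, co_occurrences, total_commits):
    """Boost an IR similarity by corroborating repository evidence."""
    if total_commits == 0:
        return ir_similarity  # no evidence available, keep the raw score
    support = co_occurrences / total_commits  # fraction of corroborating commits
    return ir_similarity * (1 + support)

# A moderately similar link with strong repository evidence can outrank
# a slightly stronger link that no commit ever corroborated.
print(trust_score(0.5, 50, 100))  # -> 0.75
print(trust_score(0.6, 0, 100))   # -> 0.6
```

This also suggests why larger commit logs tended to help in the study: more commits give a more stable estimate of the corroborating evidence.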
Search and Hyperlinking Task at MediaEval 2012
This document summarizes the Search and Hyperlinking Task at the MediaEval 2012 evaluation. It describes the two subtasks of search and linking of videos using automatic speech recognition transcripts and video clues. Participants were evaluated on their ability to search and retrieve relevant video segments based on queries, and to link related video segments for given anchor videos. The task aimed to provide a unified scenario for search and linking using crowdsourcing to assess results relevance. Several research groups participated by submitting runs for the search and linking subtasks.
The Use of Development History in Software Refactoring Using a Multi-Objectiv... (Ali Ouni)
The document presents a multi-objective approach to automate software refactoring using evolutionary algorithms. It formulates refactoring as a multi-objective optimization problem to improve code quality, preserve semantics, and maximize reuse of past development history. An evaluation on two open source projects shows the approach corrects most defects while maintaining high refactoring precision compared to existing techniques. Future work includes leveraging refactoring histories from multiple systems and improving context-based similarity measures.
Transformers to Learn Hierarchical Contexts in Multiparty Dialogue (Jinho Choi)
The document presents an approach using transformers to learn hierarchical contexts in multiparty dialogue. It proposes new pre-training tasks to improve token-level and utterance-level embeddings for handling dialogue contexts. A multi-task learning approach is introduced to fine-tune the language model for a Friends question answering (FriendsQA) task using dialogue evidence, outperforming BERT and RoBERTa. However, the approach shows no improvement on other character mining tasks from Friends. Future work is needed to better represent speakers and inferences in dialogue.
A Mono- and Multi-objective Approach for Recommending Software Refactoring (Ali Ouni)
This document outlines Ali Ouni's Ph.D. defense presentation on recommending software refactoring using mono-objective and multi-objective approaches. The presentation includes the following key points:
1. It provides context on the need for automated software refactoring recommendation systems to address challenges in manually refactoring code.
2. It describes Ouni's research methodology which involves detecting code smells, generating refactoring recommendations using mono-objective and multi-objective search-based techniques, and evaluating the approaches.
3. It covers code smell detection including generating detection rules using genetic programming from code smell examples, and evaluating the detection approach on several systems.
4. It outlines the structure of the remainder of the presentation.
Context-Aware Source Code Vocabulary Normalization... (Latifa Guerrouj / Ptidej Team)
This document summarizes the contributions of Latifa Guerrouj's PhD thesis on context-aware source code vocabulary normalization. The thesis introduced two context-aware approaches for vocabulary normalization: TIDIER and TRIS. TIDIER is inspired by speech recognition and uses context-aware dictionaries and hill climbing to normalize identifiers. Experiments showed TIDIER outperformed previous approaches and correctly mapped 48% of abbreviations. TRIS treats normalization as an optimization problem to minimize a cost function. Experiments found TRIS had higher accuracy than state-of-the-art approaches like CamelCase and Samurai, with a medium to large effect size on C code.
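The normalization task itself can be sketched simply: split an identifier into terms, then expand abbreviations against a dictionary. The tiny dictionary below is a hypothetical stand-in; TIDIER searches far richer context-aware dictionaries with hill climbing rather than doing a direct lookup.

```python
# Illustrative sketch of source code vocabulary normalization: split an
# identifier on underscores and camelCase boundaries, then expand known
# abbreviations. The abbreviation table is invented for illustration.
import re

ABBREVIATIONS = {"ptr": "pointer", "cnt": "counter", "init": "initialize"}

def split_identifier(name):
    """Split on underscores and lowercase-to-uppercase camelCase boundaries."""
    spaced = name.replace("_", " ")
    spaced = re.sub(r"([a-z0-9])([A-Z])", r"\1 \2", spaced)
    return spaced.lower().split()

def normalize(name):
    """Map each term to its dictionary expansion, if one is known."""
    return [ABBREVIATIONS.get(term, term) for term in split_identifier(name)]

print(normalize("initPtrCnt"))  # -> ['initialize', 'pointer', 'counter']
```

The hard cases the thesis targets are exactly those a lookup table cannot handle: ambiguous abbreviations whose expansion depends on the surrounding code context.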
This document examines cross-project defect prediction, where a model trained on one project is used to predict defects in another project. In an experiment of 622 cross-project combinations across 12 systems, only 3.4% had successful predictions. However, certain project similarities like domain and characteristics like code reviews increased prediction precision and recall. Decision trees were created to help select project combinations where cross-project prediction is more likely to succeed based on characteristics like intended audience and pre-release bug metrics.
Quality in use of domain-specific languages: a case study (Ankica Barisic)
The document describes an evaluation of the usability of the Pheasant visual query language for high energy physics compared to object-oriented coding in C++/BEE. Researchers evaluated the effectiveness, efficiency, and confidence of physicists in completing query tasks in both languages. Results showed that Pheasant allowed non-programmers to correctly complete queries more effectively and efficiently compared to C++/BEE. Participants also reported higher confidence in using Pheasant over C++/BEE. The evaluation provides evidence that usability is a key factor in evaluating domain-specific languages.
An Empirical Investigation on Documentation Usage Patterns in Maintenance Tasks (Sebastiano Panichella)
When developers perform a software maintenance task, they need to identify the artifacts (e.g., classes or, more specifically, methods) that need to be modified. To this aim, they can browse various kinds of artifacts, for example use case descriptions, UML diagrams, or source code.

This paper reports the results of a study, conducted with 33 participants, aimed at investigating (i) to what extent developers use different kinds of documentation when identifying artifacts to be changed, and (ii) whether they follow specific navigation patterns among different kinds of artifacts.

Results indicate that, although developers spent a substantial proportion of the available time focusing on source code, they browsed back and forth between source code and either static (class) or dynamic (sequence) diagrams. Less frequently, developers, especially more experienced ones, followed an "integrated" approach that draws on different kinds of artifacts.
The document discusses social aspects of ecological statistical software development. It outlines that software development exists within a community context and how communities can work better for individuals and the community overall. It notes challenges with R including some language design issues and performance, as well as potential community problems. It also depicts the vast world of R packages and discusses package scope and communities.
This document summarizes the Certified Tester Foundation Level Syllabus, which outlines the key concepts and topics covered in foundation-level certification for software testing, including testing techniques, test management, and quality assurance. It also provides copyright information and a history of revisions to the syllabus. The International Software Testing Qualifications Board maintains and updates the syllabus.
Graphical vs. textual representations were compared in a requirements comprehension study. Subjects (N=28 students) viewed requirements documents presented graphically, textually, or with both. Results showed no significant difference in comprehension accuracy between representations. However, graphical representations required significantly more visual effort as measured by eye movements. Subjects also preferred graphical representations but found them more difficult. The document structure influenced whether subjects adopted a top-down or bottom-up problem-solving strategy.
Using Interactive Genetic Algorithm for Requirements Prioritization (Francis Palma)
This document describes using a genetic algorithm to prioritize requirements. It begins with an outline and introduction to the problem of prioritizing software requirements. It then discusses related prioritization techniques from literature and how they have limitations like poor scalability. The document proposes using a genetic algorithm to prioritize requirements, leveraging domain knowledge graphs representing priority and dependencies. It describes representing potential solutions as individuals in a population, calculating fitness by counting disagreements with domain knowledge, and using genetic operators like crossover to evolve better solutions over generations. The goal is to find a prioritized requirements list that best satisfies constraints and delivers value to users.
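The fitness computation described above, counting disagreements with the domain knowledge, can be sketched as follows. The requirement names and precedence constraints are invented for illustration; the actual approach also encodes priorities and dependencies as graphs.

```python
# Sketch of a disagreement-counting fitness function for requirements
# prioritization: a candidate solution is an ordering of requirements,
# and fitness counts how many pairwise constraints it violates
# (fewer violations = fitter individual).

def count_disagreements(ordering, constraints):
    """constraints: pairs (a, b) meaning 'a should come before b'."""
    position = {req: i for i, req in enumerate(ordering)}
    return sum(1 for a, b in constraints if position[a] > position[b])

# Hypothetical domain knowledge: login precedes profile and payment,
# search precedes filter.
constraints = [("login", "profile"), ("login", "payment"), ("search", "filter")]

print(count_disagreements(
    ["login", "search", "profile", "filter", "payment"], constraints))  # -> 0
print(count_disagreements(
    ["payment", "filter", "login", "search", "profile"], constraints))  # -> 2
```

Genetic operators such as crossover and mutation then recombine orderings, and selection favors individuals with fewer disagreements over successive generations.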
Web Service Antipatterns Detection Using Genetic Programming (Ali Ouni)
The document describes a study that uses genetic programming to automatically detect antipatterns in web services. It presents an approach that infers detection rules from examples of antipattern instances. The approach was evaluated on 310 real-world web services and showed promising results, detecting antipatterns with 85% precision and 87% recall on average. The study demonstrates that genetic programming can effectively generate rules for detecting common antipatterns like multi-service, nanoservice, and chatty service. Future work could expand the approach to more antipattern types and service metrics.
Using Interactive GA for Requirements Prioritization (Francis Palma)
The document discusses using an interactive genetic algorithm (IGA) approach for requirements prioritization. The IGA aims to minimize disagreement between a total order of prioritized requirements and various constraints, such as those encoded with the requirements or expressed by users. It considers user knowledge, minimizes user requests, and ensures robustness to errors. The IGA process involves acquiring requirements and domain knowledge, running an interactive genetic algorithm to compute solutions, and outputting the ranked requirements. It is demonstrated on a real-world case study where IGA produced improved prioritizations with reduced user effort compared to other approaches.
PhD: Maintainability of transformations in evolving MDE ecosystems (Jokin García Pérez)
- Co-evolving transformations with metamodel evolution
- An adapter-based approach to co-evolve generated SQL in model-to-text transformations
- Testing model-to-text transformations
Who Should Review My Code? A file-location based code-reviewer recommendation approach for modern code review.
This research study was presented at the 22nd IEEE International Conference on Software Analysis, Evolution, and Reengineering (SANER 2015). More information and a preprint are available at patanamon.com.
Cross-project defect prediction is very appealing because (i) it allows predicting defects in projects for which the availability of data is limited, and (ii) it allows producing generalizable prediction models. However, existing research suggests that cross-project prediction is particularly challenging and, due to heterogeneity of projects, prediction accuracy is not always very good. This paper proposes a novel, multi-objective approach for cross-project defect prediction, based on a multi-objective logistic regression model built using a genetic algorithm. Instead of providing the software engineer with a single predictive model, the multi-objective approach allows software engineers to choose predictors achieving a compromise between number of likely defect-prone artifacts (effectiveness) and LOC to be analyzed/tested (which can be considered as a proxy of the cost of code inspection). Results of an empirical evaluation on 10 datasets from the Promise repository indicate the superiority and the usefulness of the multi-objective approach with respect to single-objective predictors. Also, the proposed approach outperforms an alternative approach for cross-project prediction, based on local prediction upon clusters of similar classes.
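The trade-off the paper exposes (effectiveness versus inspection cost) amounts to presenting a Pareto front of models rather than a single predictor. A minimal sketch of non-dominated filtering follows; the candidate models and their scores are illustrative numbers, not results from the paper.

```python
# Sketch of Pareto-front selection for the two objectives described above:
# effectiveness (defect-prone artifacts found, higher is better) and
# cost (LOC to inspect, lower is better). The engineer then picks a
# model from the non-dominated set.

def dominates(a, b):
    """a = (effectiveness, cost); a dominates b if it is no worse on both
    objectives and strictly better on at least one."""
    return (a[0] >= b[0] and a[1] <= b[1]) and (a[0] > b[0] or a[1] < b[1])

def pareto_front(candidates):
    """Keep every candidate not dominated by any other candidate."""
    return [c for c in candidates
            if not any(dominates(other, c) for other in candidates if other != c)]

# Illustrative models: (defect-prone classes found, LOC to inspect).
models = [(50, 1000), (70, 3000), (60, 5000), (90, 9000)]
print(pareto_front(models))  # -> [(50, 1000), (70, 3000), (90, 9000)]
```

Here (60, 5000) is dropped because (70, 3000) finds more defect-prone classes at lower inspection cost; the surviving models each represent a defensible effectiveness/cost compromise.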
This document discusses defect prediction models in software development. It begins by covering the importance of effort estimation in software maintenance planning and management. The document then discusses how data from software defect reports, including details on defects, components, testers and fixes, can be used to build reliability models to predict remaining defects. Machine learning and data mining techniques are proposed to analyze relationships between software quality across releases and to construct predictive models for forecasting time to fix defects. The document provides an overview of typical software development processes and then discusses a two-step approach to defect prediction and analysis using appropriate statistics and data mining techniques.
Deep learning based code smell detection - Qualifying Talk (Sayed Mohsin Reza)
Presented by Sayed Mohsin Reza, Ph.D. student in Computer Science, University of Texas.
Abstract:
Code smells are structures in the source code that suggest the possibility of refactoring. Consequently, developers may identify refactoring opportunities by detecting code smells. However, manual identification of code smells is challenging and tedious. To this end, a number of approaches have been proposed to identify code smells automatically or semi-automatically. Most such approaches rely on manually designed heuristics to map manually selected source code metrics into predictions. However, it is challenging to manually select the best features, and it is also difficult to manually construct the optimal heuristics. To this end, in this paper we propose a novel deep-learning-based approach to detecting code smells. The key insight is that deep neural networks and advanced deep learning techniques can automatically select features of source code for code smell detection and automatically build the complex mapping between such features and predictions. A big challenge for deep-learning-based smell detection is that deep learning often requires a large amount of labeled training data (to tune the large number of parameters within the employed deep neural network), whereas existing datasets for code smell detection are rather small. To this end, we propose an automatic approach to generating labeled training data for the neural-network-based classifier that does not require any human intervention. As an initial try, we apply the proposed approach to four common and well-known code smells, i.e., feature envy, long method, large class, and misplaced class. Evaluation results on open-source applications suggest that the proposed approach significantly improves on the state of the art.
The document discusses the benefits of exercise for mental health. Regular physical activity can help reduce anxiety and depression and improve mood and cognitive functioning. Exercise causes chemical changes in the brain that may help protect against mental illness and improve symptoms.
This document summarizes the potential for using genetic engineering to improve malt quality by modifying barley components. It discusses how genetic engineering technologies have advanced to allow transformation of barley. Some key areas that could potentially be improved through genetic engineering include decreasing beta-glucan levels in cell walls, increasing heat stability or activity of beta-glucanase enzymes, modifying hordein proteins to reduce processing problems, and altering starch properties to improve modification. However, many limitations and challenges remain due to insufficient understanding of the complex biochemical pathways and interactions involved.
The document provides an overview of Paper 1 of the DSE English exam. It includes:
- An introduction to Paper 1, which consists of two parts worth 10% each and tests reading comprehension. Candidates choose between the easier Part B1 or more difficult Part B2.
- Suggested time allocations of 40 minutes for Part A and 50 minutes for Part B.
- An overview of basic answering techniques for multiple choice and short answer questions, including predicting, skimming, and scanning.
- Examples of common multiple choice question types like determining tone and style.
The document aims to inform students about the structure and requirements of the Paper 1 exam and provide test-taking strategies.
A Mono- and Multi-objective Approach for Recommending Software RefactoringAli Ouni
This document outlines Ali Ouni's Ph.D. defense presentation on recommending software refactoring using mono-objective and multi-objective approaches. The presentation includes the following key points:
1. It provides context on the need for automated software refactoring recommendation systems to address challenges in manually refactoring code.
2. It describes Ouni's research methodology which involves detecting code smells, generating refactoring recommendations using mono-objective and multi-objective search-based techniques, and evaluating the approaches.
3. It covers code smell detection including generating detection rules using genetic programming from code smell examples, and evaluating the detection approach on several systems.
4. It outlines the presentation including discussing
130817 latifa guerrouj - context-aware source code vocabulary normalization...Ptidej Team
This document summarizes the contributions of Latifa Guerrouj's PhD thesis on context-aware source code vocabulary normalization. The thesis introduced two context-aware approaches for vocabulary normalization: TIDIER and TRIS. TIDIER is inspired by speech recognition and uses context-aware dictionaries and hill climbing to normalize identifiers. Experiments showed TIDIER outperformed previous approaches and correctly mapped 48% of abbreviations. TRIS treats normalization as an optimization problem to minimize a cost function. Experiments found TRIS had higher accuracy than state-of-the-art approaches like CamelCase and Samurai, with a medium to large effect size on C code.
This document examines cross-project defect prediction, where a model trained on one project is used to predict defects in another project. In an experiment of 622 cross-project combinations across 12 systems, only 3.4% had successful predictions. However, certain project similarities like domain and characteristics like code reviews increased prediction precision and recall. Decision trees were created to help select project combinations where cross-project prediction is more likely to succeed based on characteristics like intended audience and pre-release bug metrics.
Quality in use of domain-specific languages: a case studyAnkica Barisic
The document describes an evaluation of the usability of the Pheasant visual query language for high energy physics compared to object-oriented coding in C++/BEE. Researchers evaluated the effectiveness, efficiency, and confidence of physicists in completing query tasks in both languages. Results showed that Pheasant allowed non-programmers to correctly complete queries more effectively and efficiently compared to C++/BEE. Participants also reported higher confidence in using Pheasant over C++/BEE. The evaluation provides evidence that usability is a key factor in evaluating domain-specific languages.
An Empirical Investigation on Documentation Usage Patterns in Maintenance TasksSebastiano Panichella
When developers perform a software maintenance
task, they need to identify artifacts—e.g., classes or more specifically
methods—that need to be modified. To this aim, they
can browse various kind of artifacts, for example use case
descriptions, UML diagrams, or source code.
This paper reports the results of a study—conducted with 33
participants— aimed at investigating (i) to what extent developers
use different kinds of documentation when identifying artifacts
to be changed, and (ii) whether they follow specific navigation
patterns among different kinds of artifacts.
Results indicate that, although developers spent a conspicuous
proportion of the available time by focusing on source code,
they browse back and forth between source code and either
static (class) or dynamic (sequence) diagrams. Less frequently,
developers—especially more experienced ones—follow an “integrated”
approach by using different kinds of artifacts.
The document discusses social aspects of ecological statistical software development. It outlines that software development exists within a community context and how communities can work better for individuals and the community overall. It notes challenges with R including some language design issues and performance, as well as potential community problems. It also depicts the vast world of R packages and discusses package scope and communities.
This document provides a 3-sentence summary of the Certified Tester Foundation Level Syllabus document:
The syllabus outlines the key concepts and topics covered in foundation level certification for software testing, including testing techniques, test management, and quality assurance. It provides the copyright information and history of revisions to the certification syllabus. The International Software Testing Qualifications Board maintains and updates the syllabus.
Graphical vs. textual representations were compared in a requirements comprehension study. Subjects (N=28 students) viewed requirements documents presented graphically, textually, or with both. Results showed no significant difference in comprehension accuracy between representations. However, graphical representations required significantly more visual effort as measured by eye movements. Subjects also preferred graphical representations but found them more difficult. The document structure influenced whether subjects adopted a top-down or bottom-up problem-solving strategy.
Using Interactive Genetic Algorithm for Requirements PrioritizationFrancis Palma
This document describes using a genetic algorithm to prioritize requirements. It begins with an outline and introduction to the problem of prioritizing software requirements. It then discusses related prioritization techniques from literature and how they have limitations like poor scalability. The document proposes using a genetic algorithm to prioritize requirements, leveraging domain knowledge graphs representing priority and dependencies. It describes representing potential solutions as individuals in a population, calculating fitness by counting disagreements with domain knowledge, and using genetic operators like crossover to evolve better solutions over generations. The goal is to find a prioritized requirements list that best satisfies constraints and delivers value to users.
Web Service Antipatterns Detection Using Genetic ProgrammingAli Ouni
The document describes a study that uses genetic programming to automatically detect antipatterns in web services. It presents an approach that infers detection rules from examples of antipattern instances. The approach was evaluated on 310 real-world web services and showed promising results, detecting antipatterns with 85% precision and 87% recall on average. The study demonstrates that genetic programming can effectively generate rules for detecting common antipatterns like multi-service, nanoservice, and chatty service. Future work could expand the approach to more antipattern types and service metrics.
Using Interactive GA for Requirements Prioritization - Francis Palma
The document discusses using an interactive genetic algorithm (IGA) approach for requirements prioritization. The IGA aims to minimize disagreement between a total order of prioritized requirements and various constraints, such as those encoded with the requirements or expressed by users. It considers user knowledge, minimizes user requests, and ensures robustness to errors. The IGA process involves acquiring requirements and domain knowledge, running an interactive genetic algorithm to compute solutions, and outputting the ranked requirements. It is demonstrated on a real-world case study where IGA produced improved prioritizations with reduced user effort compared to other approaches.
PhD: Maintainability of transformations in evolving MDE ecosystems - Jokin García Pérez
- Co-evolving transformations with metamodel evolution
- Adapter-based approach to co-evolve generated SQL in model-to-text transformations
- Testing model-to-text transformations
Who Should Review My Code? A file-location based code-reviewer recommendation approach for modern code review.
This research study is presented at the 22nd IEEE International Conference on Software Analysis, Evolution, and Reengineering (SANER2015)
Find more information and preprint at patanamon.com
Cross-project defect prediction is very appealing because (i) it allows predicting defects in projects for which the availability of data is limited, and (ii) it allows producing generalizable prediction models. However, existing research suggests that cross-project prediction is particularly challenging and, due to heterogeneity of projects, prediction accuracy is not always very good. This paper proposes a novel, multi-objective approach for cross-project defect prediction, based on a multi-objective logistic regression model built using a genetic algorithm. Instead of providing the software engineer with a single predictive model, the multi-objective approach allows software engineers to choose predictors achieving a compromise between number of likely defect-prone artifacts (effectiveness) and LOC to be analyzed/tested (which can be considered as a proxy of the cost of code inspection). Results of an empirical evaluation on 10 datasets from the Promise repository indicate the superiority and the usefulness of the multi-objective approach with respect to single-objective predictors. Also, the proposed approach outperforms an alternative approach for cross-project prediction, based on local prediction upon clusters of similar classes.
This document discusses defect prediction models in software development. It begins by covering the importance of effort estimation in software maintenance planning and management. The document then discusses how data from software defect reports, including details on defects, components, testers and fixes, can be used to build reliability models to predict remaining defects. Machine learning and data mining techniques are proposed to analyze relationships between software quality across releases and to construct predictive models for forecasting time to fix defects. The document provides an overview of typical software development processes and then discusses a two-step approach to defect prediction and analysis using appropriate statistics and data mining techniques.
Deep learning based code smell detection - Qualifying Talk - Sayed Mohsin Reza
Presented by: Sayed Mohsin Reza, Ph.D. Student, Computer Science, University of Texas
Abstract:
Code smells are structures in the source code that suggest the possibility of refactorings. Consequently, developers may identify refactoring opportunities by detecting code smells. However, manual identification of code smells is challenging and tedious. To this end, a number of approaches have been proposed to identify code smells automatically or semi-automatically. Most of such approaches rely on manually designed heuristics to map manually selected source code metrics into predictions. However, it is challenging to manually select the best features. It is also difficult to manually construct the optimal heuristics. To this end, in this paper we propose a deep learning based novel approach to detecting code smells. The key insight is that deep neural networks and advanced deep learning techniques could automatically select features of source code for code smell detection, and could automatically build the complex mapping between such features and predictions. A big challenge for deep learning based smell detection is that deep learning often requires a large number of labeled training data (to tune a large number of parameters within the employed deep neural network) whereas existing datasets for code smell detection are rather small. To this end, we propose an automatic approach to generating labeled training data for the neural network based classifier, which does not require any human intervention. As an initial try, we apply the proposed approach to four common and well-known code smells, i.e., feature envy, long method, large class, and misplaced class. Evaluation results on open-source applications suggest that the proposed approach significantly improves the state-of-the-art.
Modern elicitation trends - paper presentation - Asma Sajid &amp; Ayesha
The document discusses modern trends in requirement elicitation techniques. It outlines traditional elicitation methods like conversational, observational, analytical and synthetic. It then analyzes the effect of elicitation on projects, the most commonly used methods, and methods for specific project types. Finally, it proposes guidelines for elicitation and a plan combining different techniques based on the project characteristics and development process.
The document summarizes Vlad Acrețoaie's PhD thesis on developing model manipulation languages for end-user modelers. It introduces the Visual Model Query Language (VMQL), Visual Model Constraint Language (VMCL), and Visual Model Transformation Language (VMTL) designed to be more usable for non-programmers. Experiments showed VMQL and VMTL had good learnability and usability. Future work includes improving tool support, more qualitative evaluations, and building a theory of learnability for model manipulation languages. The thesis takes initial steps to develop highly usable languages for end-user modelers.
Ch 6 only - 1. Distinguish between a purpose statement, research problem, and research questions - MaximaSheffield592
Ch 6 only
1. Distinguish between a purpose statement, research problem, and research questions.
2. What are major ideas that should be included in a qualitative purpose statement?
3. What are the major components of a quantitative purpose statement?
4. What are the major components of a mixed methods purpose statement?
Requirements Engineering (20 points)
In Chapter 4 of Software Engineering. Sommerville, Pearson, 2016 (10th edition), Sommerville discusses ethnography as a method for eliciting requirements.
1. Discuss two advantages and two disadvantages of an ethnographic approach. (5 points)
2. Suggest two contexts where ethnography might be a challenging method of requirements engineering. For each context, how would you recommend that your team elicit requirements? (15 points)
Design (20 points)
Design patterns (5 points)
Which of the following statements is (are) true? Explain.
1. StudentsDatabase is the model, StudentsManager is the controller, and WebApplication is the view.
2. StudentsDatabase is the model, StudentsManager is the view, and WebApplication is the controller.
3. StudentsManager is the model, StudentsDatabase is the view, and StudentsManager is the controller.
4. This is not MVC, because StudentsManager must use a listener to be notified when the database changes.
(Credit: EPFL)
Design task (15 points)
Suppose you are asked to design a time management and notetaking system to support (1) scheduling meetings; and (2) tracking the documents associated with those meetings (e.g. agendas, presentations, meeting minutes).[1] The system should accommodate
[1: Such a feature seems like an inevitable development in any messaging platform…]
Make reasonable assumptions as needed.
1. Create a use case for “Schedule meeting”. You might follow the style in Sommerville Figure 7.3. (5 points)
2. Identify the objects in your system. Represent them using a structural diagram showing the associations between objects (“Class diagram” – cf. Sommerville Figure 5.9). (5 points)
3. Draw a sequence diagram showing the interactions between objects when a group of people are arranging a meeting (cf. Sommerville Figure 5.15). (5 points)
1. Implementation (20 points)
Consider the software package is-positive.[2] Examine its source code (see index.js) and its test suite (see test.js), then complete these questions.
[2: https://www.npmjs.com/package/is-positive]
1. Describe the API surface of this package. (2 points)
2. Describe how you would test this package. Describe how and why your approach would change if you maintained a similar package in a different programming language of your choice. (2 points)
3. According to npmjs.com, this package receives over 16,000 downloads each month.
a. Why might an engineer choose to use this package? (4 points)
b. Why might an engineer choose not to use this package? (You may find insights from the chapter ab ...
Ch 6 only - 1. Distinguish between a purpose statement, research problem, and research questions - nand15
This document provides guidance and examples for developing different components of a research proposal or study across qualitative, quantitative, and mixed methods approaches. It discusses key elements such as developing a purpose statement, research questions and hypotheses, reviewing literature, using theory, and addressing ethical considerations. Examples are provided for different types of qualitative studies, quantitative studies using surveys and experiments, and mixed methods studies with convergent, explanatory sequential and exploratory sequential designs. Guidance is also given on writing strategies, developing introductions, and structuring different sections of a research proposal.
This PhD thesis proposes an agile, model-driven method for involving end-users in domain-specific language (DSL) development. The method was used to develop a DSL for genetic analysis pipelines by collaborating with geneticists from three organizations. An empirical experiment validated that end-users and developers were satisfied with the method. The main contributions are the design and implementation of an innovative agile, model-driven method for involving end-users in DSL development, and validating the approach in an industrial setting with a real DSL development.
To take maximum advantage of open source software (OSS), understanding, managing and mitigating OSS adoption risks is crucial. We describe an empirical application of tactical workshops aimed at obtaining domain-expert evaluations.
This document discusses the Dynamic Reconfigurability in Embedded System Design (DRESD) project. It provides an overview of the DRESD philosophy, team structure and partnerships. It also describes some of the key areas of research within the project, including reconfiguration principles, the Earendil design flow, and example projects exploring reconfigurable hardware and simulation frameworks.
This document discusses various system development methodologies and automated tools. It describes methodology approaches like waterfall, parallel development, rapid application development including phased development, prototyping, and agile development using extreme programming. Key criteria for selecting a methodology include the clarity of requirements, technology familiarity, complexity, reliability, schedules, and visibility. The document also outlines computer-aided software engineering tools that automate and standardize development processes, improving quality, documentation, and project management while simplifying maintenance.
Agile Manifesto and Practices Selection for Tailoring Software Development - Manuel Kolp
Agile Manifesto and Practices Selection for Tailoring Software Development: a Systematic Literature Review, PROFES 2018, 19th Int. Conf. on Product-Focused Software Process Improvement, Nov. 28 – 30, Wolfsburg, Germany
Soreangsey Kiv, Samedi Heng, Manuel Kolp and Yves Wautelet
A Study on MDE Approaches for Engineering Wireless Sensor Networks - Ivano Malavolta
27th August 2014. My presentation at SEAA 2014 (http://esd.scienze.univr.it/dsd-seaa-2014) about our a study on model-driven engineering approaches for engineering Wireless Sensor Networks (WSNs).
Accompanying paper: http://ieeexplore.ieee.org/xpl/login.jsp?tp=&arnumber=6928805
Abstract:
Model-Driven Engineering (MDE) can be considered as the right tool to reduce the complexity of Wireless Sensor Network (WSN) development through its principles of abstraction, separation of concerns, reuse and automation. In this paper we present the results of a systematic mapping study we performed for providing an organized view of existing MDE approaches for designing WSNs.
A total number of 780 studies were analysed; among them, we selected 16 papers as primary studies relevant for review. We setup a comparison framework for these studies, and classified them based on a set of common parameters. The main objective of our research is to give an overview about the state-of-the-art of MDE approaches dedicated to WSN design, and finally, discuss emerging challenges that have to be considered in future MDE approaches for engineering WSNs.
Survey Based Review of Elicitation Problems - IJERA Editor
Any software development process combines multiple development activities, and each activity plays a vital role in the software development cycle. Requirement Engineering is a main and foundational branch of Software Engineering; it has many phases, the first of which is Requirement Elicitation, in which requirements are gathered for system development.
This paper provides a literature review of the requirements engineering processes performed in traditional and modern development processes and analyses the problems in the requirements elicitation phase. The analysis is based on a survey conducted at a university: a questionnaire about problems in requirement elicitation was given to final-year computer science graduate students working on their final-year projects. The theoretical analysis of the questionnaire further clarifies the problems, and this problem analysis will help identify the main problems faced by prospective software developers.
This document summarizes a master's thesis that presents a solution for scanning sequences of HTTP requests in the open source penetration testing tool ZAP (Zed Attack Proxy). The thesis documents the analysis, design, and implementation phases of adding multi-step scanning functionality to ZAP. It also explains how different test scenarios were used to verify the functionality. The proposed solution serves as a proof-of-concept that could later be integrated into the publicly available version of ZAP.
The document discusses software project management. It defines what a project and project management are, and describes the key characteristics of a software project. It outlines several software development lifecycles and methodologies including waterfall, prototype, spiral, agile, Scrum, extreme programming (XP), and rapid application development (RAD). It also discusses software project roles, risk management, project monitoring, defining a lifecycle model, software team organization structures, communication and coordination practices, and factors to consider when selecting a lifecycle model.
This document provides a 3 paragraph summary of a software engineering course titled "Software Engineering (KCS-601)" taught by Dr. Radhey Shyam at SRMCEM Lucknow. The course contents were compiled by Dr. Shyam and are available for students' academic use. Students can contact Dr. Shyam via email for any queries regarding the course material.
This document provides an overview and comparison of software development life cycle (SDLC) models. It discusses several SDLC models including waterfall, V-shaped, iterative, prototyping, RAD, spiral and agile, describing each in terms of its phases, advantages and disadvantages. The document also presents related work from other scholars, noting that although the agile process studied was not full Extreme Programming, applying Scrum principles yielded a return on investment and lower costs. It proposes future work to identify knowledge-sharing procedures and user-centered SDLC models that overcome limitations of existing approaches.
Web platform and methodology for the development of context-aware systems - damarcant
This document describes a web platform and methodology for developing context-aware systems through collaboration between programmers and domain experts. It proposes a Situation-Driven Development methodology with 5 stages to guide the collaborative process. It also presents the Context Cloud platform, which was designed based on literature requirements to support automatic context management, reasoning, location detection, end-user development and visualization. An evaluation showed the methodology and platform facilitated involvement of domain experts and allowed situations to be configured without programmer intervention. The contributions enable context-aware system development by both technical and non-technical users.
This document summarizes a thesis on automating test routine creation through natural language processing. The author proposes using word embeddings and recommender systems to automatically generate test cases from requirements documents and link them together. The methodology involves representing text as word vectors, calculating similarity between requirements and test blocks, and applying association rule mining on test block sequences. An experiment on a space operations dataset showed the approach improved productivity in test creation and requirements tracing over manual methods. Future work could explore using deep learning models and collecting additional evaluation metrics from users.
1. Centro de Investigación ProS
An Agile Model-Driven Method for Involving
End-Users in DSL Development
MªJosé Villanueva del Pozo
PhD Thesis, 8 January 2016
Advisors:
Dr. Óscar Pastor López
Dr. Francisco Valverde Giromé
2. Index
1. PhD Motivation
2. PhD Goals
3. State of the Art
4. Method
5. Validation
6. Demonstration
7. Conclusions and Future Work
5. Motivation
"Software languages that target small domains and whose language constructs are formed by domain concepts" (Villanueva, 2016)
"Small languages that offer expressive power focused on a particular problem domain" (Van Deursen, 2000)
DSLs are a solution for improving understanding in software development.
8. Motivation
Industrial motivation, from Instituto de Medicina Genómica (www.imegen.es):
"We have challenges to analyse genetic data"
"We need to use state-of-the-art analytic tools"
"We need a tool highly customizable to each diagnosis"
A single, one-off tool is an unsustainable solution: they need an infrastructure to continuously evolve their genetic analyses.
Proposed solution: a DSL for specifying genetic analyses.
The challenge: we are experts in neither genetics nor bioinformatics.
9. Motivation
Academic motivation:
"We want to develop a DSL for supporting genetic analysis"
"We don't have enough knowledge about genetics"
"The collaboration of geneticists is essential"
"Geneticists don't have enough development knowledge"
We need to involve geneticists in the DSL development process, following a DSL development method that involves end-users. However, current approaches do not take end-users into account.
11. PhD Goals
Propose a DSL development approach to involve end-users that:
1. supports complex application domains, and
2. yields a DSL for supporting genetic analysis.
12. PhD Goals: Research Questions
RQ1. Is it essential to involve end-users in the development of a DSL for a complex application domain?
→ Analyse a complex application domain and illustrate the need to involve end-users in DSL development.
RQ2. What are the available approaches to involve end-users in DSL development?
→ Analyse state-of-the-art DSL development approaches that involve end-users.
13. PhD Goals: Research Questions
RQ3. How can we provide a methodological approach to involve end-users in DSL development?
→ Propose a new method to involve end-users in DSL development.
RQ4. How can we validate that the proposed solution is suitable for involving end-users in DSL development?
→ Validate the proposed method and apply it with geneticists.
15. State of the art
1. Foundations of DSL development: methodologies, guidelines, and best practices
• Van Deursen et al. (2000): Terminology
• Spinellis (2001): Design patterns for DSL development
• Mernik et al. (2005): Stages for DSL development
• Voelter et al. (2008): Conceptual foundations, design and implementation of DSLs
• Strembeck et al. (2009): Systematic approach for guiding DSL developers
16. State of the art
2. DSL development approaches that take end-users into account:
• Take into account end-user preferences during development
• Apply an agile process to gather early feedback from end-users
• Involve end-users in development activities
17. State of the art
1. Perez et al. (2011): Best practices from EUD
2. Nishino (2011): Cognitive dimensions and feature heuristics
3. Barišić et al. (2012): Goal-question-metric approach
4. Wuest et al. (2013): Sketching environment
5. Cho et al. (2012): Sketches, shape selection, and questions
6. Kuhrmann et al. (2013): A DSL that uses sketches and views
7. Sánchez-Cuadrado et al. (2012): Sketches
8. Cánovas et al. (2013): A collaborative infrastructure
18. State of the art
Analysis criteria:
• Methodological support: all stages of the DSL development lifecycle are addressed
• End-user involvement: whether end-users are involved in DSL development tasks and whether best practices from the EUD domain are applied
19. State of the art

Stage        Activity                             Criteria   1   2   3   4   5   6   7   8
Analysis     Domain Analysis                      Support    S   x   x   S   S   S   S   x
                                                  EU Inv.    x   x   x   S   S   S   S   x
             Domain Model Specification           Support    PS  x   x   x   x   x   x   x
                                                  EU Inv.    x   x   x   x   x   x   x   x
Design       Abstract Syntax Specification        Support    S   x   x   S   S   S   S   S
                                                  EU Inv.    x   x   x   S   S   x   x   S
             Concrete Syntax Specification        Support    S   x   x   S   S   S   x   S
                                                  EU Inv.    x   x   x   S   S   S   x   S
             Semantic Restrictions Specification  Support    x   x   x   PS  S   S   x   x
                                                  EU Inv.    x   x   x   x   S   x   x   x
             Behavioral Semantics Specification   Support    S   x   x   x   x   x   x   x
                                                  EU Inv.    x   x   x   x   x   x   x   x
Testing      DSL Infrastructure Testing           Support    x   PS  PS  x   x   x   x   x
                                                  EU Inv.    x   x   PS  x   x   x   x   x
Maintenance  New Requirements Addition            Support    x   x   x   x   x   x   x   x
                                                  EU Inv.    x   x   x   x   x   x   x   x

S: Supported   PS: Partially Supported   x: Not supported
20. State of the Art
Need for a proposal that fulfils the following requirements:
• Requirement 1: Guidance throughout the complete DSL development life-cycle.
• Requirement 2: Feasibility of the DSL development time.
• Requirement 3: Gathering domain experts' knowledge in all the stages in which they can collaborate.
22. Method: Foundations
An agile model-driven method for involving end-users, built on the DSL development stages proposed in Mernik et al. (2005).
23. Method: Foundations
Combination of MDD and Agile practices brings:
• Efficiency to the process
• An interface for end-users to provide feedback about certain DSL artefacts
• Propagation of end-users’ feedback along the different DSL artefacts
24. Method: Foundations
Combination of MDD and Agile practices
MDD practices: conceptual modelling, model transformations.
Agile practices: iterative development, user stories, TDD, scenarios, product backlog, architectural envisioning, acceptance tests.
26. Method: Illustrative Example
Diagnose Diabetes Mellitus Type 2 (Analysis 1)
Read Variations genotypes from VCF file Patient1.vcf
Annotate Variations with gene, transcripts, polyphen
Filter Variations by genes {ABCC8, CAPN10, KCNJ11, … ,
GPD2, MTNR1B}
Filter Variations by predicted effect polyphen damaging
Report Variations with gene, predicted_effect
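As a rough illustration of the semantics of such a script, the pipeline can be read as a chain of list transformations. The following Python sketch is hypothetical: the record fields and helper names are assumptions for illustration, not the thesis’ actual DSL implementation.

```python
# Hypothetical sketch of how the example analysis could be executed.
# The variation records and operation names mirror the slide; none of this
# is the actual DSL infrastructure.

variations = [
    {"id": "chr11:g.111959693G>T", "gene": "KCNJ11", "predicted_effect": "damaging"},
    {"id": "chr17:g.41245471C>T", "gene": "BRCA1", "predicted_effect": "benign"},
]

def filter_by_genes(variations, genes):
    """Filter Variations by genes {...}"""
    return [v for v in variations if v["gene"] in genes]

def filter_by_effect(variations, effect):
    """Filter Variations by predicted effect polyphen <effect>"""
    return [v for v in variations if v["predicted_effect"] == effect]

def report(variations, fields):
    """Report Variations with <fields>"""
    return [{f: v[f] for f in fields} for v in variations]

# The script's filter steps compose; report keeps only the requested fields.
selected = filter_by_effect(filter_by_genes(variations, {"ABCC8", "KCNJ11"}), "damaging")
summary = report(selected, ["gene", "predicted_effect"])
```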
27. Method: The Analysis Stage
Steps: 2.1 Iteration Planning, 2.2 DSL Requirements Specification, 2.3 Domain Modelling
Goal: understand the domain and make domain knowledge explicit.
28. Method: The Analysis Stage
Step 2.1 Iteration Planning: Product Backlog

Previous Iterations (Done):
• Annotate Variations with Gene
• Filter Variations by Gene
• Report Variations’ Properties
• Report Variations’ Gene

Current Iteration (To do):
• Read Genotypes of several samples from a VCF File
• Annotate Variations with Transcripts Names
• Annotate Variations with POLYPHEN predicted effect
• Filter Variations by POLYPHEN predicted effect
• Report Variations’ POLYPHEN predicted effect
29. Method: The Analysis Stage
Step 2.2 DSL Requirements Specification
Mechanism M1: End-User requirement templates. The User Story captures a need of the end-users; the Acceptance Test gives a real example of this need.

User Story: Filter Variations by Polyphen predicted effect
Description: As a geneticist, I want to filter the sample’s variations by the POLYPHEN predicted effect (probably_damaging, possibly_damaging, benign), so that I can see only the variations that pass the filter.
Role: Geneticist. Mandatory: No. Action: Filter the sample’s variations by a set of POLYPHEN predicted effects (benign, possibly_damaging, probably_damaging). Goal: See only the variations that pass the filter.

Acceptance Test AT1
Description: As a geneticist, given the variations chr2:g.136438366A>G {}, chr11:g.111959693G>T {probably damaging}, chr17:g.41245471C>T {benign}, when I filter the variations by the POLYPHEN predicted effect probably damaging, I will see the variation chr11:g.111959693G>T.
Role: Geneticist. Input: chr2:g.136438366A>G {}, chr11:g.111959693G>T {probably damaging}, chr17:g.41245471C>T {benign}. Action: Filter by POLYPHEN probably damaging. Response: chr11:g.111959693G>T {probably damaging}.
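An acceptance test like AT1 lends itself to an executable check. The following Python sketch is hypothetical: the filter_by_polyphen helper is an illustrative stand-in for the generated DSL infrastructure, not the thesis’ code.

```python
# Hypothetical executable version of acceptance test AT1 (Mechanism M1).

def filter_by_polyphen(variations, effects):
    """Keep only variations whose POLYPHEN prediction is in `effects`."""
    return [v for v, effect in variations.items() if effect in effects]

# Input context from AT1: three variations with their POLYPHEN annotations.
variations = {
    "chr2:g.136438366A>G": None,  # not annotated
    "chr11:g.111959693G>T": "probably_damaging",
    "chr17:g.41245471C>T": "benign",
}

# Action and expected response from AT1.
result = filter_by_polyphen(variations, {"probably_damaging"})
assert result == ["chr11:g.111959693G>T"]
```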
30. Method: The Analysis Stage
Step 2.2 DSL Requirements Specification
Mechanism M1: End-User requirement templates. The Usage Scenario gives a real example of several user stories together; the Dependency captures relationships between user stories.

Usage Scenario: Diabetes Mellitus Type 2 (Analysis 1)
Description: In order to research the diabetes mellitus type 2 disease:
I want to read the genotypes of several samples from a VCF file.
I want to annotate the variations with their genes, with all the names of the transcripts that they hit, and with the score and predicted effect of POLYPHEN.
I want to filter the variations by the diabetes genes “ABCC8, CAPN10, KCNJ11, GCGR, SLC2A2, HNF4A, INS, INSR, PPARG, TCF7L2, ADIPOQ, AKT2, PAX4, MAPK8IP1, GPD2, MTNR1B”, and by “possibly damaging” or “probably damaging” variations according to POLYPHEN.
I want to create a report with the variations’ main properties, their genes, their transcript names, and their POLYPHEN predictions.

Dependency DP1
Description: When I filter variations by POLYPHEN predicted effects, if the variations have not been annotated with the POLYPHEN predicted effect, I will see the error “Variations must be annotated with POLYPHEN predicted effect before filtering”.
Precondition: Annotate variations with POLYPHEN predicted effect. Action: Filter variations by a set of POLYPHEN predicted effects. Error Message: “Variations must be annotated with POLYPHEN predicted effect before filtering”.
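A dependency like DP1 can be enforced by a static check over the analysis script before execution. The following sketch is hypothetical: the step strings and the check_dp1 helper are assumptions for illustration.

```python
# Hypothetical sketch of checking dependency DP1 before running a script.

ERROR = ("Variations must be annotated with POLYPHEN "
         "predicted effect before filtering")

def check_dp1(script_steps):
    """Return None if DP1 holds for the step sequence, the error otherwise."""
    annotated = False
    for step in script_steps:
        if step.startswith("Annotate") and "polyphen" in step.lower():
            annotated = True
        if step.startswith("Filter") and "polyphen" in step.lower() and not annotated:
            return ERROR
    return None

# Filtering without a prior annotation step violates the dependency.
assert check_dp1(["Filter Variations by predicted effect polyphen damaging"]) == ERROR
assert check_dp1(["Annotate Variations with polyphen",
                  "Filter Variations by predicted effect polyphen damaging"]) is None
```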
31. Method: The Analysis Stage
Step 2.2 DSL Requirements Specification
Mechanism M1: Language requirement templates. Developers derive language requirements from the end-user requirements, using the same templates.

End-user requirement:
User Story: Filter Variations by Polyphen predicted effect
Description: As a geneticist, I want to filter the sample’s variations by the POLYPHEN predicted effect (probably_damaging, possibly_damaging, benign), so that I can see only the variations that pass the filter.
Role: Geneticist. Mandatory: No. Action: Filter the sample’s variations by a set of POLYPHEN predicted effects. Goal: See only the variations that pass the filter.

Derived language requirement:
User Story: Filter by Polyphen predicted effect
Description: As a DSL user, I want to order a filter by a list of POLYPHEN predicted effects, so that variations can be filtered by these predicted effects.
Role: DSL user. Mandatory: No. Action: Write Filter and a list of POLYPHEN predicted effects. Goal: Variations can be filtered by these predicted effects.

The same derivation applies to acceptance tests, dependencies, and usage scenarios.
32. Method: The Analysis Stage
Step 2.3 Domain Modelling (Feature Model, Concepts Model, Vocabulary)
User Story: As a DSL user, I want to order a filter by a list of POLYPHEN predicted effects, so that variations can be filtered by these predicted effects.
From the ACTION of the user story, a feature is added to the feature model:
Feature Model: Genetic Analysis > Variation Analysis > { Annotate (Predicted effect), Filter (Gene, Predicted effect) }
33. Method: The Analysis Stage
Step 2.3 Domain Modelling
From the ACTION AND GOAL of the user story, the domain concepts are added to the concepts model:
Concepts Model: Variation; Predicted Effect (Algorithm name, Effect)
34. Method: The Analysis Stage
Step 2.3 Domain Modelling
From the concepts obtained from the ACTION, the relationships between the concepts are added:
Concepts Model: Variation has 0..* Predicted Effect (Algorithm name, Effect)
35. Method: The Analysis Stage
Step 2.3 Domain Modelling
For the vocabulary, end-users collaborate to define each of the concepts obtained:
Variation: each nucleotide in which the sample differs with regard to a reference sequence.
Predicted Effect: the result of executing a prediction algorithm that assesses the effect of the variation in an individual.
36. Method: The Design Stage
Steps: 3.1 Syntax Preferences, 3.2 Abstract and Concrete Syntax, 3.3 Semantic Restrictions, 3.4 Behavioral Semantics
Goal: create the artefacts that specify the syntax and semantics of the DSL.
37. Method: The Design Stage
Step 3.1 Syntax Preferences
Decision to be made: internal vs. external DSL.
Questions asked to end-users about their preferences:
1. Do you know an existing language?
2. Do you need to use existing programming libraries?
3. Do you need freedom for the new syntax?
4. Do you mind learning a new language?
38. Method: The Design Stage
Step 3.2 Abstract and Concrete Syntax
A set of guidelines projects the Feature Model, the Concepts Model, and their relationships into the Abstract Syntax Metamodel:
Analysis models: Feature Model: Filter (Gene, Predicted effect); Concepts Model: Variation, Predicted Effect (Algorithm name, Effect).
Abstract Syntax Metamodel: Filter, with a disjoint specialization into Gene and PredictedEffectF; PredictedEffectF references 0..* PredictedEffect (AlgorithmName, Effect).
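One way to read the resulting metamodel is as a set of classes. The following Python rendering is hypothetical: class and attribute names follow the slide, but the implementation is illustrative, not the thesis’ actual artefact.

```python
# Hypothetical Python rendering of the abstract syntax metamodel.
# The disjoint specialization of Filter is modelled with subclasses.
from dataclasses import dataclass, field

@dataclass
class PredictedEffect:
    algorithm_name: str  # e.g. "POLYPHEN"
    effect: str          # e.g. "probably_damaging"

@dataclass
class Variation:
    predicted_effects: list = field(default_factory=list)  # 0..* PredictedEffect

class Filter:
    """Abstract parent of the disjoint Filter specialization."""

@dataclass
class GeneFilter(Filter):             # the "Gene" branch on the slide
    genes: list

@dataclass
class PredictedEffectFilter(Filter):  # "PredictedEffectF" on the slide
    effects: list                     # PredictedEffect values to keep

f = PredictedEffectFilter(effects=[PredictedEffect("POLYPHEN", "benign")])
```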
39. Method: The Design Stage
Step 3.2 Abstract and Concrete Syntax
Mechanism M2: Syntax Questionnaire
Usage Scenario: As a DSL user, I want to annotate the variations with their POLYPHEN predicted effect, and I want to filter the variations by the POLYPHEN predicted effect damaging.
Candidate syntaxes derived from the usage scenario:
Syntax 1:
Annotate Variations with POLYPHEN
Filter Variations by predicted effect POLYPHEN damaging
Syntax 2:
GeneticAnalysis.Annotation(POLYPHEN)
GeneticAnalysis.Filter(POLYPHEN, effect, damaging)
Syntax n:
<Annotate>POLYPHEN</Annotate>
<Filter><POLYPHEN><effect>damaging</effect></POLYPHEN></Filter>
End-users select their favourite syntax through the questionnaire.
41. Method: The Design Stage
Step 3.3 Semantic Restrictions
Dependency (from the analysis): When I write Filter and a list of POLYPHEN predicted effects, if Annotate with the POLYPHEN predicted effect has not been written, I will see the error “Variations must be annotated with POLYPHEN predicted effect before filtering”.
In the feature model, this is a dependency from the Filter Predicted Effect feature to the Annotate Predicted Effect feature. It is specified as an integrity constraint:
when PredictedEffectF
if PredictedEffectA exists
then “ok”
else “Variations must be annotated with POLYPHEN predicted effect before filtering”
42. Method: The Design Stage
Step 3.4 Behavioral Semantics
Mechanism M3: Semantic Templates
User Story: As a DSL user, I want to order a filter by a list of POLYPHEN predicted effects, so that variations can be filtered by these predicted effects.
Semantic template for the user story “Filter Variations by predicted effect POLYPHEN”:
Service Identifier: Ensembl Filter VEP. Source: Galaxy.
Inputs:
• input: file that gathers the variations. Type: DataFile (VCF). Constant: False.
• FilterCriteria: evaluation expression that indicates the POLYPHEN criteria to filter. Type: String. Constant: False. Examples: “Polyphen is benign”, “Polyphen is possibly_damaging”.
Outputs:
• annotated_vcf: file that gathers the annotated variations. Type: DataFile (VCF). Visibility: True.
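The filled-in template can be represented as plain data that a generator consumes when wiring the DSL construct to the external service. The following sketch is hypothetical: the dictionary layout and the build_invocation helper are assumptions for illustration.

```python
# Hypothetical sketch: the filled-in M3 semantic template as plain data.
# Field names follow the slide; build_invocation is illustrative only.

filter_vep_template = {
    "user_story": "Filter Variations by predicted effect POLYPHEN",
    "service_identifier": "Ensembl Filter VEP",
    "source": "Galaxy",
    "inputs": {
        "input": {"type": "DataFile (VCF)", "constant": False},
        "FilterCriteria": {"type": "String", "constant": False},
    },
    "outputs": {
        "annotated_vcf": {"type": "DataFile (VCF)", "visible": True},
    },
}

def build_invocation(template, **arguments):
    """Pair every declared input with a concrete argument value."""
    missing = set(template["inputs"]) - set(arguments)
    if missing:
        raise ValueError(f"missing inputs: {sorted(missing)}")
    return {"service": template["service_identifier"], "arguments": arguments}

call = build_invocation(filter_vep_template,
                        input="Patient1.vcf",
                        FilterCriteria="Polyphen is benign")
```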
43. Method: The Implementation Stage
Steps: 4.1 Test Specification, 4.2 DSL Infrastructure Implementation
Goal: implement the DSL infrastructure for using the DSL, combining model-driven development (MDD, from the design models) with test-driven development (TDD, from the tests).
45. Method: The Testing Stage
Steps: 5.1 Demonstration, 5.2 DSL Infrastructure Testing
Goal: end-users test the current DSL release and provide feedback about it.
46. Method: The Testing Stage
Step 5.1 Demonstration. Mechanism M4: Demonstration
Usage Scenario: As a DSL user, I want to annotate the variations with their POLYPHEN predicted effect, and I want to filter the variations by the POLYPHEN predicted effect damaging.
1. Demonstration of one usage scenario
47. Method: The Testing Stage
Step 5.1 Demonstration (continued)
2. Description of editor help and shortcuts
48. Method: The Testing Stage
Step 5.1 Demonstration (continued)
3. Explanation of error messages
49. Method: The Testing Stage
Step 5.1 Demonstration (continued)
4. Generation of the artefacts
50. Method: The Testing Stage
Step 5.2 DSL Infrastructure Testing. Mechanism M5: Testing Questionnaire

Questions for testing Requirements:
• Coverage: Did you find any erroneous step/instruction? Did you find in the language any step that contains some erroneous aspect? Did you miss any essential step/instruction?

Questions for testing Syntax:
• Expressivity: Would you add, change, remove or reorder any word of the language? Is the language easy to understand? Is the language intuitive to use?
• Coverage: Did you find a combination of words that was incorrect but could still be written with the DSL?

Questions for testing Semantic Restrictions:
• Expressivity: Did you find any error message that you did not understand?
• Coverage: Did you find a combination of constructs that was incorrect but could still be written with the DSL? Did you find any step that depends on another one but could be written without satisfying that dependency?

Questions for testing Behavioral Semantics:
• Completeness: Do you know any new software that is better suited to implement a step/instruction? Did you find any error after executing the generated artefact?
51. Index
1. PhD Motivation
2. PhD Goals
3. State of the Art
4. Method
5. Validation
6. Demonstration
7. Conclusions and Future work
52. Validation
Researching experts’ opinion (Wieringa, 2012)
Goal: Validate whether the mechanisms proposed M1-M5 are
suitable to involve end-users in DSL development
53. Solution Validation
Experiment Methodology: State-of-the-art guidelines
• Experimentation in Software Engineering (Wohlin et al.): For
planning, scoping, executing and analysing data.
• The Method Evaluation Model (Moody et al.): For metrics and
measurement instruments.
Participants: 3 geneticists from Instituto de Investigación Sanitaria INCLIVA
Experiment design: One factor-one treatment
• Factor: Approach to involve end-users in DSL development
• Treatment: The set of mechanisms proposed (M1-M5)
• All subjects applied the same treatment
Instituto de Investigación Sanitaria INCLIVA: www.incliva.es
54. Validation
Research questions of the experiment:

RQ1: Are end-users satisfied with the feedback provided through the involving mechanisms?
Response variable: end-users’ satisfaction. Metric: PEOU and PU. Measurement procedure: satisfaction questionnaire.

RQ2: Are developers satisfied with the feedback gathered through the involving mechanisms to build the DSL?
Response variable: developers’ satisfaction. Metric: comprehension questions, degree of agreement, and undetected errors. Measurement procedure: observation, recording, and analysis of subjects’ feedback and anecdotes.

RQ3: How long does the application of the mechanisms for involving end-users take?
Response variable: time. Metric: minutes. Measurement procedure: measurement of the time spent.
55. Validation
Experiment Procedure:
• Meeting 1: describe the domain, select the experimental objects, and present the experiment.
• Between meetings: gather details about the experimental objects.
• Meeting 2: run the experiment, applying each mechanism Mi.

Experimental Objects:
EO1. Read VCF file.
EO2. Annotate variations with effect prediction.
EO3. Filter variations by effect prediction.
EO4. Annotate variations with sample frequency.
56. Validation: Results
RQ1: End-users’ perspective
High satisfaction with the mechanisms: high (positive) values of PEOU and PU.
RQ2: Developers’ perspective
High satisfaction with the mechanisms: few comprehension questions, moderate degrees of agreement, and few undetected errors.

Limitations found per mechanism:
• M1: Small errors are easy to miss. Close relationships with other requirements are also easy to miss.
• M2: Questions about the abstract syntax are not user-friendly for providing feedback. The set of questions is not enough to reach an agreement.
• M3: One unclear field in the semantic template.
• M4: Feedback should be encouraged during the demonstration, not at the end.
• M5: Some ambiguous questions.
57. Validation: Results
RQ3: Time spent
• Mechanisms M1, M3 and M5 are the hardest for end-users.
• The participation of end-users in one iteration that addresses four DSL requirements takes approximately 2h30m.
58. Validation: Conclusions
We assessed the satisfaction of end-users with the mechanisms M1-M5 proposed in the method.
We found limitations that allow us to propose improvements to the method.
The results are not statistically significant and cannot be generalized.
However, they are valuable opinions that originate from experts in an industrial environment.
59. Index
1. PhD Motivation
2. PhD Goals
3. State of the Art
4. Method
5. Validation
6. Demonstration
7. Conclusions and Future Work
61. Index
1. PhD Motivation
2. PhD Goals
3. State of the Art
4. Method
5. Validation
6. Demonstration
7. Conclusions and future work
62. Conclusions: PhD Contributions
1. Analysis of the problem of involving end-users in DSL
development.
2. State-of-the-art regarding DSL development approaches
that involve end-users.
3. An innovative method to involve end-users in DSL
development.
4. A DSL for supporting genetic analysis.
5. Validation of the proposed method together with
geneticists.
63. Conclusions: Research Publications
• Regular paper: RCIS (2010), Core B
• Short paper: CAiSE Forum (2011)
• Book chapter: CAiSE Forum Selected Papers, Springer
• Short paper: Bioinformatics (2011)
• Doctoral consortium: RCIS (2012), Core B
• Regular paper: ENASE (2013), Core B
• Regular paper: ISD (2013), Core A
• Oral communication: CONBIOPREVAL
• Workshop paper: COBI (2015)
• Journal article (submitted): Journal of Software and Systems (2016), JCR
64. Conclusions: R&D Collaborations
• Analysis of the problem in a real environment.
• Design of the mechanisms of the method.
• Feedback about initial versions of the method.
• Validation of the method through an empirical experiment.
• Application of the method to develop a DSL for genetic analysis.

Collaboration partner: GEM Biosoft (www.gembiosoft.es)
65. Conclusions: Lessons Learned
Collaboration with geneticists: Developing a DSL for a
complex application domain requires the participation of
domain experts.
The combination of MDD and agile principles allows getting feedback from end-users and propagating this feedback across the whole development lifecycle.
The empirical experiment we conducted and the application of the method to develop the DSL for genetic analysis show that the proposed method can be applied in practice.
66. Future Work
Method improvements:
• Supporting the gathering of non-functional requirements.
• Supporting internal and graphical DSLs.
• Exploring semantic specification alternatives.
• Providing tool support for the method.
• Detailing the Deployment and Maintenance stages.
Validation of the current version of the DSL with geneticists.
Good morning, thanks for the presentation and thank you to the members of the tribunal for being here today for the presentation of my PhD thesis: An agile Model-Driven Method for Involving End-Users in DSL development.
This thesis has been supervised by the Dr. Óscar Pastor López and the Dr. Francisco Valverde Giromé.
This is the index I am going to follow for this presentation.
First, I want to give you a sense of what this PhD is about before describing, in the second point, the details of the problem and the PhD goals.
After describing the problem, I will provide an overview of the state of the art to find whether the problem is already solved or not.
Since, as a heads-up, the problem is not yet solved, I will explain the solution that we have proposed in this PhD. Then, I will explain how we have validated the solution and how we have applied it in practice.
Finally, I will state the conclusions and the future work.
First, I want to explain what this PhD is about. This PhD is about software languages and how they have traditionally helped us communicate with computers.
Initially, communicating with computers meant writing programs in binary code. However, the difficulty of this task has led us to constantly seek better abstractions that allow us to describe more complex programs easily: first using assembly language, and then using general-purpose languages such as Python.
But the interest in higher abstractions is not unique to software developers. Similar procedures are applied in other domains such as genetics. In this domain, geneticists try to understand the behaviour of DNA, first by encoding it with letters that represent nucleotides and amino acids, and then by means of pathways that represent the interactions among their chemical bases.
For software developers, the use of general-purpose languages was not enough, and in the search for even higher abstractions to improve the understanding of software development, the software engineering community proposed domain-specific languages (DSLs), which, according to Van Deursen, offer expressive power focused on a particular problem domain.
Examples of DSLs for software development are:
SQL, a textual DSL for describing operations over databases, and UML Activity Diagram, a visual DSL for describing the requirements of a system by means of activities to accomplish.
Traditionally, developers have built DSLs that facilitate their own tasks while developing software. This kind of DSL targets technical domains such as database management or web application development.
However, over time, DSLs have gained interest not only among pure software developers but also among experts of application domains such as seismology, genetics, or aviation. This is where development becomes a challenge.
For technical domains, the domain experts and the developers are the same, or at least, acquiring such domain knowledge is not that difficult for a software developer.
However, for application domains, domain experts are not the developers, and there is a huge knowledge gap between them.
In this PhD, we aim to reduce this knowledge gap between experts and developers while developing DSLs for complex application domains.
The motivation of this thesis originated in industry, when we started collaborating with IMEGEN, a Valencian SME whose expertise is genetic and genomic diagnosis.
In summary, they told us: “We have problems analyzing genetic data”, “We need to use state-of-the-art analytic tools”, and “We need a tool that is highly customizable to our needs”.
As a result of this collaboration, our diagnosis was that building yet another genetic tool is an unsustainable solution because the domain is in constant evolution. They needed an infrastructure for customizing their genetic analyses.
Therefore, a possible solution worth exploring is the creation of a DSL for supporting genetic analysis. But how can we develop this DSL if we are not experts in genetic analysis?
This became an academic problem: we wanted to develop a DSL for supporting genetic analysis, but we lack genetic knowledge. To fill this gap, we need the collaboration of geneticists to develop this DSL, but they lack development knowledge.
Our diagnosis is that we need guidance to involve geneticists in the development of this DSL.
As a solution, we could follow a DSL development approach to involve end-users in DSL development.
The problem is that traditional DSL development approaches do not take into account end-users.
Once we know what this PhD is about, we provide further details about the problem addressed in this PhD.
As a consequence, the goal of this PhD is to propose a DSL development approach to involve end-users.
An approach that allows us to involve geneticists in the development of this DSL, but also, an approach that can be used by future developers to involve end-users of any other complex application domain.
Once we know the problem of the PhD and the PhD goals, we seek in the state of the art whether this problem is already solved.
While surveying the state of the art of DSL development, we found methodologies, guidelines, and best practices. However, few consider end-users during the development process.
From them, we selected the ones that take into account end-users and we categorized them into three categories. First, the ones that take into account the preferences of end-users during DSL development, although they do not involve them in the process; examples are the works of Perez et al. and Nishino. Second, the ones that apply an agile process to get early feedback from end-users; examples of this kind are Sadilek and Barisik et al.
Third, the ones that involve end-users in the DSL development. Examples of this kind are: Wuest et al., Cho et al., Sanchez-Cuadrado et al, Canovas et al.
Only take into account end-users: Two of the approaches focus on taking into account end-users but they are not involved in the process (Perez et al. and Nishino).
Most supported stages: The rest of the approaches involve end-users in some activities of the analysis and design stages.
Not supported activities: No approach involves end-users in behavioural semantics specification or in the maintenance stage.
Partially supported activities: Only one approach involves end-users in the testing stage, and only to ask about usability (Barisik et al.).
Completeness: No approach can be applied in practice from the Decision stage to the Maintenance stage.
Therefore, next we describe our proposed solution.
Our solution to fulfil the three aforementioned requirements is an agile model-driven method to involve end-users in DSL development.
The approach to build this method consists in adopting the stages and patterns for DSL development from Mernik et al., so we could build the stages and steps of the method;
the guidelines of Strembeck and Voelter et al. (focusing on MDD practices), to propose the steps and artefacts of the method;
and observing the agile practices from agile methods such as XP, Scrum and Agile Modeling, in order to propose a set of mechanisms for involving end-users.
Following the agile practices “iteration planning” and “incremental design”, we organized the development process as an iterative cycle made of the stages Analysis, Design, Implementation and Testing.
Decision is left outside the cycle because the decision to develop a DSL is only addressed once. Deployment and Maintenance are also outside the cycle because they are only addressed when there is a stable version of the DSL that is worth trying by the end-users.
This could be a possible structure to specify this trip: a set of language constructs that the travel agent can specify so that the system books the hotel, the restaurant, and the tickets for the museum and the disco.
We are going to use this example to explain the stages, steps, artefacts, and mechanisms for involving end-users of the method proposed.
Once we have decided to develop this DSL, in the analysis stage, we must understand the domain and make end-users knowledge explicit.
First, in the iteration planning, we must plan what requirements to address in the current iteration.
In order to keep track of the requirements during the development we use a product backlog, which organizes the requirements so that end-users can check the current state of the DSL anytime.
For the example, in the current iteration we place the booking of a hotel by location, a restaurant by name, museum tickets and disco tickets.
Taking into account the product backlog of the current iteration, in the step “DSL Requirements Specification” the developers and the end-users collaborate to describe:
User Stories, which describe how an end-user with a role needs to perform an action to achieve a goal. For the user story “Book hotel by Name”, the role is the travel agent, the action is to book a hotel close to a certain location for a range of days, and the goal is that clients have accommodation close to their preferred location during their holidays.
Acceptance Tests, which describe real examples of this need: how an end-user with a role, given a specific input context, executes an action and obtains a response.
Usage Scenarios, which describe an example of a set of these needs together.
These three elements are agile practices that describe end-user requirements in a way that is easy and close to end-users: using natural language and a predefined structure. These templates represent the first mechanism (M1) for involving end-users.
These templates describe the requirements of the end-users, not language requirements. Since domain experts are not language developers, we cannot ask them to describe language requirements.
Instead, the developers must obtain the language requirements that derive from the previously described end-user requirements. In order to describe them, we use the same template but filling in the information related to language concerns.
For the example, we can see that the user story is…. In the previous slide, we described the need to book a hotel, but now we are describing the need to describe how to book a hotel.
The same applies for the acceptance tests and the usage scenarios.
Once we have obtained the language requirements, in the domain modelling step we make this knowledge explicit by means of a domain model made of a feature model, a concepts model, and a vocabulary.
In order to obtain these models, we use the user stories.
For the feature model, we use the action of the user story to create a feature in the feature model; this feature is a summary of the action.
For the concepts model, we create a concept for each domain concept that appears in the action or in the goal of the user story.
Next, we create the relationships between the feature model and the concepts model by observing those concepts that were obtained from the action of the user story.
For the vocabulary, end-users collaborate to define each of the concepts obtained.
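The derivation steps just described can be sketched with the user story fields made explicit. The following illustration is hypothetical: the UserStory structure and the concept lists are assumptions for this sketch, and no natural-language processing is implied.

```python
# Illustrative sketch of deriving the domain model from a language user story.
from dataclasses import dataclass

@dataclass
class UserStory:
    role: str
    action: str
    goal: str
    action_concepts: list  # domain concepts that appear in the action
    goal_concepts: list    # domain concepts that appear in the goal

story = UserStory(
    role="DSL user",
    action="order a filter by a list of POLYPHEN predicted effects",
    goal="variations can be filtered by these predicted effects",
    action_concepts=["Predicted Effect"],
    goal_concepts=["Variation", "Predicted Effect"],
)

# Feature model: the action becomes a feature (a summary of the action).
feature = "Filter (Predicted effect)"

# Concepts model: one concept per domain concept in the action and the goal.
concepts = sorted(set(story.action_concepts) | set(story.goal_concepts))

# Relationships: the concepts obtained from the action relate to the feature.
relationships = [(feature, concept) for concept in story.action_concepts]
```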
After the analysis stage, in the Design stage we must create the artefacts that specify the syntax and semantics of the DSL.
According to the patterns of Mernik et al., we must decide between using an internal or external approach for the syntax design. Internal means to use an existing language to build the new DSL while external means to create a new language from scratch.
In order to make this decision, we propose a set of questions to ask end-users about their preferences (Fowler). These questions ask end-users whether they know an existing language, whether they need to use existing programming libraries, whether they need any kind of freedom for the new syntax, and whether they mind learning a new language. We also ask them which of these aspects are more important for them.
With their responses, the developers decide between an internal or an external solution to design the DSL.
From this point, the method only supports external DSLs. As we already explained in the document, this decision was driven by the context of the PhD in regards to the DSL for genetic analysis.
Disjoint specialization: when a feature's children are a single choice (exactly one may be selected).
Overlapping specialization: when a feature's children are multiple choice (one or more may be selected).
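A minimal sketch of how these two kinds of specialization constrain the selection of child features (the function, the group encoding, and the feature names are illustrative):

```python
def valid_selection(kind, selected, children):
    """Check a selection of child features against the group kind:
    'disjoint' allows exactly one child to be chosen, while
    'overlapping' allows one or more."""
    chosen = [child for child in selected if child in children]
    if kind == "disjoint":
        return len(chosen) == 1
    if kind == "overlapping":
        return len(chosen) >= 1
    raise ValueError(f"unknown specialization kind: {kind}")

children = ["Hotel", "Hostel", "Apartment"]
print(valid_selection("disjoint", ["Hotel"], children))               # True
print(valid_selection("disjoint", ["Hotel", "Hostel"], children))     # False
print(valid_selection("overlapping", ["Hotel", "Hostel"], children))  # True
```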
Once we have decided the approach to design the DSL, we must specify the abstract and concrete syntax.
In order to design the syntax, we first build the abstract syntax metamodel. To build this model, we have proposed a set of guidelines that project the information from the analysis models into entities and relationships of the abstract syntax metamodel.
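For the booking example, such a projection could look roughly like the following sketch, where each concept of the concepts model becomes an entity and relationships between concepts become references. The class names are assumptions made for the sake of the example.

```python
from dataclasses import dataclass
from typing import List

# Illustrative abstract-syntax metamodel for the booking example.
@dataclass
class Location:
    name: str

@dataclass
class Hotel:
    name: str
    location: Location      # relationship projected from the concepts model

@dataclass
class BookingSpec:          # root entity that groups the model elements
    bookings: List[Hotel]

spec = BookingSpec(bookings=[Hotel("Sea View", Location("Valencia"))])
print(len(spec.bookings))              # 1
print(spec.bookings[0].location.name)  # Valencia
```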
After designing the abstract syntax metamodel, we must design the concrete syntax.
In order to design this syntax, we use one usage scenario from the analysis to propose several candidate syntaxes.
Once we have designed these syntaxes, we hand a questionnaire to end-users asking which one they prefer, whether they would rather propose a new one, and for feedback about the specific structure.
From their responses, the developers select the preferred syntax.
This questionnaire is the mechanism M2 proposed to involve end-users.
Once we know the end-user preferences, we build the concrete syntax grammar that describes the syntax chosen by end-users.
In order to build this grammar, we propose a set of guidelines that project the information of the abstract syntax metamodel into production rules of the grammar.
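As an illustration of "one production rule per metamodel entity", the booking example's grammar could be realized by a minimal hand-written recursive-descent parser like the sketch below. The grammar, token handling, and function names are illustrative; the method derives the real grammar through the proposed guidelines.

```python
# One parsing function per production rule, mirroring one rule per
# metamodel entity.
def parse_booking(tokens, pos):
    """booking ::= 'book' 'hotel' NAME 'in' NAME"""
    if tokens[pos:pos + 2] != ["book", "hotel"]:
        raise SyntaxError(f"expected 'book hotel' at token {pos}")
    hotel = tokens[pos + 2]
    if tokens[pos + 3] != "in":
        raise SyntaxError(f"expected 'in' at token {pos + 3}")
    location = tokens[pos + 4]
    return {"hotel": hotel, "location": location}, pos + 5

def parse_spec(text):
    """spec ::= booking+"""
    tokens = text.split()
    bookings, pos = [], 0
    while pos < len(tokens):
        booking, pos = parse_booking(tokens, pos)
        bookings.append(booking)
    return bookings

print(parse_spec("book hotel SeaView in Valencia"))
# [{'hotel': 'SeaView', 'location': 'Valencia'}]
```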
The next step is to design the semantic restrictions.
In order to design them we propose to describe them as restrictions in the abstract syntax metamodel and in the concrete syntax grammar.
In order to create them, we have proposed a set of guidelines that project the information of the acceptance tests and the feature model into integrity constraints of the abstract syntax metamodel. We describe them using a when-if-else structure.
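A when-if-else integrity constraint for the booking example could look like the following sketch; the concrete rule is hypothetical and only illustrates the structure.

```python
# WHEN the booking targets a hotel, IF no location is given,
# report an error; ELSE the restriction is satisfied.
def check_booking(booking):
    errors = []
    if booking.get("type") == "Hotel":       # WHEN clause
        if not booking.get("location"):      # IF clause
            errors.append("a hotel booking requires a location")
        # ELSE: nothing to report
    return errors

print(check_booking({"type": "Hotel", "location": "Valencia"}))  # []
print(check_booking({"type": "Hotel"}))
# ['a hotel booking requires a location']
```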
Finally, we propose to describe the behavioral semantics by means of services.
First, following the agile practice of architectural envisioning, a technological strategy is defined together with end-users.
Then, we propose that end-users and developers collaborate to specify these services in the implementation target chosen.
To do so, for each user story of the analysis we propose to create the following template.
For the example, imagine that we have chosen Unix scripts as the technological implementation strategy.
For the user story "book hotel by location", we specify that the service identifier is BookAccommodation.pl, a desktop program written in Perl provided by Booking.com.
Then, together with the travel agents, we describe the inputs and outputs. Inputs are defined with the fields name, description, type, constant, and value. For example, this service has an input named type, which is the type of accommodation; its type is an enumeration, and in this case, since we want to book a hotel, it will always be constant with the value "Hotel".
As output, this service shows a message with the booking confirmation, which must be shown to the user, so we set its visibility to yes.
This template is the mechanism M3 for involving end-users.
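For illustration, the template of mechanism M3 could be encoded as the following data structure. The field names follow the slide, but the encoding itself (classes, booleans for constant/visibility) is an assumption.

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class ServiceInput:
    name: str
    description: str
    type: str
    constant: bool = False       # True when the value is always fixed
    value: Optional[str] = None  # the fixed value, if constant

@dataclass
class ServiceOutput:
    description: str
    visible: bool                # "yes" in the template: shown to the user

@dataclass
class Service:
    identifier: str
    platform: str
    provider: str
    inputs: List[ServiceInput] = field(default_factory=list)
    outputs: List[ServiceOutput] = field(default_factory=list)

book = Service(
    identifier="BookAccommodation.pl",
    platform="Perl desktop program",
    provider="Booking.com",
    inputs=[ServiceInput("type", "type of accommodation",
                         "enumeration", constant=True, value="Hotel")],
    outputs=[ServiceOutput("booking confirmation message", visible=True)],
)
print(book.inputs[0].value)     # Hotel
print(book.outputs[0].visible)  # True
```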
Once the design artefacts have been created, in the implementation stage the developers create the complete infrastructure that supports the usage of the DSL, that is, an infrastructure that understands specifications written using the DSL and obtains a set of generated artefacts implementing the behavior corresponding to each specification.
The DSL infrastructure is made up of a Parser, a Validator, and a Generator.
In order to implement these elements, we will apply both MDD using the design models and the agile practice TDD. Hence, the first step of this stage is to specify the set of tests that will be used for the TDD approach.
For the TDD we will create three types of tests: syntax tests that check the parser, semantic tests that check the validator and the generator, and target platform tests that check the correctness of the generated artefacts.
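The three kinds of tests can be sketched with tiny stand-in functions; parse, validate, and generate below are hypothetical placeholders for the real DSL infrastructure.

```python
def parse(text):
    return text.startswith("book hotel")            # stand-in parser

def validate(model):
    return "location" in model                      # stand-in validator

def generate(model):
    return f"perl BookAccommodation.pl {model['location']}"  # stand-in generator

# Syntax test: the parser accepts a valid specification.
assert parse("book hotel SeaView in Valencia")

# Semantic test: the validator rejects a booking without a location.
assert not validate({"type": "Hotel"})

# Target platform test: the generated artefact invokes the service.
assert generate({"location": "Valencia"}).startswith("perl BookAccommodation.pl")

print("all three test kinds pass")
```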
First, we create the syntax tests, which are responsible for checking whether the parser is able to understand specifications written using the syntax.
Initially, these tests were supposed to drive the creation of the parser applying TDD; however, we found a technological approach that already implements the parser applying MDD from the abstract syntax metamodel and the concrete syntax grammar. This approach implements both models and automatically runs a generation engine to create the source code of the parser.
Once this infrastructure is implemented, the end-users test the current DSL release and provide feedback about it.
Then, we create the semantic tests, which check that the semantic restrictions are enforced and that the corresponding errors arise when these restrictions are violated.
In order to implement the validator, we apply TDD with these tests. We run all the tests and, if some test does not succeed, we choose it and program the corresponding validation rules that make it succeed. When all the tests succeed, the validator source code is complete.
And finally, the target platform tests check the behavior of the generated artefacts according to the needs expressed by the end-users.
In order to implement the generator fragments, we apply TDD with these tests. We run all the tests and, if some test does not succeed, we choose it and program the corresponding fragments that make it succeed. To program this source code we use the semantic templates. When all the tests succeed, the set of fragments is complete.
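For illustration, a generator fragment filling a semantic template could look like the sketch below; the template text is hypothetical and merely stands in for a fragment that turns a parsed booking into a Unix invocation of the Perl service specified in the design stage.

```python
from string import Template

fragment = Template("perl BookAccommodation.pl --type=$type --location=$location")

def generate(spec):
    """Fill the semantic template with values from the specification."""
    return fragment.substitute(type=spec["type"], location=spec["location"])

artefact = generate({"type": "Hotel", "location": "Valencia"})
print(artefact)  # perl BookAccommodation.pl --type=Hotel --location=Valencia
```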