Applying static code analysis for domain-specific languages
1. Applying static code analysis for domain-specific languages
Iván Ruiz-Rube, Tatiana Person, Juan Manuel Dodero, José Miguel
Mota, Javier Merchán Sánchez-Jara
University of Cádiz (UCA)
University of Salamanca (USAL)
IEEE / ACM 22nd International Conference on
Model Driven Engineering Languages and Systems
2. Ruiz-Rube, I., Person, T., Dodero, J. M., Mota, J.
M., & Sánchez-Jara, J. M. (2019). Applying static
code analysis for domain-specific languages.
Software & Systems Modeling, 1-16.
5. Introduction
Static program analysis is a helpful technique to verify
software quality features.
Source code analysis tools make it possible to automatically discover
code smells and potential errors, so they can be used to
estimate the Technical Debt (TD).
The TD concept is also suitable for describing the consequences of
rushed development in several disciplines besides software
development.
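As a rough illustration of how a TD estimate can be derived from analyser output, the sketch below sums a remediation effort for every reported issue. The rule keys and the minutes assigned to them are hypothetical and are not taken from SonarQube or from the talk.

```python
from collections import Counter

# Hypothetical remediation effort, in minutes, per rule key (illustrative values only).
REMEDIATION_MINUTES = {
    "undocumented-function": 10,
    "line-too-long": 2,
    "duplicated-block": 20,
}


def technical_debt_minutes(issues):
    """Sum the remediation effort of every reported issue.

    `issues` is an iterable of rule keys, one entry per issue found.
    Unknown rules contribute nothing rather than raising an error.
    """
    counts = Counter(issues)
    return sum(REMEDIATION_MINUTES.get(rule, 0) * n for rule, n in counts.items())


if __name__ == "__main__":
    found = ["line-too-long", "line-too-long", "undocumented-function"]
    print(technical_debt_minutes(found))  # 2 + 2 + 10 = 14
```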
6. Applying static analysis to DSLs
Professionals in many different areas can develop and
describe problem solutions by means of a DSL that is
especially defined for their own field and discipline.
Ad-hoc static analysers have been developed for several
DSLs:
● IEC 61131-3, EDDL and RAPID for PLC programming [1]
● Puppet for Infrastructure as Code [2]
● BPMN and BPEL for Business Process Models [3]
● etc.
[1] Mandal et al., 2018; Prähofer et al., 2012
[2] Sharma et al., 2016; Shambaugh et al., 2016
[3] Saad et al., 2013; Heinze et al., 2018
7. Research Question
Typically, quality platforms are initially intended to check
programs written in General-Purpose Languages (GPLs),
such as Java, C++ or Python, among others...
Can source code quality analysis tools be applied to
software artefacts that are developed at the DSL code
level?
9. Scope
Xtext is a widespread framework (more than 5,000 Xtext grammars on GitHub)
for creating external text-based DSLs.
Xtext grammars, which define the concrete syntax, are written in the
framework's own grammar language (≈ EBNF).
The framework uses the ANTLR parser, which implements an LL top-down
parsing algorithm.
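To make the idea of LL top-down parsing concrete, here is a minimal hand-written recursive-descent parser for a two-rule toy grammar. The grammar is a made-up example in an Xtext-like notation, and the code is only a sketch of the technique; it is not how Xtext or ANTLR generate their parsers.

```python
import re

# Toy grammar, written in an Xtext-like EBNF notation:
#   Model:    Greeting* ;
#   Greeting: 'Hello' ID '!' ;
TOKEN = re.compile(r"\s*(Hello\b|[A-Za-z_]\w*|!)")


def tokenize(text):
    """Split the input into tokens, failing on any unexpected character."""
    tokens, pos = [], 0
    text = text.rstrip()
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if match is None:
            raise SyntaxError(f"unexpected character at offset {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


def expect(tokens, pos, literal):
    """Consume one expected literal token and return the next position."""
    if pos >= len(tokens) or tokens[pos] != literal:
        raise SyntaxError(f"expected {literal!r}")
    return pos + 1


def parse_greeting(tokens, pos):
    """Greeting: 'Hello' ID '!' ; returns the greeted name and the next position."""
    pos = expect(tokens, pos, "Hello")
    if pos >= len(tokens) or tokens[pos] in ("Hello", "!"):
        raise SyntaxError("expected an identifier")
    name = tokens[pos]
    pos = expect(tokens, pos + 1, "!")
    return name, pos


def parse_model(tokens):
    """Model: Greeting* ; one token of lookahead decides whether to loop."""
    names, pos = [], 0
    while pos < len(tokens) and tokens[pos] == "Hello":
        name, pos = parse_greeting(tokens, pos)
        names.append(name)
    if pos != len(tokens):
        raise SyntaxError(f"unexpected token {tokens[pos]!r}")
    return names


if __name__ == "__main__":
    print(parse_model(tokenize("Hello World! Hello Xtext!")))  # ['World', 'Xtext']
```

Each grammar rule becomes one function, and the parser always decides what to do next by looking at the upcoming token, which is the defining trait of the LL family.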
10. Scope
The SonarQube quality platform provides dashboards
including quality metrics and detailed reports of issues
(vulnerabilities, bugs and code smells).
It supports a great number of GPLs.
It is open-source.
It includes a mechanism for parsing new languages and
integrating new rules.
Parsers are implemented as Java
classes that use a specific library (API) called SSLR.
11. Problem
The grammar format for designing DSL editors with Xtext
diverges from the grammar format used by SonarQube.
Making the quality platform recognise our DSLs by hand is
time-consuming and error-prone, especially
when it comes to keeping the two grammars consistent
while the language evolves.
16. Implementation
The process is wrapped as Eclipse plug-ins, which add a new
command to the context menu of .xtext files. The command generates
code analysers as SonarQube plug-ins.
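The kind of grammar-to-grammar translation this implies can be sketched as a small text generator: it walks an EBNF-like rule and emits the text of a Java statement in a grammar-builder style. This is not the tool's actual generator. The tuple-based rule representation is hypothetical, and the emitted method names (rule, is, sequence, firstOf, optional, zeroOrMore, oneOrMore) are modelled on SSLR's grammar builder as an assumption; check them against the SSLR version in use.

```python
def emit(expr):
    """Render a grammar expression as builder-style Java source text."""
    kind = expr[0]
    if kind == "keyword":   # a literal such as 'Hello' (no quote escaping in this sketch)
        return f'"{expr[1]}"'
    if kind == "call":      # a reference to another rule
        return expr[1].upper()
    if kind == "seq":       # a b c
        return "b.sequence(" + ", ".join(emit(e) for e in expr[1:]) + ")"
    if kind == "alt":       # a | b
        return "b.firstOf(" + ", ".join(emit(e) for e in expr[1:]) + ")"
    if kind == "opt":       # a?
        return f"b.optional({emit(expr[1])})"
    if kind == "star":      # a*
        return f"b.zeroOrMore({emit(expr[1])})"
    if kind == "plus":      # a+
        return f"b.oneOrMore({emit(expr[1])})"
    raise ValueError(f"unknown expression kind: {kind}")


def emit_rule(name, expr):
    """Render one whole rule definition as a single Java statement (text only)."""
    return f"b.rule({name.upper()}).is({emit(expr)});"


if __name__ == "__main__":
    # Greeting: 'Hello' ID '!' ;
    greeting = ("seq", ("keyword", "Hello"), ("call", "ID"), ("keyword", "!"))
    print(emit_rule("Greeting", greeting))
    # b.rule(GREETING).is(b.sequence("Hello", ID, "!"));
```

Generating the second grammar from the first, rather than writing both by hand, is what keeps them consistent as the language evolves.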
18. The tool: Vary
A computer environment for typing and running algorithms written
with a pseudocode notation
19. Case study: design & data collection
Objective:
Explore how the computation of metrics for analysing the algorithms written with the
Vary tool can help teachers to automatically assess their students’ programming
assignments.
Settings:
31 students of “Introduction to Programming” in the Degree of Computer Engineering
Procedure:
1. Students had to develop an algorithm with Vary.
2. The teacher manually reviewed and marked the students’ assignments according to
her own correctness, validity and maintainability criteria.
3. Student marks were computed by applying some linear adjustment functions on
the metrics and quality rules provided by SonarQube.
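One possible shape for such a linear adjustment function is sketched below: a metric value is mapped to a mark by linear interpolation between a "best" and a "worst" threshold. The thresholds, the weights and the function names are hypothetical; the slides do not give the formulas actually used in the study.

```python
def linear_mark(value, best, worst, max_mark=10.0):
    """Map a metric value to a mark by linear interpolation.

    `best` earns `max_mark`, `worst` earns 0; values outside the range are clamped.
    """
    if best == worst:
        raise ValueError("best and worst must differ")
    fraction = (value - worst) / (best - worst)
    return max_mark * min(1.0, max(0.0, fraction))


def maintainability_mark(complexity, issues):
    """Combine two metrics with equal weights (illustrative thresholds only)."""
    return (
        0.5 * linear_mark(complexity, best=1, worst=11)
        + 0.5 * linear_mark(issues, best=0, worst=20)
    )


if __name__ == "__main__":
    print(linear_mark(6, best=1, worst=11))           # 5.0
    print(maintainability_mark(complexity=6, issues=5))  # 6.25
```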
20. Applying Xtext2Sonar
To analyse algorithms in pseudocode, it is necessary to create the
Java artefacts required by SonarQube.
1300 lines of Java code were automatically generated from the
Xtext grammar file of Vary containing a total of 95 rules.
22. Case study: results
Low/medium correlation between the maintainability degree manually
estimated by the teacher and the value computed with support of the
SonarQube metrics
Insights:
● The tool could be measuring other criteria that were not actually
taken into account during the manual reviewing, or vice versa.
● Formulas for learning assessment should be continuously
improved
● It ensures that the same criteria are applied consistently to all the
students → more sustainable and fairer assessment
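For reference, a correlation between manual marks and computed marks can be calculated as in the sketch below. It uses the Pearson coefficient, which is an assumption: the slides do not say which coefficient was used. The sample data in the usage lines is synthetic and is not the study's data.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long samples."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two samples of equal length, at least 2 items each")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return cov / sqrt(var_x * var_y)


if __name__ == "__main__":
    teacher_marks = [1, 2, 3]    # synthetic values
    computed_marks = [2, 4, 6]   # synthetic values, perfectly linear in the first list
    print(round(pearson(teacher_marks, computed_marks), 6))  # 1.0
```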
24. The tool: LilyPond
Provides musicians with a DSL to type chord notes combining
melody and lyrics and then generates the graphical output
25. Case study: design & data collection
Objective:
Check whether automatic metrics computation for analysing the quality of LilyPond
sheet music is suitable for musicians
Settings:
20 musicians (researchers, teachers, musicologists, postgraduate students).
Procedure:
1. Fill in a pretest questionnaire
2. Analyse a given sheet music with the naked eye
3. Compare their findings with the syntactic error or bad practice warnings issued
by SonarQube.
4. Fill in a final questionnaire
26. Applying Xtext2Sonar
To analyse sheet music, it is necessary to create the Java artefacts
required by SonarQube.
1050 lines of Java code were automatically generated from the
Xtext grammar file of LilyPond containing a total of 82 rules.
28. Case study: results
Learnability: Have you been able to access SonarQube and visualise
the errors and bad practices issued by the tool for the sheet music made
with LilyPond? (Yes/No)
→ 93.8% of the musicians with experience with composition software said
Yes
Efficiency: Do you think that the errors and warnings issued by SonarQube
correspond to the ones you found when you observed the sheet music in
PDF? (Yes/No)
→ 65% of the participants gave us a positive response
Satisfaction: Would you consider the inclusion of this type of tool to
analyse the quality of sheet music? (Yes/No)
→ 100% of the participants with experience with music software
responded affirmatively.
Utility: How useful would you consider this tool? (0-10)
30. Conclusion
Problem: DSL toolkits do not include support for advanced
analytics (reports and alerts, derived metrics, charts, historical
data and so on)
Scope: Xtext languages (ANTLR LL(*) grammars) and the
SonarQube platform
Solution: providing a model-driven based tool that
automatically builds language recognisers for quality platforms.
Use cases: Vary, LilyPond and others (e.g. Sculptor, TANGO
and Eclipse SmartHome)
31. Conclusion
Limitation: the entire process is not completely automated
● Developers must include practitioners’ specific metrics and
quality rules by programming AST visitors in Java or XPath rules
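The visitor idea behind such hand-written rules can be sketched as follows. The slides say the real rules are written in Java (or as XPath rules); this Python version only illustrates the pattern. The Node type, the node kind "VariableDeclaration" and the minimum length are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical AST node: a kind, an optional text value, a line and children."""
    kind: str
    value: str = ""
    line: int = 0
    children: list = field(default_factory=list)


class ShortVariableNameCheck:
    """Report variable declarations whose name is shorter than `min_length`."""

    def __init__(self, min_length=3):
        self.min_length = min_length
        self.issues = []

    def visit(self, node):
        """Check this node, then recurse into its children (depth-first)."""
        if node.kind == "VariableDeclaration" and len(node.value) < self.min_length:
            self.issues.append((node.line, f"variable name '{node.value}' is too short"))
        for child in node.children:
            self.visit(child)
        return self.issues


if __name__ == "__main__":
    tree = Node("Algorithm", children=[
        Node("VariableDeclaration", "x", 2),
        Node("VariableDeclaration", "total", 3),
    ])
    print(ShortVariableNameCheck().visit(tree))
    # [(2, "variable name 'x' is too short")]
```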
Benefits:
● DSL users will harness the features of quality platforms
● DSL developers will reduce programming effort
Future works:
● Static analysis of visual DSLs, e.g. Blockly
● Model-driven development of augmented reality-based editors
for DSLs (on-going, under journal review)
Quality management: planning, controlling, assuring and improving the quality of organisations, products or services.
The TD metaphor holds that doing things quickly and carelessly may cause future costs and effort to fix a software project.
Quality platforms can check programs written either in their built-in languages or in new ones.
SonarQube is particularly prevalent nowadays, owing to its ability to parse a great number of GPLs and its open-source nature.
M1: Cyclomatic Complexity (CC)
M2: Source Lines of Code (SLOC)
M3: Percentage of duplicated code (DC)
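A naive text-level approximation of M1 and M2 is sketched below. The decision keywords and the comment prefix are hypothetical, because the slides do not list Vary's pseudocode keywords; a real analyser would compute both metrics on the syntax tree rather than on raw text.

```python
import re

# Hypothetical decision keywords; Vary's actual keywords are not listed in the slides.
DECISION_KEYWORDS = re.compile(r"\b(if|while|for|case)\b")


def sloc(source, comment_prefix="//"):
    """Count non-blank lines that are not pure comment lines."""
    return sum(
        1
        for line in source.splitlines()
        if line.strip() and not line.strip().startswith(comment_prefix)
    )


def cyclomatic_complexity(source):
    """Approximate CC as 1 plus the number of decision keywords in the text.

    Keywords inside comments or strings are counted too, which a
    tree-based implementation would avoid.
    """
    return 1 + len(DECISION_KEYWORDS.findall(source))


SAMPLE = """\
// sum the positive numbers
read n
while n > 0
    if n > 10
        write n
    read n
"""

if __name__ == "__main__":
    print(sloc(SAMPLE))                   # 5
    print(cyclomatic_complexity(SAMPLE))  # 3
```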
QR1: Algorithm with no documentation.
QR2: Variable name too long.
QR3: Variable name is too short and may be meaningless.
QR4: Use of an input sentence (read) without using an output sentence (write).
QR5: Too many global variables.
QR6: Code lines that are too long.
QR7: Procedure or function with no documentation.
QR8: Lines with bad indentation.
There are no consecutive silent beats.
When a note is held in the previous beat, a precautionary alteration must be included.
Check that the tempo has been defined at the beginning.
Check that the time change has been made correctly.
The music sheet title must be defined.
The music sheet composer must be defined.
A page step cannot exist in a repeat volta instruction.
Lines of code should not be too long.
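A few of these rules can be approximated directly on the LilyPond source text, as in the sketch below. It relies only on standard LilyPond constructs (the \header block with title and composer fields, and the \tempo command). The rule keys and the 80-character limit are hypothetical, and a real SonarQube check would inspect the syntax tree instead of using regular expressions.

```python
import re

MAX_LINE_LENGTH = 80  # hypothetical threshold; the slides do not state one


def check_sheet(source):
    """Return a list of (rule, message) pairs for a LilyPond source string."""
    issues = []
    # Naive: stops at the first closing brace, so nested braces in the header are not handled.
    header = re.search(r"\\header\s*\{(.*?)\}", source, re.DOTALL)
    fields = header.group(1) if header else ""
    for name in ("title", "composer"):
        if not re.search(rf"\b{name}\s*=", fields):
            issues.append((f"missing-{name}", f"the sheet music {name} must be defined"))
    # Only checks that a tempo exists somewhere, not that it is at the beginning.
    if not re.search(r"\\tempo\b", source):
        issues.append(("missing-tempo", "no tempo has been defined"))
    for number, line in enumerate(source.splitlines(), start=1):
        if len(line) > MAX_LINE_LENGTH:
            issues.append(("line-too-long", f"line {number} exceeds {MAX_LINE_LENGTH} characters"))
    return issues


SAMPLE = r"""
\header {
  title = "Study"
}
\relative c' {
  \tempo 4 = 90
  c4 d e f
}
"""

if __name__ == "__main__":
    print(check_sheet(SAMPLE))
    # [('missing-composer', 'the sheet music composer must be defined')]
```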