The beginning of every software analysis and visualization process is data acquisition. However, there are various sources of data about a software system. The methods used to extract the relevant data are as diverse as the sources are. Furthermore, integration and storage of heterogeneous data from different software artifacts to form a unified data source are very challenging. In this paper, we introduce an extensible open source stack to take the first step to solve these challenges. We show its feasibility by analyzing and visualizing JUnit and provide answers regarding the schema, selection, and implementation of software artifacts’ data.
In VISSOFT'18: Proceedings of the 6th IEEE Working Conference on Software Visualization, 2018. Paper preprint: https://easychair.org/publications/preprint/893N
Towards an Open Source Stack to Create a Unified Data Source for Software Analysis and Visualization [VISSOFT 2018]
1. TOWARDS AN OPEN SOURCE STACK TO
CREATE A UNIFIED DATA SOURCE FOR
SOFTWARE ANALYSIS AND VISUALIZATION
VISSOFT 2018
Madrid, September 24, 2018
Richard Müller, Dirk Mahler, Michael Hunger, Jens Nerche,
Markus Harrer
2. VISSOFT 2018 | Towards an Open Source Stack to Create a Unified Data Source for Software Analysis and Visualization
Richard Müller, Dirk Mahler, Michael Hunger, Jens Nerche, Markus Harrer
HOW TO GET A UNIFIED DATA SOURCE FOR
SOFTWARE ANALYSIS AND VISUALIZATION?
2
?
[Diehl 2007]
3. VISSOFT 2018 | Towards an Open Source Stack to Create a Unified Data Source for Software Analysis and Visualization
Richard Müller, Dirk Mahler, Michael Hunger, Jens Nerche, Markus Harrer
HOW TO GET A UNIFIED DATA SOURCE FOR
SOFTWARE ANALYSIS AND VISUALIZATION?
3
Software data naturally maps to a
multivariate, compound, attributed,
and time-dependent graph
[Diehl & Telea 2014]
4. VISSOFT 2018 | Towards an Open Source Stack to Create a Unified Data Source for Software Analysis and Visualization
Richard Müller, Dirk Mahler, Michael Hunger, Jens Nerche, Markus Harrer
CHALLENGES IN CREATING, STORING, AND
QUERYING SOFTWARE DATA
1. Schema
How to model a given aspect of a software system in
terms of entities, relations, and attributes?
2. Selection
How to select data relevant for a given task from an entire
graph?
3. Implementation
How to store the graph in a way that is efficient for quickly
reading and writing large amounts of data?
4
[Diehl & Telea 2014]
5. VISSOFT 2018 | Towards an Open Source Stack to Create a Unified Data Source for Software Analysis and Visualization
Richard Müller, Dirk Mahler, Michael Hunger, Jens Nerche, Markus Harrer
OPEN SOURCE STACK
5
6. VISSOFT 2018 | Towards an Open Source Stack to Create a Unified Data Source for Software Analysis and Visualization
Richard Müller, Dirk Mahler, Michael Hunger, Jens Nerche, Markus Harrer
DATA ACQUISITION
Plugins scan different kinds of software
artifacts, e.g.,
Java bytecode, Git, Jacoco, Findbugs
The extracted data is stored as a graph in
a Neo4j database
6
7. VISSOFT 2018 | Towards an Open Source Stack to Create a Unified Data Source for Software Analysis and Visualization
Richard Müller, Dirk Mahler, Michael Hunger, Jens Nerche, Markus Harrer
ANALYSIS
Aggregation and enrichment
Rules analyze and modify the graphs
Concepts aggregate, enrich, and unify the data
7
8. VISSOFT 2018 | Towards an Open Source Stack to Create a Unified Data Source for Software Analysis and Visualization
Richard Müller, Dirk Mahler, Michael Hunger, Jens Nerche, Markus Harrer
ANALYSIS
Filtering
Cypher queries filter relevant data and
return tabular or graph results
8
9. VISSOFT 2018 | Towards an Open Source Stack to Create a Unified Data Source for Software Analysis and Visualization
Richard Müller, Dirk Mahler, Michael Hunger, Jens Nerche, Markus Harrer
VISUALIZATION
Views are dynamically created based on the
query result using D3 visualization components
embedded in a React app
9
Hotspot analysis
view
type = nesting of
circles
loc = size of circles
commits = color of
circles
10. VISSOFT 2018 | Towards an Open Source Stack to Create a Unified Data Source for Software Analysis and Visualization
Richard Müller, Dirk Mahler, Michael Hunger, Jens Nerche, Markus Harrer
PROOF OF CONCEPT
Software Analysis and Visualization Dashboard as one
visualization frontend for the open source stack
Supports project leaders in decision-making
Provides interactive views concerning architecture and
dependencies as well as resource, risk, and quality
management
https://github.com/softvis-research/jqa-dashboard
Screencast demonstrating the data acquisition, analysis,
and visualization of JUnit
https://youtu.be/LebVqfzQ_KE
10
11. VISSOFT 2018 | Towards an Open Source Stack to Create a Unified Data Source for Software Analysis and Visualization
Richard Müller, Dirk Mahler, Michael Hunger, Jens Nerche, Markus Harrer
RELATED WORK
Tools for parsing, modeling, and querying software data
Moose [Nierstrasz et al. 2005]
Famix, Dynamix, and Hismo [Ducasse et al. 2011, Greevy 2007, Gîrba
et al. 2005]
Rascal [Klint et al. 2009]
Recent approaches using a graph database for storing
and querying software data
VerX-Combo [Yano et al. 2015]
Swarm Debugging Prototype [Petrillo et al. 2015]
VIMETRIK and KNIME [Khan et al. 2015, Berthold et al. 2009]
11
12. VISSOFT 2018 | Towards an Open Source Stack to Create a Unified Data Source for Software Analysis and Visualization
Richard Müller, Dirk Mahler, Michael Hunger, Jens Nerche, Markus Harrer
BENEFITS OF THE PROPOSED STACK
Fills research gap
Tackles the challenges in creating, storing, and querying
heterogeneous software data
Customizable
Is extendable with scanner plugins for further software
artifacts
Reusable
Has an interchangeable visualization frontend
Open Source
Consists of loosely coupled Open Source components
12
13. VISSOFT 2018 | Towards an Open Source Stack to Create a Unified Data Source for Software Analysis and Visualization
Richard Müller, Dirk Mahler, Michael Hunger, Jens Nerche, Markus Harrer
REFERENCES
S. Diehl, Software Visualization: Visualizing the Structure, Behaviour, and Evolution of Software. Springer, 2007.
S. Diehl and A. C. Telea, Multivariate Graphs in Software Engineering, ser. Lecture Notes in Computer Science. Cham:
Springer International Publishing, 2014, vol. 8380, ch. 2, pp. 13–36.
O. Nierstrasz, S. Ducasse, and T. Gîrba, “The story of moose: an agile reengineering environment,” in Proc. 10th Eur.
Softw. Eng. Conf. held jointly with 13th SIGSOFT Int. Symp. Found. Softw. Eng., vol. 30. Lisbon, Portugal: ACM, 2005,
pp. 1–10.
S. Ducasse, N. Anquetil, U. Bhatti, A. C. Hora, J. Laval, and T. Gˆırba, “MSE and FAMIX 3.0: an interexchange format
and source code model family,” p. 40, 2011.
O. Greevy, “Dynamix - a meta-model to support feature-centric analy- sis,” in 1st Int. Work. FAMIX Moose
Reengineering, 2007.
T. Gîrba, J. M. Favre, and S. Ducasse, “Using meta-model transforma- tion to model software evolution,” Electron.
Notes Theor. Comput. Sci., vol. 137, no. 3, pp. 57–64, 2005.
P. Klint, T. van der Storm, and J. Vinju, “RASCAL: A Domain Specific Language for Source Code Analysis and
Manipulation,” in 2009 9th IEEE Int. Work. Conf. Source Code Anal. Manip., 2009, pp. 168–177.
Y. Yano, R. G. Kula, T. Ishio, and K. Inoue, “VerXCombo: An Interactive Data Visualization of Popular Library Version
Combinations,” in 23rd Int. Conf. Progr. Compr., 2015, pp. 291–294.
F. Petrillo, G. Lacerda, M. Pimenta, and C. Freitas, “Visualizing inter- active and shared debugging sessions,” in 3rd
IEEE Work. Conf. Softw. Vis., 2015, pp. 140–144.
T. Khan, H. Barthel, A. Ebert, and P. Liggesmeyer, “Visual analytics of software structure and metrics,” in 3rd IEEE
Work. Conf. Softw. Vis., 2015, pp. 16–25.
M. R. Berthold, N. Cebron, F. Dill, T. R. Gabriel, T. Kötter, T. Meinl, P. Ohl, K. Thiel, and B. Wiswedel, “KNIME - the
Konstanz information miner,” ACM SIGKDD Explor. Newsl., vol. 11, no. 1, p. 26, 2009.
13
14. THANK YOU.
Richard Müller, Dirk Mahler, Michael Hunger, Jens Nerche, Markus Harrer
rmueller@wifa.uni-leipzig.de
@rimllr
https://github.com/softvis-research
http://softvis.wifa.uni-leipzig.de
14