Successfully reported this slideshow.
Your SlideShare is downloading. ×

Towards an Open Source Stack to Create a Unified Data Source for Software Analysis and Visualization [VISSOFT 2018]

Towards an Open Source Stack to Create a Unified Data Source for Software Analysis and Visualization [VISSOFT 2018]

Download to read offline

The beginning of every software analysis and visualization process is data acquisition. However, there are various sources of data about a software system. The methods used to extract the relevant data are as diverse as the sources are. Furthermore, integration and storage of heterogeneous data from different software artifacts to form a unified data source are very challenging. In this paper, we introduce an extensible open source stack to take the first step to solve these challenges. We show its feasibility by analyzing and visualizing JUnit and provide answers regarding the schema, selection, and implementation of software artifacts’ data.

In VISSOFT'18: Proceedings of the 6th IEEE Working Conference on Software Visualization, 2018. Paper preprint: https://easychair.org/publications/preprint/893N

The beginning of every software analysis and visualization process is data acquisition. However, there are various sources of data about a software system. The methods used to extract the relevant data are as diverse as the sources are. Furthermore, integration and storage of heterogeneous data from different software artifacts to form a unified data source are very challenging. In this paper, we introduce an extensible open source stack to take the first step to solve these challenges. We show its feasibility by analyzing and visualizing JUnit and provide answers regarding the schema, selection, and implementation of software artifacts’ data.

In VISSOFT'18: Proceedings of the 6th IEEE Working Conference on Software Visualization, 2018. Paper preprint: https://easychair.org/publications/preprint/893N

Towards an Open Source Stack to Create a Unified Data Source for Software Analysis and Visualization [VISSOFT 2018]

  1. 1. TOWARDS AN OPEN SOURCE STACK TO CREATE A UNIFIED DATA SOURCE FOR SOFTWARE ANALYSIS AND VISUALIZATION VISSOFT 2018 Madrid, September 24, 2018 Richard Müller, Dirk Mahler, Michael Hunger, Jens Nerche, Markus Harrer
  2. 2. VISSOFT 2018 | Towards an Open Source Stack to Create a Unified Data Source for Software Analysis and Visualization Richard Müller, Dirk Mahler, Michael Hunger, Jens Nerche, Markus Harrer HOW TO GET A UNIFIED DATA SOURCE FOR SOFTWARE ANALYSIS AND VISUALIZATION? 2 ? [Diehl 2007]
  3. 3. VISSOFT 2018 | Towards an Open Source Stack to Create a Unified Data Source for Software Analysis and Visualization Richard Müller, Dirk Mahler, Michael Hunger, Jens Nerche, Markus Harrer HOW TO GET A UNIFIED DATA SOURCE FOR SOFTWARE ANALYSIS AND VISUALIZATION? 3 Software data naturally maps to a multivariate, compound, attributed, and time-dependent graph [Diehl & Telea 2014]
  4. 4. VISSOFT 2018 | Towards an Open Source Stack to Create a Unified Data Source for Software Analysis and Visualization Richard Müller, Dirk Mahler, Michael Hunger, Jens Nerche, Markus Harrer CHALLENGES IN CREATING, STORING, AND QUERYING SOFTWARE DATA 1. Schema  How to model a given aspect of a software system in terms of entities, relations, and attributes? 2. Selection  How to select data relevant for a given task from an entire graph? 3. Implementation  How to store the graph in a way that is efficient for quickly reading and writing large amounts of data? 4 [Diehl & Telea 2014]
  5. 5. VISSOFT 2018 | Towards an Open Source Stack to Create a Unified Data Source for Software Analysis and Visualization Richard Müller, Dirk Mahler, Michael Hunger, Jens Nerche, Markus Harrer OPEN SOURCE STACK 5
  6. 6. VISSOFT 2018 | Towards an Open Source Stack to Create a Unified Data Source for Software Analysis and Visualization Richard Müller, Dirk Mahler, Michael Hunger, Jens Nerche, Markus Harrer DATA ACQUISITION  Plugins scan different kinds of software artifacts, e.g.,  Java bytecode, Git, Jacoco, Findbugs  The extracted data is stored as a graph in a Neo4j database 6
  7. 7. VISSOFT 2018 | Towards an Open Source Stack to Create a Unified Data Source for Software Analysis and Visualization Richard Müller, Dirk Mahler, Michael Hunger, Jens Nerche, Markus Harrer ANALYSIS  Aggregation and enrichment  Rules analyze and modify the graphs  Concepts aggregate, enrich, and unify the data 7
  8. 8. VISSOFT 2018 | Towards an Open Source Stack to Create a Unified Data Source for Software Analysis and Visualization Richard Müller, Dirk Mahler, Michael Hunger, Jens Nerche, Markus Harrer ANALYSIS  Filtering  Cypher queries filter relevant data and return tabular or graph results 8
  9. 9. VISSOFT 2018 | Towards an Open Source Stack to Create a Unified Data Source for Software Analysis and Visualization Richard Müller, Dirk Mahler, Michael Hunger, Jens Nerche, Markus Harrer VISUALIZATION  Views are dynamically created based on the query result using D3 visualization components embedded in a React app 9 Hotspot analysis view type = nesting of circles loc = size of circles commits = color of circles
  10. 10. VISSOFT 2018 | Towards an Open Source Stack to Create a Unified Data Source for Software Analysis and Visualization Richard Müller, Dirk Mahler, Michael Hunger, Jens Nerche, Markus Harrer PROOF OF CONCEPT  Software Analysis and Visualization Dashboard as one visualization frontend for the open source stack  Supports project leaders in decision-making  Provides interactive views concerning architecture and dependencies as well as resource, risk, and quality management  https://github.com/softvis-research/jqa-dashboard  Screencast demonstrating the data acquisition, analysis, and visualization of JUnit  https://youtu.be/LebVqfzQ_KE 10
  11. 11. VISSOFT 2018 | Towards an Open Source Stack to Create a Unified Data Source for Software Analysis and Visualization Richard Müller, Dirk Mahler, Michael Hunger, Jens Nerche, Markus Harrer RELATED WORK  Tools for parsing, modeling, and querying software data  Moose [Nierstrasz et al. 2005]  Famix, Dynamix, and Hismo [Ducasse et al. 2011, Greevy 2007, Gîrba et al. 2005]  Rascal [Klint et al. 2009]  Recent approaches using a graph database for storing and querying software data  VerX-Combo [Yano et al. 2015]  Swarm Debugging Prototype [Petrillo et al. 2015]  VIMETRIK and KNIME [Khan et al. 2015, Berthold et al. 2009] 11
  12. 12. VISSOFT 2018 | Towards an Open Source Stack to Create a Unified Data Source for Software Analysis and Visualization Richard Müller, Dirk Mahler, Michael Hunger, Jens Nerche, Markus Harrer BENEFITS OF THE PROPOSED STACK  Fills research gap  Tackles the challenges in creating, storing, and querying heterogeneous software data  Customizable  Is extendable with scanner plugins for further software artifacts  Reusable  Has an interchangeable visualization frontend  Open Source  Consists of loosely coupled Open Source components 12
  13. 13. VISSOFT 2018 | Towards an Open Source Stack to Create a Unified Data Source for Software Analysis and Visualization Richard Müller, Dirk Mahler, Michael Hunger, Jens Nerche, Markus Harrer REFERENCES  S. Diehl, Software Visualization: Visualizing the Structure, Behaviour, and Evolution of Software. Springer, 2007.  S. Diehl and A. C. Telea, Multivariate Graphs in Software Engineering, ser. Lecture Notes in Computer Science. Cham: Springer International Publishing, 2014, vol. 8380, ch. 2, pp. 13–36.  O. Nierstrasz, S. Ducasse, and T. Gîrba, “The story of moose: an agile reengineering environment,” in Proc. 10th Eur. Softw. Eng. Conf. held jointly with 13th SIGSOFT Int. Symp. Found. Softw. Eng., vol. 30. Lisbon, Portugal: ACM, 2005, pp. 1–10.  S. Ducasse, N. Anquetil, U. Bhatti, A. C. Hora, J. Laval, and T. Gˆırba, “MSE and FAMIX 3.0: an interexchange format and source code model family,” p. 40, 2011.  O. Greevy, “Dynamix - a meta-model to support feature-centric analy- sis,” in 1st Int. Work. FAMIX Moose Reengineering, 2007.  T. Gîrba, J. M. Favre, and S. Ducasse, “Using meta-model transforma- tion to model software evolution,” Electron. Notes Theor. Comput. Sci., vol. 137, no. 3, pp. 57–64, 2005.  P. Klint, T. van der Storm, and J. Vinju, “RASCAL: A Domain Specific Language for Source Code Analysis and Manipulation,” in 2009 9th IEEE Int. Work. Conf. Source Code Anal. Manip., 2009, pp. 168–177.  Y. Yano, R. G. Kula, T. Ishio, and K. Inoue, “VerXCombo: An Interactive Data Visualization of Popular Library Version Combinations,” in 23rd Int. Conf. Progr. Compr., 2015, pp. 291–294.  F. Petrillo, G. Lacerda, M. Pimenta, and C. Freitas, “Visualizing inter- active and shared debugging sessions,” in 3rd IEEE Work. Conf. Softw. Vis., 2015, pp. 140–144.  T. Khan, H. Barthel, A. Ebert, and P. Liggesmeyer, “Visual analytics of software structure and metrics,” in 3rd IEEE Work. Conf. Softw. Vis., 2015, pp. 16–25.  M. R. Berthold, N. Cebron, F. Dill, T. R. Gabriel, T. Kötter, T. Meinl, P. Ohl, K. Thiel, and B. Wiswedel, “KNIME - the Konstanz information miner,” ACM SIGKDD Explor. Newsl., vol. 11, no. 1, p. 26, 2009. 13
  14. 14. THANK YOU. Richard Müller, Dirk Mahler, Michael Hunger, Jens Nerche, Markus Harrer rmueller@wifa.uni-leipzig.de @rimllr https://github.com/softvis-research http://softvis.wifa.uni-leipzig.de 14

×