Web2MexADL - CSMR Presentation

Presentation of the Web2MexADL project

  • Hello, my name is Juan Carlos Castrejón and today I’m going to talk about Web2MexADL, a tool intended to discover and help maintain the architecture of software systems, in particular web applications. This work is part of a collaboration between the Tecnológico de Monterrey in Mexico and a couple of informatics laboratories in Grenoble, France.
  • First, I’m going to describe the general context and the motivation of our tool. Then, I’ll explain the particular objectives and the contribution of Web2MexADL. I’ll also describe the discovery process and the details of our current implementation. Then, I’m going to demonstrate its use in a sample scenario, showing how it can help verify the maintainability of web systems. Finally, I’ll present our conclusions and future work.
  • Let’s begin by talking about a common scenario in software engineering. We can start by developing particular classes (or components) to implement the required logic in our system. These components are usually grouped into modules, according to the different functionalities of our application. As the system grows larger, we usually develop more than one module and define interactions between them. The representation of the system components, and of the relations between them, is documented in one or more architecture views. These views can vary according to the development process that we use, or to the particular intent that we try to communicate. The architecture views can then guide software development and maintenance. We can rely on one or more architecture patterns to identify the types of components that are part of our system. These patterns convey common structures and interactions that are proven to solve particular requirements. In summary, software architecture should guide software development and maintenance. For this, the architecture needs to be documented in one or more architecture views. And for the generation of these views, we can take architecture patterns as reference models.
  • However, when we try to apply this theory to the development of a real-world web application, we may face several problems. First, we need to choose an architecture pattern that can serve as a reference model for our application. A common choice for web applications is the Model-View-Controller (MVC) pattern, due to its natural separation of business, presentation and control logic. Assuming we rely on the MVC pattern, we are probably going to face the following questions. During system development, how can we be sure that we are really following the MVC pattern? For a particular class, how can we know if it’s a model, a view or a controller? And finally, can we rely on up-to-date documentation to make this analysis?
  • Software architecture discovery is a popular topic in both research and industrial environments. In particular, the use of probabilistic models to analyze the source code of software systems is a widespread technique among reverse engineering tools. These methods differ in the combination of random variables, in the algorithms, and in the nature of the training data. For example, the approach described in [Corazza 2010] builds the probabilistic model based on variables, methods and class signatures. However, this model is intended for general use and is not trained with historical data of any particular domain. In [Maqbool 2007] a similar approach is proposed, based on the analysis of global variables and on the definition of a Naïve Bayes classifier. In comparison, our tool relies on a wider set of variables for the discovery process and is open to a variety of probabilistic models. To recover the architecture of software systems, we can also rely on open-source or commercial tools such as Klocwork Architect and Structure 101. These tools deliver a good starting point for the analysis of software systems. However, the advantage of Web2MexADL is that the resulting documentation is based on architecture descriptions that can later be used to verify the maintainability of the system under analysis.
  • Let’s talk about the specific objectives and contribution of our tool.
  • Our tool has two main objectives. The first one is to recover the software architecture of Java web systems, based either on the MVC pattern or on the identification of clusters. The recovered architecture is represented in two architecture views: an Architecture Description Language (ADL) document and a Scalable Vector Graphics (SVG) file. The former includes the system components, their expected interactions and the information required to verify its maintainability intent. The latter includes the classification of each system component, either in MVC layers or in clusters. The second objective of our tool is to help verify the maintainability of the recovered architecture. For this, we rely on the MexADL verification approach. I’ll explain this approach in the following slides. The main contribution of our tool is a probabilistic approach for the generation of architecture documentation based on MexADL.
  • Our tool is not an isolated effort. Web2MexADL is part of an initiative intended to support software development based on architecture notations, and includes the participation of universities in Mexico and in France. This initiative includes tools for software architecture definition, discovery and verification. All of these tools are open-source and can be easily added to current Java development environments, by means of Eclipse plugins.
  • In particular, our tool relies on MexADL to help verify the maintainability of software systems. MexADL is a verification approach, based on the ISO/IEC SQuaRE quality model, that relies on architecture documentation (ADL) containing quality metrics and valid relations between system components. These metrics and relations are then verified using Aspect Oriented Programming. MexADL is an extension of xADL, from the University of California, Irvine. Maintainability is understood here as the degree of effectiveness and efficiency with which a software product can be modified. The result of the maintainability verification is a pair of HTML reports, generated after each system compilation. These reports contain the quality metrics and the interactions analysis.
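The idea of verifying the valid relations between system components can be illustrated with a small Python sketch. It is not how MexADL works internally (MexADL reads the relations from an ADL document and checks them with Aspect Oriented Programming); the `ALLOWED` table, the `find_violations` helper and the class names are hypothetical and only serve the example.

```python
# Hypothetical valid interactions between MVC components: source -> allowed targets.
ALLOWED = {
    "controller": {"model", "view"},
    "view": {"model"},
    "model": set(),
}

def find_violations(dependencies, layer_of, allowed=ALLOWED):
    """Return the (source, target) pairs whose components may not interact.

    dependencies: iterable of (source_class, target_class) pairs.
    layer_of: mapping from class name to its component (layer) name.
    """
    violations = []
    for source, target in dependencies:
        src_layer, dst_layer = layer_of[source], layer_of[target]
        # Dependencies inside one component are always accepted in this sketch.
        if src_layer == dst_layer:
            continue
        if dst_layer not in allowed.get(src_layer, set()):
            violations.append((source, target))
    return violations

# Illustrative artifacts and dependencies (assumed, not taken from a real system).
layer_of = {"OwnerController": "controller", "Owner": "model", "ownerForm.jsp": "view"}
dependencies = [
    ("OwnerController", "Owner"),   # controller -> model: allowed
    ("Owner", "OwnerController"),   # model -> controller: not allowed
    ("ownerForm.jsp", "Owner"),     # view -> model: allowed
]
print(find_violations(dependencies, layer_of))
```

In this sketch only the dependency from the model class to the controller class is reported, which is the kind of finding an interactions report would list.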
  • Now I’ll explain the details of the probabilistic model used by our tool.
  • Remember that the objective of our tool is to recover software architecture based either on the MVC pattern or on a clustered distribution. For the recovery of MVC-based architectures, we rely on a technique known as supervised learning. This technique requires training data from which a classification function can be inferred. The results of this classification are layers of the MVC pattern. To obtain the training data, we analyzed popular Java web development frameworks and representative applications developed with them. In particular, we relied on the Grails, Spring Roo, Play and Struts 2 frameworks, and on 17 sample applications included in their distributions. In total, we analyzed 619 source code artifacts, including Java classes and JSP, CSS and HTML files. The analysis was conducted using the following 4 variables: ExternalAPI, Type, Suffix and MVC Layer. We classified the 619 source code artifacts by conducting a manual analysis of each artifact and by choosing appropriate values for their 4 associated variables. The classification function was generated using the Weka project, a tool that provides a collection of machine learning algorithms. For the current implementation, we chose the following configuration for the classification function: BayesNet classifier, Simple estimator and TAN search algorithm. With this configuration, the following Bayesian network is generated. It reaches 87% effectiveness when evaluated with the training set as the test option, that is, on the same data it was trained on. However, our tool doesn’t depend on this particular classification function. Using Weka, we could create another probabilistic model based on the same training data. This gives our tool great flexibility. The recovery of a clustered-based architecture is simpler in comparison to the MVC approach. We use the same variables, except for the MVC Layer, and we don’t require a training set. We rely on the execution of clustering algorithms through the Java-ML project, an open-source tool that includes implementations of machine learning algorithms. In the current implementation, we rely on the Expectation-Maximization algorithm to identify clusters but, as with the MVC approach, we are not tied to this particular algorithm. We can change to other clustering algorithms using the Java-ML interface.
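To make the supervised step concrete, here is a minimal Python sketch of a classifier trained on (Type, ExternalAPI, Suffix) tuples labeled with an MVC layer. It is a simplified stand-in, not the tool's model: it uses a plain categorical naive Bayes with add-one smoothing instead of the TAN-structured Bayesian network built with Weka, and the six training rows are invented for illustration, not taken from the 619 classified artifacts.

```python
from collections import Counter, defaultdict
import math

class NaiveBayes:
    """Categorical naive Bayes with Laplace (add-one) smoothing."""

    def fit(self, rows, labels):
        self.labels = Counter(labels)
        n_features = len(rows[0])
        # counts[i][label][value] = times feature i took `value` in class `label`
        self.counts = [defaultdict(Counter) for _ in range(n_features)]
        self.values = [set() for _ in range(n_features)]
        for row, label in zip(rows, labels):
            for i, value in enumerate(row):
                self.counts[i][label][value] += 1
                self.values[i].add(value)
        return self

    def predict(self, row):
        total = sum(self.labels.values())
        best_label, best_score = None, -math.inf
        for label, n in self.labels.items():
            score = math.log(n / total)  # log prior of the class
            for i, value in enumerate(row):
                seen = self.counts[i][label][value]
                score += math.log((seen + 1) / (n + len(self.values[i])))
            if score > best_score:
                best_label, best_score = label, score
        return best_label

def training_set_accuracy(model, rows, labels):
    """Fraction of training rows classified correctly (same data for train and test)."""
    hits = sum(model.predict(row) == label for row, label in zip(rows, labels))
    return hits / len(rows)

# Invented toy rows: (Type, ExternalAPI, Suffix) -> MVC layer.
rows = [
    ("java", "springmvc", "controller"),
    ("java", "none", "servlet"),
    ("java", "hibernate", "dao"),
    ("java", "jdbc", "manager"),
    ("jsp", "none", "none"),
    ("html", "none", "none"),
]
labels = ["controller", "controller", "model", "model", "view", "view"]

model = NaiveBayes().fit(rows, labels)
print(model.predict(("java", "springmvc", "form")))
print(training_set_accuracy(model, rows, labels))
```

Measuring accuracy on the training rows, as `training_set_accuracy` does, mirrors the "training set as test option" evaluation mentioned above; it tends to be optimistic, which is one reason further validation on unseen projects is useful.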
  • Now I’ll explain the details of the probabilistic model used by our tool.
  • To recover the architecture of a particular web system, its compiled artifacts are analyzed using the ASM framework, a library that allows the analysis of Java bytecode. Each artifact is analyzed in order to assign values to the Type, ExternalAPI and Suffix variables. Once the values are assigned, this information is sent either to the Bayesian network or to the EM algorithm, in order to classify the artifact into an MVC or a clustered-based architecture, respectively. The classification is executed using Weka and Java-ML. Our tool uses the classification results to generate two architecture views. The first one is an SVG file that depicts the classification of each artifact, using a color notation. To generate this file we rely on the Graphviz project, an open-source graph visualization project. The second architecture view is an ADL document that contains the system components (that is, MVC layers or clusters) and their proposed interactions. This document is generated using templates included in our tool. These templates also include a set of proposed values for the quality metrics that are required by MexADL.
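The assignment of values to the Type, ExternalAPI and Suffix variables can be sketched as follows. This Python example is only an approximation under stated assumptions: the real tool inspects Java bytecode with ASM, whereas here the referenced packages are passed in as plain strings, and both the `extract_features` helper and the package-prefix table `API_PREFIXES` are hypothetical. The candidate values for each variable are the ones listed on the "Architecture views" slide.

```python
import os

TYPES = ("java", "jsp", "xml", "html")
SUFFIXES = ("controller", "service", "validator", "context", "servlet",
            "web", "aspect", "form", "dao", "manager")
# Assumed mapping from referenced package prefixes to ExternalAPI values.
API_PREFIXES = {
    "org.springframework.web": "springmvc",
    "org.aspectj": "aspectj",
    "org.hibernate": "hibernate",
    "java.sql": "jdbc",
}

def extract_features(path, referenced_packages=()):
    """Return (Type, ExternalAPI, Suffix) for one artifact of a web application."""
    name, ext = os.path.splitext(os.path.basename(path))
    ext = ext.lstrip(".").lower()
    # Compiled classes are reported as Java artifacts in this sketch.
    if ext == "class":
        ext = "java"
    artifact_type = ext if ext in TYPES else "none"

    # First referenced package that matches a known prefix decides the ExternalAPI.
    external_api = "none"
    for package in referenced_packages:
        match = next((api for prefix, api in API_PREFIXES.items()
                      if package.startswith(prefix)), None)
        if match is not None:
            external_api = match
            break

    # The Suffix is the first known suffix that ends the artifact name.
    lowered = name.lower()
    suffix = next((s for s in SUFFIXES if lowered.endswith(s)), "none")
    return artifact_type, external_api, suffix

print(extract_features("WEB-INF/classes/OwnerController.class",
                       ["org.springframework.web.bind.annotation"]))
print(extract_features("WEB-INF/jsp/welcome.jsp"))
```

The resulting tuples have the same shape as the rows a classifier or a clustering algorithm would consume.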
  • Now it’s time to demonstrate the implementation of our tool by using a sample scenario.
  • Our tool is deployed as an open-source Eclipse plugin that adds a context menu to WAR files and Eclipse projects. To demonstrate its use, I’m going to rely on the SpringSource Petclinic application, a classic example of Java web development. In particular, we are going to see how to generate the SVG and ADL architecture views. [Tool demonstration, according to the following video steps: - Summary of {Develop web application – Generate WAR file} - Discover MVC architecture - Discover Cluster architecture - Generate MexADL artifacts] You can find more information on the project website.
  • I’m going to describe how the recovered architecture can help verify the maintainability of web systems.
  • After each system compilation, two HTML reports are generated. The first report contains the analysis of the quality metrics associated with the system components. The second one depicts violations of the expected interactions between system components. [Tool demonstration, according to the following video steps: - Configure project - Generate MexADL reports]
  • Conclusions and Future work
  • Conclusions
  • Future work
  • Research paper and implementation


  • 2. Outline
      • Context and motivation
      • Objectives and contribution
      • Software architecture discovery
          • Classification functions
          • Architecture generation
          • Implementation
      • Maintainability verification
      • Conclusions and future work
  • 3. Introduction
      • [Figure: architecture views labeled Design, Implementation, Use Case, Deployment and Process]
      • Software architecture should guide software development and maintenance
      • The architecture is documented in one or more architecture views
      • Architecture patterns are used as reference models for software solutions
  • 4. Problem
      • 1. Development of a traditional web application
      • 2. Use an architecture pattern as reference model (e.g., Model-View-Controller (MVC))
      • 3. Open questions
          • Is this a view, a model or a controller?
          • Are we really following the MVC pattern?
          • Is anyone maintaining the system documentation?
      • [Figure: MVC diagram with Controller, Model and View]
  • 5. Related work
      • In [Corazza 2010] a probabilistic approach is proposed to partition software systems into meaningful sub-systems
          • Analysis of variables, methods and class signatures
          • This is a general approach and does not include historical data to train the probabilistic classifier
      • In [Maqbool 2007] a Bayesian method is described to recover software systems architecture
          • Use of a Naïve Bayes classifier, based on global variables
          • Our approach considers a wider set of variables for the discovery of software architecture
      • Software architecture verification tools
          • Klocwork Architect (http://www.klocwork.com)
          • Structure 101 (http://www.headwaysoftware.com)
      • [Corazza 2010] A. Corazza, S. D. Martino, and G. Scanniello, “A probabilistic based approach towards software system clustering,” CSMR, 2010
      • [Maqbool 2007] O. Maqbool and H. Babri, “Bayesian learning for software architecture recovery,” ICEE, 2007
  • 7. Objectives and contribution
      • Objectives
          • Recover software architecture for Java web systems
              • MVC/Clustered based architecture
              • Architecture Description Language (ADL)
              • Scalable Vector Graphics (SVG)
          • Help verify their maintainability intent
      • Contribution
          • Probabilistic approach for the generation of architecture documentation based on MexADL
  • 8. Software Architecture
      • 1. Architecture definition: MVC Analyzer (http://code.google.com/p/mvc-analyzer)
      • 2. Architecture discovery: Web2MexADL (http://code.google.com/p/web2mexadl)
      • 3. Architecture verification: MexADL (http://code.google.com/p/mexadl)
      • [Figure: flow between the three tools, with artifacts labeled Architecture view, ADL, Architectural documentation, Source code, Verification classes and Verification results]
  • 9. MexADL
      • [Figure: MexADL alongside the traditional software life cycle (Requirements, Design, Implementation, Maintenance). Definition: the ADL description declares valid interactions and expected metrics. Verification: AOP inter-type declarations and AOP compile-time weaving drive the valid interactions verification, and source code analysis drives the metrics verification against the expected metrics.]
  • 11. Classification functions
      • MVC-based architecture
          • 1. Analyze training data
              • Variables: Type, Suffix, ExternalAPI, MVC Layer
              • 619 manually classified components
              • 17 representative projects (Grails, Spring Roo, Play, Struts 2)
          • 2. Generate a classification function
              • Training data + Weka = Bayesian network over Type, Suffix, ExternalAPI and MVC Layer
              • BayesNet classifier, Simple estimator, TAN search
              • 87% effectivity, using the training set as a test option
      • Clustered-based architecture
          • 1. Rely on clustering algorithms
              • Type, ExternalAPI, Suffix + Java-ML = Cluster ID
              • Without training data
              • Expectation-Maximization (EM) algorithm
  • 13. Architecture views
      • 1. Classify web components
          • Java bytecode is analyzed with ASM to assign:
              • Type: {java, jsp, xml, html, none}
              • ExternalAPI: {springmvc, aspectj, hibernate, jdbc, none}
              • Suffix: {controller, service, validator, context, servlet, web, aspect, form, dao, manager, none}
          • Weka/Java-ML (Expectation-Maximization) produce the classification results: MVC layers/Clusters
      • 2. Generate architecture views
          • Classification results (MVC layers/Clusters) + Graphviz = SVG file
          • Classification results (MVC layers/Clusters) + MVC or Cluster MexADL template = ADL document
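The "classification results + Graphviz = SVG file" step can be approximated with a few lines of Python that emit Graphviz DOT text, one colored node per artifact. This is only a sketch: the `to_dot` helper, the color choices and the artifact names are assumptions, and the actual layout produced by Web2MexADL may differ.

```python
# Assumed color notation for the MVC layers; unknown groups fall back to gray.
COLORS = {"model": "lightblue", "view": "lightgreen", "controller": "orange"}

def to_dot(classification, colors=COLORS, default_color="gray"):
    """Build Graphviz DOT text with one filled node per classified artifact.

    classification: mapping from artifact name to its MVC layer or cluster ID.
    """
    lines = ["digraph architecture {", "  node [style=filled];"]
    for artifact, group in sorted(classification.items()):
        color = colors.get(str(group), default_color)
        lines.append(f'  "{artifact}" [fillcolor={color}, tooltip="{group}"];')
    lines.append("}")
    return "\n".join(lines)

# Illustrative classification results (hypothetical artifact names).
dot_text = to_dot({
    "OwnerController": "controller",
    "Owner": "model",
    "ownerForm.jsp": "view",
})
print(dot_text)
```

Saving the output to a file such as `architecture.dot` and running `dot -Tsvg architecture.dot -o architecture.svg` would then render it as an SVG file.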
  • 15. Implementation
      • Deployed as an open-source Eclipse plugin
          • Context menu linked to WAR files and Eclipse Projects
      • Sample application: SpringSource Petclinic
      • [Screenshots: SVG view, MVC-based ADL, Clustered-based ADL, quality metrics]
      • Full details in: http://code.google.com/p/web2mexadl
  • 17. Maintainability verification
      • After each compilation:
          • Quality metrics report
          • Valid interactions report
      • Full details in: http://code.google.com/p/mexadl
  • 19. Conclusions
      • The effectivity of the probabilistic model is promising, though further validation is required
      • The generated architecture can help verify the maintainability intent of software systems
      • The approach is open to a variety of machine learning algorithms, thanks to the flexibility of the Weka and Java-ML projects
      • Our implementation can be easily integrated with current development environments
  • 20. Future work
      • To improve the classifier effectiveness, the Bayesian network should be trained with a wider set of web projects
      • Support additional languages and platforms
      • Increased support for systems outside the web application domain
  • 21. References
      • Research paper
          • J. Castrejón, R. Lozano, and G. Vargas-Solar, “Web2MexADL: Discovery and Maintainability Verification of Software Systems Architecture,” CSMR 2012 - Tool Demonstration Track
      • Implementation
          • http://code.google.com/p/web2mexadl
  • 22. Questions