Identifying Reusable Components in Web Applications

The growing market demand for Web Applications is forcing the software industry to produce applications under the pressure of a short time-to-market and strong competition, with the consequence that low-quality and poorly documented software is often produced. Maintaining, evolving, or comprehending these applications is not straightforward, and reverse engineering processes should be defined and validated to support these tasks.
In this paper, a reverse engineering approach for reconstructing an object-oriented conceptual model of the application domain of a Web Application is presented. The proposed approach defines a process that reconstructs the model in three steps. In each step, heuristic criteria exploiting source code analysis are used to identify objects and their relationships. Tools implementing this method have been produced, and experiments validating it have been carried out with the support of case studies. Experimental results showed the feasibility and the effectiveness of the proposed approach.

  • In this talk, I will address the lack of reusability.
  • The classic copy and paste. We notice that copy and paste (C&P) is often used during WA development, partly because of the developers' limited know-how. As a result, we can find many software clones. We have two problems to solve: identification of clones and reuse of clones.
  • What is a clone?
  • The fundamental distances are the Levenshtein and Euclidean distances. Both measure the distance between two vectors (the Euclidean distance requires vectors of the same length). The Levenshtein distance is the minimum number of … It is a specialization of the edit distance, with the same weight for each operation. The Euclidean distance is the geometric distance between two vectors: the square root of the sum of the squares of the differences between corresponding elements of the two vectors.
  • Four metrics have been considered: STH is used to compare portions of client pages; AMA is used to compare server script blocks.
  • The metrics used to calculate the ASP Metrics Array are …
  • In the first phase, a static analysis is performed to obtain the data needed to compute the metrics: the vector of HTML tags, etc. In the second phase, each pair of pages, forms, blocks, functions, and tables is compared in search of clones or near-miss clones. In the third phase, the meaningfulness and completeness of the candidate clones are examined by a user. In the fourth phase, the constant control part of a cluster of clones (or near-miss clones) is isolated in a template. The structure of these templates is saved in a database repository.
  • We define a clone as a set of components with zero distance. We define a near-miss clone as a set of components whose distance is lower than a fixed, experimentally determined threshold.
  • The detected set of clones needs to be submitted to a validation activity that assesses whether they can actually be considered reusable components. Business domain knowledge and experience in building Web applications are required to carry out the validation activity.
  • Each set of validated clones is called a cluster. In the figure, on the left side, we see a client page whose structure was cloned. On the right side, we see its template, where each data element has been replaced with a dummy element.
  • We can reengineer a cluster of cloned components.
  • We considered only exact clones


  • 1. Identifying Reusable Components in Web Applications. P. Tramontana, U. De Carlini, A.R. Fasolino, Dipartimento di Informatica e Sistemistica, University of Naples Federico II, Italy. G.A. Di Lucca, RCOST – Research Centre on Software Technology, University of Sannio, Benevento, Italy
  • 2. Outline
      • Reusability in Web Applications
      • Clones
      • Clone Analysis Process
      • An experiment
  • 3. Web Applications (WA): problems and open issues
      • Basic software engineering principles, such as modularity, encapsulation, or separation of concerns, cannot be realized with some Web Application implementation technologies
      • Maintaining and evolving these systems are difficult and expensive tasks.
  • 4. Web Applications (WA): problems and open issues
      • The HTML language does not encourage the information content to be separated from its rendering
      • Server and client scripting languages don't encourage separation between Layout, Content, and Business Rules
  • 5. Web Applications (WA): problems and open issues
      • The practice of developing new applications by cloning existing pieces of code is frequently adopted
      • Web applications provide a valuable source of reusable software, which can be used with success to support their maintenance, reengineering, and evolution.
  • 6. Clone Definition
      Clones: duplicated or similar portions of code in software artifacts
      • What kind of portions of code?
      • How can similarity between these portions of code be measured?
  • 7. What kind of portions of code?
      Degree of granularity of a clone    Type of clone
      Web Page                            Client Page; Server Page
      Client Page inner components        Script Block; Script Function; HTML Form; HTML Table
      Server Page inner components        Script Block; Script Function
  • 8. Comparing source code
      • Levenshtein distance: minimum number of insertions, deletions, and replacements of elements necessary to make two vectors equal
      • Euclidean distance
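The two distances named on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the implementation used by the authors' tool: the function names `levenshtein` and `euclidean` and the sample tag lists are assumptions made for the example.

```python
import math
from typing import Sequence


def levenshtein(a: Sequence, b: Sequence) -> int:
    """Minimum number of insertions, deletions, and replacements turning a into b."""
    # previous[j] holds the distance between the processed prefix of a and b[:j].
    previous = list(range(len(b) + 1))
    for i, item_a in enumerate(a, start=1):
        current = [i]
        for j, item_b in enumerate(b, start=1):
            cost = 0 if item_a == item_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # replacement (or match)
            ))
        previous = current
    return previous[-1]


def euclidean(u: Sequence[float], v: Sequence[float]) -> float:
    """Square root of the sum of squared differences of corresponding elements."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(u, v)))


# Hypothetical tag sequences that differ in two elements.
tags_a = ["html", "body", "table", "/table", "/body", "/html"]
tags_b = ["html", "body", "form", "/form", "/body", "/html"]
print(levenshtein(tags_a, tags_b))   # 2
print(euclidean([0, 0], [3, 4]))     # 5.0
```

Every edit operation has the same weight here, so the Levenshtein distance works on any sequence of comparable elements, not only on character strings.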
  • 9. Distance between portions of code
      Technique   Description
      STH         Strings of HTML tags extracted from client pages are compared by the Levenshtein distance.
      CTH         Counts of HTML tags extracted from client pages are compared by the Euclidean distance.
      SOA         Strings of ASP built-in objects extracted from server pages are compared by the Levenshtein distance.
      AMA         An array of software metrics extracted from ASP script blocks in server pages is used to compare server pages by the Euclidean distance.
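As a rough illustration of the inputs that STH and CTH work on, the sketch below extracts the sequence of HTML tags from a client page and turns it into a count vector. The names `TagCollector`, `tag_sequence`, and `tag_count_vector`, the tag vocabulary, and the sample pages are all assumptions made for this example; the source does not describe how the tool extracts tags.

```python
import math
from collections import Counter
from html.parser import HTMLParser


class TagCollector(HTMLParser):
    """Records opening and closing tags in document order, ignoring text."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)

    def handle_endtag(self, tag):
        self.tags.append("/" + tag)


def tag_sequence(html):
    """STH-style input: the ordered list of tags in a page."""
    collector = TagCollector()
    collector.feed(html)
    collector.close()
    return collector.tags


def tag_count_vector(html, vocabulary):
    """CTH-style input: how often each tag of the vocabulary occurs."""
    counts = Counter(tag_sequence(html))
    return [counts[tag] for tag in vocabulary]


# Hypothetical pages: same structure, different data.
page_a = "<html><body><h1>Orders</h1><p>12 items</p></body></html>"
page_b = "<html><body><h1>Customers</h1><p>7 items</p></body></html>"
page_c = "<html><body><table><tr><td>1</td></tr></table></body></html>"

vocabulary = ["h1", "p", "table"]  # assumed; a real run would use all tags
print(tag_sequence(page_a) == tag_sequence(page_b))  # True: identical structure
print(math.dist(tag_count_vector(page_a, vocabulary),
                tag_count_vector(page_c, vocabulary)))
```

Tag sequences would then be compared with the Levenshtein distance (STH) and count vectors with the Euclidean distance (CTH), as the table states.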
  • 10. Metrics for AMA
      Metric   Description
      NTC      Total number of characters per page
      NTRC     Total number of lines of code per page
      NTBC     Total number of ASP code blocks per page
      NTSD     Total number of declarative statements per page
      NTSC     Total number of conditional statements per page
      NTSI     Total number of cyclic statements per page
      NTF      Total number of functions per page
      NTS      Total number of subroutines per page
      NTOP     Total number of built-in ASP objects per page
      LDBC     Levenshtein distance of ASP code blocks
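To give a feel for what a metrics array looks like, the sketch below computes five of the listed metrics for a small ASP page. The counting rules (regular expressions for `<% … %>` blocks, lines starting with `If`, and lines starting with `For`, `Do`, or `While`) are assumptions made for this example; the source does not say how the authors count each metric, and the function name `ama_vector` is hypothetical.

```python
import re

# Assumed delimiter for ASP code blocks.
ASP_BLOCK = re.compile(r"<%(.*?)%>", re.DOTALL)
# Assumed, simplified rules for VBScript conditional and cyclic statements.
CONDITIONAL = re.compile(r"^[ \t]*If\b", re.IGNORECASE | re.MULTILINE)
CYCLIC = re.compile(r"^[ \t]*(?:For|Do|While)\b", re.IGNORECASE | re.MULTILINE)


def ama_vector(page):
    """Return [NTC, NTRC, NTBC, NTSC, NTSI] under the assumptions above."""
    blocks = ASP_BLOCK.findall(page)
    code = "\n".join(blocks)
    return [
        len(page),                       # NTC: characters per page
        len(page.splitlines()),          # NTRC: lines per page
        len(blocks),                     # NTBC: ASP code blocks per page
        len(CONDITIONAL.findall(code)),  # NTSC: conditional statements
        len(CYCLIC.findall(code)),       # NTSI: cyclic statements
    ]


# Hypothetical ASP page used only to exercise the function.
sample_page = (
    "<html>\n"
    "<% If x > 1 Then %>\n"
    "<p>hi</p>\n"
    "<% End If %>\n"
    "<%\n"
    "For i = 1 To 3\n"
    "  Response.Write i\n"
    "Next\n"
    "%>\n"
    "</html>"
)
print(ama_vector(sample_page))
```

Two pages would then be compared by taking the Euclidean distance between their arrays, as described for the AMA technique.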
  • 11. Phases of Clone Analysis Process
      1. Separation of control and data component within Web pages
      2. Clone Detection
      3. Clone Validation
      4. Template Generation
      5. Repository population
  • 12. Separation of control and data component
      • Structure and data component are separated in HTML portions of code
      • ASP block code is transformed by removing blank characters and comments from the line, and substituting each data element (such as constants or identifiers) by a dummy one.
      • Consecutive blocks of ASP code will be merged in a single comprehensive block.
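The transformation of an ASP block described on this slide can be approximated with a small tokenizer. This is a sketch under stated assumptions, not the authors' tool: the keyword set is deliberately partial, only apostrophe comments are handled (not `Rem`), the dummy tokens `ID` and `CONST` are invented names, and merging of consecutive blocks is not covered.

```python
import re

# Assumed, partial list of VBScript keywords that belong to the control part.
KEYWORDS = {"if", "then", "else", "end", "for", "to", "next",
            "do", "loop", "while", "dim", "set", "function", "sub"}

TOKEN = re.compile(r'''
    (?P<string>"(?:[^"]|"")*")   # string literal; a doubled quote escapes a quote
  | (?P<comment>'[^\n]*)         # apostrophe comment up to the end of the line
  | (?P<number>\d+(?:\.\d+)?)    # numeric constant
  | (?P<word>[A-Za-z_]\w*)       # keyword or identifier
  | (?P<space>\s+)               # blanks
  | (?P<other>.)                 # operators and punctuation
''', re.VERBOSE)


def normalize_block(code):
    """Drop blanks and comments; replace constants and identifiers by dummies."""
    tokens = []
    for match in TOKEN.finditer(code):
        kind, text = match.lastgroup, match.group()
        if kind in ("comment", "space"):
            continue
        if kind in ("string", "number"):
            tokens.append("CONST")
        elif kind == "word":
            lowered = text.lower()
            tokens.append(lowered if lowered in KEYWORDS else "ID")
        else:
            tokens.append(text)
    return tokens


# Hypothetical blocks: same control structure, different data.
block_a = 'If total > 100 Then \' big order\n  Response.Write "Large"\nEnd If'
block_b = 'If count > 5 Then\n  Response.Write "Many"\nEnd If'
print(normalize_block(block_a) == normalize_block(block_b))  # True
```

Because identifiers are replaced wholesale, this sketch also hides built-in ASP objects such as `Response`; a version feeding the SOA technique would need to keep them.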
  • 13. Clone Detection
      • Metrics are calculated for each component
      • Clone analysis techniques are applied
      • Clones and near-miss clones are automatically detected with the support of the CloneDetector tool
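The detection step can be sketched as a pairwise comparison followed by grouping, using the definitions from the speaker notes: zero distance for clones, and a distance below a fixed threshold for near-miss clones. This is not the CloneDetector tool; the function names, the threshold value, and the sample count vectors are assumptions made for the example.

```python
import math
from itertools import combinations


def classify_pairs(components, distance, threshold):
    """Split all component pairs into exact clones and near-miss clones."""
    clones, near_misses = [], []
    for (name_a, feat_a), (name_b, feat_b) in combinations(components.items(), 2):
        d = distance(feat_a, feat_b)
        if d == 0:
            clones.append((name_a, name_b))
        elif d < threshold:
            near_misses.append((name_a, name_b))
    return clones, near_misses


def clusters(names, pairs):
    """Group components connected by the given pairs (simple union-find)."""
    parent = {name: name for name in names}

    def find(name):
        while parent[name] != name:
            parent[name] = parent[parent[name]]  # path halving
            name = parent[name]
        return name

    for a, b in pairs:
        parent[find(a)] = find(b)
    groups = {}
    for name in names:
        groups.setdefault(find(name), []).append(name)
    return [sorted(group) for group in groups.values() if len(group) > 1]


# Hypothetical tag-count vectors for four client pages.
pages = {
    "a.html": [2, 1, 0],
    "b.html": [2, 1, 0],
    "c.html": [2, 2, 0],
    "d.html": [9, 0, 4],
}
clones, near_misses = classify_pairs(pages, math.dist, threshold=1.5)
print(clones)                   # [('a.html', 'b.html')]
print(near_misses)              # [('a.html', 'c.html'), ('b.html', 'c.html')]
print(clusters(pages, clones))  # [['a.html', 'b.html']]
```

Passing the distance as a parameter lets the same loop serve any of the techniques, for example a Levenshtein function on tag sequences instead of `math.dist` on count vectors.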
  • 14. Clone Validation
      Clones (and near-miss clones) are assessed by an expert, considering:
      • meaningfulness (does the considered clone implement a meaningful abstraction in the business domain, or in the solution domain?)
      • completeness (does the clone include all the lines of code necessary for implementing a valid abstraction?).
      Clones may be discarded, merged, or accepted as reusable components.
  • 15. Template generation
      For each validated cluster of clones, a Template will be derived
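One way to picture a template is a page whose markup is kept while every piece of text data is replaced by a dummy element. The sketch below does this for HTML client pages only; the class name `TemplateBuilder`, the function `make_template`, and the placeholder `{{DATA}}` are assumptions made for the example, and attribute values are left untouched even though they may also carry data.

```python
from html.parser import HTMLParser


class TemplateBuilder(HTMLParser):
    """Rebuilds a page, keeping the tags and replacing text by a dummy element."""

    def __init__(self, dummy="{{DATA}}"):
        super().__init__(convert_charrefs=True)
        self.dummy = dummy
        self.parts = []

    def handle_starttag(self, tag, attrs):
        # Keep the start tag exactly as written, attributes included.
        self.parts.append(self.get_starttag_text())

    def handle_startendtag(self, tag, attrs):
        # Self-closing tags such as <br/> are kept as written.
        self.parts.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        self.parts.append(f"</{tag}>")

    def handle_data(self, data):
        # Whitespace-only text is layout, not data, so it is kept.
        self.parts.append(self.dummy if data.strip() else data)


def make_template(html):
    builder = TemplateBuilder()
    builder.feed(html)
    builder.close()
    return "".join(builder.parts)


# Hypothetical cloned pages: both collapse to the same template.
page_a = "<h1>Orders</h1><p>12 items</p>"
page_b = "<h1>Customers</h1><p>7 items</p>"
print(make_template(page_a))  # <h1>{{DATA}}</h1><p>{{DATA}}</p>
print(make_template(page_a) == make_template(page_b))  # True
```

For a cluster of exact clones, every member yields the same template, so deriving it from any one page of the cluster is enough in this simplified setting.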
  • 16. Repository population
      Template information is stored in a repository, together with component information.
      [Figure: class diagram of the repository. Classes: Template, CloneCluster, DataComponent, Dummy, WebPage, ControlComponent, Client, Server, HTMLTag, Statement, PageSubComponent, TagAttribute, Form, Table, Script, Function, Subroutine.]
  • 17. Experiment
      An experiment has been carried out with 4 real Web applications.
      We used the CloneDetector tool to search for exact clones of:
      • HTML pages (STH technique)
      • client scripts (STH technique)
      • functions in client scripts (STH technique)
      • forms (STH technique)
      • tables (STH technique)
      • ASP pages (AMA technique)
      • ASP script blocks (AMA technique)
      • functions/subroutines in ASP pages (AMA technique)
  • 18. Results
                                                        WA1   WA2   WA3   WA4
      # HTML pages                                       55    23    23     -
      # cloned HTML pages                                18     2     8     -
      # clusters of cloned HTML pages                     3     1     4     -
      # scripts in client pages                           3    12    10     -
      # cloned client scripts                             3     4     2     -
      # clusters of cloned client scripts                 1     2     1     -
      # functions in client pages                        12    10    10     -
      # cloned client functions                          12     4     2     -
      # clusters of cloned client functions               4     2     1     -
      # forms in client pages                             1     8    10     -
      # cloned client forms                               0     2     9     -
      # clusters of cloned client forms                   0     1     3     -
      # tables in client pages                            7     4     3     -
      # cloned client tables                              0     0     0     -
      # clusters of cloned client tables                  0     0     0     -
      # ASP pages                                        19    73    37    71
      # cloned ASP pages                                  2    15     3     2
      # clusters of cloned ASP pages                      1     6     1     1
      # scripts in ASP pages                             75   576   150  5341
      # cloned ASP scripts                                4     0    36     0
      # clusters of cloned ASP scripts                    2     0     3     0
      # functions/subroutines in ASP pages                5     0     3   165
      # cloned ASP functions/subroutines                  0     0     0     9
      # clusters of cloned ASP functions/subroutines      0     0     0     3
  • 19. Conclusions
      • A clone detection method has been presented
      • An experiment has been carried out
      • Many clusters of cloned artifacts have been detected and validated
      • Some reusable components have been abstracted as templates
      • Reengineering of static pages in an XML/XSL architecture may be performed
      • Reuse of components or pages may be performed