Open source evolution analysis


Published on

Published in: Technology
  • Be the first to comment

  • Be the first to like this

No Downloads
Total views
On SlideShare
From Embeds
Number of Embeds
Embeds 0
No embeds

No notes for slide

Open source evolution analysis

  1. 1. Open Source Evolution Analysis Izzat Alsmadi Kenneth Magel Department of computer science North Dakota state university {izzat.alsmadi, kenneth.magel} this model, the more we can tune its ABSTRACT assumptions and gain confident in its results. Source code analysis is important The paper analyzes the results of for software management. It enables us some source code charts from selected open recognize strengths and weaknesses in our source available projects. Those projects earlier projects or releases. We developed a will be studied through selected number of source code analysis tool. This tool gathers their releases. several metrics from C/C++, C# or Java source codes. In this paper, we will use the 2. RELATED WORK tool to analyze some of the open source code projects. We will study the selected projects Godfrey et al. studied the LOC release evolutions and compare some releases evolution for Linux[18]. characteristics between the same project Capiluppi et al. suggested that releases, as well as among different understandability decreases as time projects. Different programming language passes by, and focused on code and code and development styles will be studied through those open source projects. module sizes [19]. Stroulia et al. used CVSChecker to study temporal source General Terms code activities [20]. Marjanovic Source code analysis. proposed a meta model framework for Keywords code release history Systems [21]. Open source, code metrics. Scacchi studied the game open source development practices[22]. Raja et al. 1. INTRODUCTION explored some important software The knowledge gathered by characteristics that contribute to software metrics plays an important role consistent software quality in Linux in software management. This knowledge [23]. Moreno, et al. introduced Jeliot3 , can be used to build classification or as a visual source code evaluation proxy models that can be used toward tool[13]. Graver evaluated object future projects or releases. A software oriented refactoring process and metric tool helps us know the required evolution of a compiler[14]. information to build such models. The 3. SWMETRIC TOOL metrics that are gathered need to be SWMetric is a tool we developed to compiled to make some hypothesis and gather metrics on the function and the assumptions about the model. Using rules similar to those rules used in data mining class level. The tool can analyze code of classification models, the more we apply C/C++/C# or Java. It is originally22nd IEEE International Conference on Software Maintenance (ICSM06)0-7695-2354-4/06 $20.00 © 2006
  2. 2. developed as part of a student research size is used for declarations, method project for Honeywell aviation division. headers and global variable. 3. Release vs. LOC/function 4. GOALS AND APPROACHES Typically, programs should have a fixed size of functions of no more than 1. Study the relation between project (20-30) LOC’s[9]. releases and LOC size variation. We first studied how different projects LOC Some of the open source projects may have been developed by different changes with the releases. individuals. An increase of the functions sizes with time indicates problems in In traditional approaches, we planning that are causing functions to should not see much of new lines of expand or inflate. Of the studied projects, codes produced with the recent releases. two show a relatively fixed amount of The diagram, however, shows an LOC/function (~ 20) which may indicate instability of the amount of new LOC stable coding, better predictions and produced, most of the projects tested, management. The diagram shows a good show an increase in a release and a percent of projects that had and sustained a decrease in the next one. However, this fixed LOC/function. is expected in the case of open source 4. MCDC and nesting per function projects where there is much of MCDC and nesting are indications of how many decisions and nodes are there instability in terms of developers, their in each function. These are important values abilities, available time and other for software testing. resources. 5. Code distribution. We studied code 2. Study the LOC efficiency and distribution through dividing code into three Declaration Percent of the total code parts : comment lines, declaration and global size. Most of the projects studied have a variable lines, and the last part is the rest of steady LOC efficiency. This efficiency is the source code. almost the same for the different projects and it also does not vary for most of the 5.CONCLUSION AND FUTURE WORK time with releases progress. The We are willing to explore more efficiency percentage for most projects of the available open source codes in is between 70-80 %. order to be able to make some hypotheses and be able to build software Initially, in Software automation classification models. We will make development scenarios, we should have more detailed studies to compare a low efficiency, where most of the lines individual to company coding styles, and are counted in LOC , but not in SLOC, the common or different characteristics as they are automatically generated by a of coding among the different code generation tool. We should have an programming languages. increase for the efficiency with releases, 6. REFERENCES or increase for SLOC lines from the total 1. Free Software Foundation, 2006, 15 May LOC lines. 2006. <>. 2. Open, 02-2006, 15 May The Declaration percentage of 2006.>. data declarations to the total code size 3. Sun Microsystems, 05-2006. indicates how much of the total code <>.22nd IEEE International Conference on Software Maintenance (ICSM06)0-7695-2354-4/06 $20.00 © 2006