NEW TRENDS IN LEARNING FOR
SOFTWARE ENGINEERING
Alaa Hamouda
Department of Computer Engineering,
Engineering Faculty,
Al-Azhar University, Egypt
1
Agenda
• Introduction
• Software Engineering Phases
• Machine Learning Overview
• Applications of ML in SWE with each process:
– Project Planning
– Requirements
– Design
– Implementation
– Testing
– Maintenance
• Conclusion
2
Problem Definition
• There is a need to meet the challenge of
developing and maintaining large and complex
software systems.
• Machine learning methods have been playing
an increasingly important role in many
software development and maintenance
tasks.
3
SWE Phases
4
Overview of ML
• Machine learning methods fall into the following
broad categories: supervised learning and
unsupervised learning. Supervised learning deals
with learning a target function from labeled
examples. Unsupervised learning attempts to
learn patterns and associations from a set of
objects that do not have attached class labels.
• Supervised learning can be divided into eager and
lazy classifiers
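The eager/lazy distinction can be illustrated with a minimal sketch. The data set, the one-feature decision stump (eager), and the nearest-neighbour vote (lazy) below are illustrative assumptions, not examples taken from the slides.

```python
from collections import Counter

# Hypothetical labeled examples: (feature value, class label).
TRAIN = [(1.0, "no"), (2.0, "no"), (3.0, "no"),
         (6.0, "yes"), (7.0, "yes"), (8.0, "yes")]


def train_stump(examples):
    """Eager learner: build the model (a threshold) once, at training time."""
    values = sorted(x for x, _ in examples)
    best_threshold, best_errors = None, len(examples) + 1
    for low, high in zip(values, values[1:]):
        threshold = (low + high) / 2
        # Count examples that the rule "x > threshold means yes" gets wrong.
        errors = sum((x > threshold) != (label == "yes") for x, label in examples)
        if errors < best_errors:
            best_threshold, best_errors = threshold, errors
    return best_threshold


def stump_predict(threshold, x):
    """Apply the trained threshold; the training data is no longer needed."""
    return "yes" if x > threshold else "no"


def knn_predict(examples, x, k=3):
    """Lazy learner: no training step; consult stored examples at query time."""
    nearest = sorted(examples, key=lambda ex: abs(ex[0] - x))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]
```

The eager learner pays its cost up front and answers queries from the compact model; the lazy learner defers all work until a query arrives and must keep the training examples around.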
5
Overview of ML
6
Overview of ML
7
8
The loan data (reproduced)
Approved or not
9
A decision tree from the loan data
Decision nodes and leaf nodes (classes)
10
Project Planning
• Reported statistics put the failure rate of
software projects at about 70%.
• Cost overruns of 189% have been reported.
• Research indicates that inaccurate
estimation is the root cause of most
software project failures.
11
Size Estimation
• Size -- Effort - Cost
• twenty-eight out of the collected sixty
publications (almost 47%) deal with the issue
of how to build models to predict or estimate
certain property of software development
process or artifacts.
12
Function Point
13
Internal Logical File (ILF): A file accessed and maintained by the application
under development.
External Interface File (EIF): A file accessed by the processing logic but
maintained by another application.
External Input (EI): An elementary process that processes data coming from
outside the application boundary.
– Maintains an ILF.
External Output (EO): An elementary process that sends data outside the
application boundary.
– Presents information to the user through processing logic in addition to
retrieval of data.
External Inquiry (EQ): An elementary process that sends data outside the
application boundary.
– Presents information to the user through retrieval of data from an ILF or EIF.
– Involves no data manipulation or processing logic.
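As a hedged illustration of how these five component types turn into a size figure, the sketch below computes an unadjusted function point count. The weights are the average-complexity weights from the IFPUG counting practice; the slides do not list weights, so using them here is an assumption, and the component counts in the usage note are hypothetical.

```python
# Average-complexity weights per component type (IFPUG counting practice).
# Assumption: the slides give no weights, so the "average" column is used.
AVERAGE_WEIGHTS = {"EI": 4, "EO": 5, "EQ": 4, "ILF": 10, "EIF": 7}


def unadjusted_function_points(counts, weights=AVERAGE_WEIGHTS):
    """Sum count * weight over the component types present in `counts`."""
    unknown = set(counts) - set(weights)
    if unknown:
        raise ValueError(f"unknown component types: {sorted(unknown)}")
    return sum(weights[kind] * number for kind, number in counts.items())
```

For a hypothetical application with 6 external inputs, 4 external outputs, 3 external inquiries, 2 internal logical files, and 1 external interface file, `unadjusted_function_points({"EI": 6, "EO": 4, "EQ": 3, "ILF": 2, "EIF": 1})` adds 24 + 20 + 12 + 20 + 7.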
Size estimation (Cont’)
Input:
• Function points
• Project domains
• Number of component types:
– Number of menu components
– Number of input components
– Number of output components
ML Algorithm:
• Neural Network
Output:
• LOC to be fed to the cost estimation stage
14
Size estimation (Cont’)
15
Effort Estimation
Input:
• Lines of code (generated by the size estimation)
• Scale factors
• Cost Drivers
Algorithm:
• Fuzzy Inference Engine
Output:
• Estimated efforts (e.g. man-hours)
16
Inputs (scale factors)
Factor: Explanation
Precedentedness (PREC): Reflects the previous experience of the organization.
Development Flexibility (FLEX): Reflects the degree of flexibility in the
development process.
Risk Resolution (RESL): Reflects the extent of risk analysis carried out.
Team Cohesion (TEAM): Reflects how well the members of the development team
know each other and work together.
Process Maturity (PMAT): Reflects the process maturity of the organization.
17
Factor: Explanation
LOC: Lines of code
Inputs (Cost Drivers)
Attribute  Type       Description
RELY       Product    Required system reliability
CPLX       Product    Complexity of system modules
DOCU       Product    Extent of documentation required
DATA       Product    Size of database used
RUSE       Product    Required percentage of reusable components
TIME       Computer   Execution time constraint
PVOL       Computer   Volatility of development platform
STOR       Computer   Memory constraints
ACAP       Personnel  Capability of project analysts
PCON       Personnel  Personnel continuity
PCAP       Personnel  Programmer capability
PEXP       Personnel  Programmer experience in project domain
AEXP       Personnel  Analyst experience in project domain
LTEX       Personnel  Language and tool experience
TOOL       Project    Use of software tools
SCED       Project    Development schedule compression
SITE       Project    Extent of multisite working and quality of inter-site communications
18
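The scale factors and cost drivers listed above are the inputs of the COCOMO II model. As a point of reference, the sketch below evaluates the crisp COCOMO II effort equation, PM = A × KLOC^E × ∏EM with E = B + 0.01 × ΣSF, using the COCOMO II.2000 constants A = 2.94 and B = 0.91. This is a baseline under stated assumptions, not the fuzzy inference engine the slides describe, and it yields person-months rather than man-hours.

```python
import math

A = 2.94  # COCOMO II.2000 multiplicative calibration constant
B = 0.91  # COCOMO II.2000 base exponent


def cocomo2_effort(kloc, scale_factors, effort_multipliers):
    """Return estimated effort in person-months.

    kloc: size in thousands of lines of code.
    scale_factors: mapping such as {"PREC": ..., "FLEX": ...} of numeric ratings.
    effort_multipliers: mapping such as {"RELY": ..., "CPLX": ...} of multipliers.
    """
    exponent = B + 0.01 * sum(scale_factors.values())
    # math.prod of an empty collection is 1, i.e. all drivers nominal.
    return A * kloc ** exponent * math.prod(effort_multipliers.values())
```

Because the scale factors sit in the exponent, they matter more as the project grows, while each cost driver scales the estimate linearly. A fuzzy variant would replace the crisp rating values with membership functions over the same inputs.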
Using Fuzzy Logic
19
Effort Estimation directly from UCP
In the previous method:
• FP (size) -> LOC (size) -> Effort
Another method:
• UCP (size) -> Effort (directly)
20
Effort Estimation
21
Use Case Point Calculation
22
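A minimal sketch of the Use Case Point calculation in its commonly published form (Karner's method). The actor and use-case weights and the two adjustment formulas are the standard ones from that method and are assumed to match what the slide showed; the function and parameter names are hypothetical.

```python
# Standard Karner weights by complexity class (assumed, not from the slides).
ACTOR_WEIGHTS = {"simple": 1, "average": 2, "complex": 3}
USE_CASE_WEIGHTS = {"simple": 5, "average": 10, "complex": 15}


def use_case_points(actors, use_cases, t_factor, e_factor):
    """Compute UCP = (UAW + UUCW) * TCF * ECF.

    actors, use_cases: mappings from complexity class to a count.
    t_factor: weighted sum of the technical factor ratings.
    e_factor: weighted sum of the environmental factor ratings.
    """
    uaw = sum(ACTOR_WEIGHTS[kind] * n for kind, n in actors.items())
    uucw = sum(USE_CASE_WEIGHTS[kind] * n for kind, n in use_cases.items())
    tcf = 0.6 + 0.01 * t_factor   # technical complexity factor
    ecf = 1.4 - 0.03 * e_factor   # environmental complexity factor
    return (uaw + uucw) * tcf * ecf
```

Effort then follows by multiplying the UCP value by a productivity figure (hours per use case point), which is the quantity a learned model can estimate instead of using a fixed constant.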
Productivity
23
Project Complexity
• Level 1: The project team is familiar with this type of
project and has developed similar projects in the past.
The number and type of interfaces are simple. The
project will be installed under normal conditions, where
high security or safety factors are not required.
Moreover, about 20% of the design or implementation of
a Level 1 project is reused from earlier, similar projects.
• Level 2: Similar to Level 1, except that only about 10%
of the project is reused.
24
Project Complexity (Cont’d)
• Level 3: The technology, interfaces, and installation conditions
are normal, but no part of the project has been previously
designed or implemented.
• Level 4: The project must be installed on a complicated
topology or architecture, such as a distributed system.
Moreover, the number of variables and interfaces is large.
• Level 5: Similar to Level 4, but with additional constraints
such as a special type of security or high safety factors.
25
Effort Estimation
26
Effort Estimation (Cont’d)
The results show that the proposed ANN model
outperforms:
• Regression models by 8%
• UCP models by 50%
27
Requirements Analysis
29
Business Analysis  System Analysis
Requirements Analysis
30
Requirements Analysis
31
Requirements Analysis
Lexicon   Phase I   Phase II
User      Noun      Actor
fills     Verb      Action
the       Article   -------
form      Noun      Object
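The two phases in the table can be sketched as a small mapping from part-of-speech tags to requirement roles. The rule used here (a noun before the verb is the actor, a noun after it is the object, articles are dropped) is inferred from the single example in the table and is an assumption; the part-of-speech tags are taken as given rather than produced by a tagger.

```python
def assign_roles(tagged_tokens):
    """Map (word, part-of-speech) pairs from Phase I to (word, role) pairs."""
    roles = []
    seen_verb = False
    for word, tag in tagged_tokens:
        if tag == "Verb":
            roles.append((word, "Action"))
            seen_verb = True
        elif tag == "Noun":
            # Before the verb a noun acts; after the verb it is acted upon.
            roles.append((word, "Object" if seen_verb else "Actor"))
        # Articles and any other tags carry no role in this sketch.
    return roles


phase_one = [("User", "Noun"), ("fills", "Verb"), ("the", "Article"), ("form", "Noun")]
phase_two = assign_roles(phase_one)
```

Real requirement sentences need a proper tagger and richer rules (passive voice, multiple clauses), so this only mirrors the table's example.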
32
Requirements
• Reverse engineering applies where legacy systems are
critical to the operation of the organization that uses
them and must still be maintained.
• Most legacy systems were developed before software
engineering techniques were widely used. Thus they
may be poorly structured, and their documentation
may be out of date or non-existent.
• To make maintenance of a legacy system feasible, the
first task is to recover its design or specification from
its source or executable code.
33
Design
1. Finding Fault Prone components for reuse
2. UI Design
35
Components Re-use
• Software quality classification models can be
used to indicate which program modules are
fault-prone (FP) and which are not fault-prone (NFP).
• These models can be used to select the best
candidate modules for reuse.
36
Components Re-use
Attribute
U_1 Number of unique operators
N_1 Total number of operators
U_2 Number of unique operands
N_2 Total number of operands
V(G) McCabe’s cyclomatic complexity
N_L Number of logical operators
LOC Lines of code
ELOC Executable lines of code
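The first four attributes are the Halstead base counts, from which the usual derived measures follow. The sketch below uses the standard Halstead definitions; whether the fault-proneness models described here feed in the raw counts or the derived measures is not stated in the slides, so treating the derived values as features is an assumption.

```python
import math


def halstead(u1, n1, u2, n2):
    """Derive Halstead measures from the base counts.

    u1, u2: unique operators and operands (U_1, U_2).
    n1, n2: total operators and operands (N_1, N_2).
    """
    vocabulary = u1 + u2
    length = n1 + n2
    volume = length * math.log2(vocabulary)
    difficulty = (u1 / 2) * (n2 / u2)
    return {
        "vocabulary": vocabulary,
        "length": length,
        "volume": volume,
        "difficulty": difficulty,
        "effort": difficulty * volume,
    }
```

A module's feature vector for an FP/NFP classifier could then combine these values with V(G), N_L, LOC, and ELOC.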
37
User Interface Design
• Learnability is an important aspect of usability.
• Users lose up to 40% of their time to
“frustrating experiences” with computers.
One of the most common causes of these
frustrations is missing, hard-to-find, or
unusable features of the software.
38
User Interface Design
• Nielsen describes a highly learnable system
as one “allowing users to
reach a reasonable level of usage proficiency
within a short time”.
• A web usage map is mined using Label
Sequential Rules.
39
User Interface Design
40
Implementation
• Implementation is a core process in the software
engineering life cycle.
• One of the challenges in this phase is
modularization (or remodularization).
• Genetic algorithms have been successfully used
to address this problem.
• The objective is to improve the modularization
quality (MQ). All versions of MQ combine
cohesion and coupling into a single weighted
fitness function.
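One commonly cited MQ formulation (often called TurboMQ) can be sketched as follows: each module i gets a cluster factor CF_i = 2μ_i / (2μ_i + Σ_j(ε_ij + ε_ji)), where μ_i counts dependencies inside the module and ε counts dependencies crossing its boundary, and MQ is the sum of the cluster factors. The slides do not say which MQ variant is meant, so this choice and the unweighted-edge input format are assumptions.

```python
def modularization_quality(edges, module_of):
    """Score a partition of entities into modules (higher is better).

    edges: iterable of (source, target) dependencies between entities.
    module_of: mapping from entity name to module name.
    """
    intra = {module: 0 for module in set(module_of.values())}  # cohesion
    inter = dict(intra)                                        # coupling
    for source, target in edges:
        a, b = module_of[source], module_of[target]
        if a == b:
            intra[a] += 1
        else:
            # A crossing edge counts against both modules it touches.
            inter[a] += 1
            inter[b] += 1
    return sum(
        2 * intra[m] / (2 * intra[m] + inter[m])
        for m in intra
        if intra[m] > 0  # a module with no internal edges contributes 0
    )
```

A genetic algorithm would use this value as the fitness of a candidate partition, mutating and recombining the `module_of` assignment to search for higher scores.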
42
Implementation (Cont’d)
• Clustering has also been applied to package
coupling, to reduce overall package size and to
explore the relationship between design and
code level software structure.
• Additional objectives might include closeness to
original module structure, business goals,
technical constraints, testability, and other
metrics that may be important in finding a good
module structure.
43
Implementation (Cont’d)
44
Implementation (Cont’d)
• Refactoring rewrites existing source code to
improve its readability, reusability, or
structure without affecting its meaning or
behavior.
• Project managers want to know which
locations are likely to demand refactoring.
Refactoring improves the understandability of the
code, but it also costs development time.
45
Implementation (Cont’d)
• Researchers mine evolution data from the
version control systems of open source projects.
• ArgoUML and the Spring framework are
examples developed in Java; they consist of
about 5000 and 10000 classes, respectively.
• In Java each class is usually placed in a separate
file, so the researchers treat files as equivalent
to classes and focus on files in their analysis.
46
Implementation (Cont’d)
The features used can be divided into different
categories:
• Size
This category contains size measures, such as
lines of code, from an evolution perspective:
linesAdded, linesModified, or linesDeleted
relative to the total LOC (lines of code) of a file.
47
Implementation (Cont’d)
• Team
The number of authors of a file influences the way
software is developed. The more authors work on
the changes, the higher the expected probability of
rework and mistakes.
• Complexity of existing solution
According to the laws of software evolution, software
continuously becomes more complex. Changes become
harder to add because the software is harder to
understand and the contracts between existing parts
have to be retained. The researchers therefore examine
changeCount in relation to the number of changes during
the entire history of each file.
48
Implementation (Cont’d)
• New Requirements
In software development projects, new classes are
usually added to object-oriented systems when new
requirements have to be satisfied. The researchers
record whether a file was newly introduced during
the prediction period.
• Relational Aspects
The most important features in this category are
couplings, such as the number of changes/revisions
in which a file was committed together with other
files.
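A minimal sketch of how features from these categories might be derived from a commit history. The commit record layout and the names `linesAddedRatio`, `authorCount`, and `coChangedFiles` are hypothetical; only `linesAdded` and `changeCount` are named in the slides. The sketch assumes every file has a positive LOC value.

```python
from collections import defaultdict


def file_features(commits, loc):
    """Compute per-file evolution features.

    commits: list of {"author": str, "files": {path: lines_added}} records.
    loc: mapping from path to the file's total lines of code (must be > 0).
    """
    added = defaultdict(int)
    authors = defaultdict(set)
    change_count = defaultdict(int)
    co_changed = defaultdict(int)
    for commit in commits:
        paths = commit["files"]
        for path, lines_added in paths.items():
            added[path] += lines_added
            authors[path].add(commit["author"])
            change_count[path] += 1
            # Other files committed together with this one (relational aspect).
            co_changed[path] += len(paths) - 1
    return {
        path: {
            "linesAddedRatio": added[path] / loc[path],  # size, relative to LOC
            "authorCount": len(authors[path]),           # team
            "changeCount": change_count[path],           # complexity of solution
            "coChangedFiles": co_changed[path],          # relational aspect
        }
        for path in change_count
    }
```

Each resulting dictionary is one row of the training data; the label would be whether the file was refactored in the prediction period.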
49
Implementation (Cont’d)
• With the described features, the number of
refactorings is predicted
50
Implementation (Cont’d)
• Decision trees and neural networks are used as
classifiers.
• The F-measure was about 65%.
• Several features, such as the line activity rate
and the number of lines altered per commit,
provide much information for the
assessment of refactorings.
• The structure of the system is also crucial for
refactorings: the number of co-changed files
and the number of files introduced during
maintenance are relevant features.
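For reference, the F-measure reported above is the harmonic mean of precision and recall. The sketch below shows the standard computation from a confusion matrix; the counts in the usage note are hypothetical and not taken from the study.

```python
def f_measure(true_positives, false_positives, false_negatives):
    """Harmonic mean of precision and recall; 0.0 when nothing is found."""
    predicted = true_positives + false_positives
    actual = true_positives + false_negatives
    if true_positives == 0 or predicted == 0 or actual == 0:
        return 0.0
    precision = true_positives / predicted
    recall = true_positives / actual
    return 2 * precision * recall / (precision + recall)
```

With hypothetical counts of 13 correctly flagged files, 7 false alarms, and 7 missed files, precision and recall are both 0.65, and so is the F-measure.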
51
Testing
• Software quality models help ensure the
reliability of the delivered products.
• Early detection of fault-prone software
components enables verification experts to
concentrate their time and resources on the
problem areas of the software system under
development.
• Accurate prediction of fault-prone modules
lets verification and validation activities
focus on the critical software components.
53
Testing (Cont’d)
54
Testing (Cont’d)
• Decision trees correctly predicted 79.3% of
high-development-effort fault-prone modules
(detection rate), while the trees generated
from the best parameter combinations
correctly identified 88.4% of those modules
on average.
55
Maintenance
• Software maintenance is widely recognized to be
the most expensive and time-consuming aspect
of the software process.
• A relevance relation maps a tuple of system
elements to a value indicating how related they
are.
• Software change repositories reflect the
history of the system, which includes actions that
result in the creation of new relationships and the
strengthening of existing relationships in the
software.
57
Maintenance (Cont’d)
58
Maintenance (Cont’d)
• Software entities include documents, source files,
routines, modules, variables, and even the entire
software system.
• A relevance relation is a predictor that maps
tuples of two or more software entities to a value
r quantifying how relevant, that is, connected or
related, the entities are to each other.
• r shows the strength of relevance among the
entities.
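One simple way to obtain such a value r from a change repository is to count how often two entities change together. The sketch below uses co-change frequency as r; the slides do not say how the relevance relation is learned, so this measure and the input format are assumptions.

```python
from collections import Counter
from itertools import combinations


def co_change_relevance(change_sets):
    """Map each unordered pair of entities to r in [0, 1].

    change_sets: list of collections, each holding the entities (for example
    file names) modified together in one recorded change.
    r is the fraction of all change sets that touch both entities.
    """
    pair_counts = Counter()
    for changed in change_sets:
        for pair in combinations(sorted(set(changed)), 2):
            pair_counts[pair] += 1
    total = len(change_sets)
    return {pair: count / total for pair, count in pair_counts.items()}
```

A maintainer about to edit one file could then be shown the files with the highest r relative to it. A learned relevance relation would replace the raw frequency with a classifier trained on such pairs.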
59
Maintenance (Cont’d)
60
Maintenance Effort Prediction
• If predictions are based on formal software
development effort prediction models, such as the
estimation part of Function Point Analysis, essential
differences between software development and
software maintenance are neglected.
• The focus of software development is the creation of
software, whereas the focus of software maintenance is
the change of software.
• The development of a software application is typically a
one-of-a-kind project, but the maintenance activities on an
application usually comprise a large number of tasks
carried out over a long period of time in a relatively stable
environment.
62
Maintenance Effort Prediction
• Researchers collected data on:
– 109 randomly selected maintenance tasks
– 70 applications
– The size of the applications varied from a few
thousand lines of code (LOC) to about 500,000 LOC
– The age of the applications varied from less than a
year to more than 20 years
– The functions of the applications included payroll,
order entry, billing and invoicing, inventory control,
service management, and personnel administration.
63
Maintenance Effort Prediction
The following data was collected for each maintenance task:
• Type of maintenance task, i.e., corrective or perfective.
• Priority of task, i.e., high, medium or low priority.
• Maintainer’s knowledge and confidence about how to solve
the task immediately after having read or heard the task
specification.
• Years of experience as maintainer, and on the maintained
application.
• Education level of the maintainer.
• Work-hours (effort) spent on the task.
• Task size and the programming language
• Age and size of the changed application.
64
Maintenance Effort Prediction
Most important features:
• Cause: Corrective maintenance = 0, otherwise = 1
• Change: More than 50% of the effort is believed to be
spent on updating code rather than inserting or
deleting code = 0, otherwise = 1
• Mode: More than 50% of the effort is believed to be
spent on development of new modules (New module
mode) = 0, otherwise (Embedded mode) = 1
• Confidence: The maintainer believes they know how to
solve the task when the task specification is first
read or heard = 0 (High confidence), otherwise = 1
(Medium or low confidence).
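The four binary features above can be encoded directly as model inputs. The sketch below follows the 0/1 coding given in the list; the function and parameter names are hypothetical.

```python
def encode_task(corrective, mostly_updates, mostly_new_modules, high_confidence):
    """Encode one maintenance task as the four binary features.

    Each argument is a bool answering the condition that maps to 0 in the
    coding above; any other case maps to 1.
    """
    return {
        "Cause": 0 if corrective else 1,
        "Change": 0 if mostly_updates else 1,
        "Mode": 0 if mostly_new_modules else 1,
        "Confidence": 0 if high_confidence else 1,
    }
```

For example, a perfective task that mostly updates existing code in embedded mode, taken on by a confident maintainer, is encoded by `encode_task(False, True, False, True)`.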
65
Maintenance Effort Prediction
Less influential features:
• Type of language
• Maintainer experience
• Task priority
• Application age
• Application size
66
Maintenance Effort Prediction
• Neural network and regression were used as
approaches for effort prediction.
• The prediction accuracy was acceptable (error of 60%).
• A recommended use of an effort prediction model is,
therefore, to support the expert predictions.
• Another important use of a formal prediction model
may be to support the collection and analysis of
maintenance data in order to enable improvement of
the maintenance process and product.
67
Open Problems
• Most of the presented work is immature, and
many related issues are still open.
• Machine learning can help in the
requirements engineering phase in developing
knowledge based systems and ontologies to
manage the requirements and model problem
domains
68
Open Problems (Cont’d)
• One of the most difficult problems is the
problem of transforming requirements into
architectures. Much research is needed in this
area to address the ever increasing complexity
of functional and non-functional
requirements.
69
Open Problems (Cont’d)
• One area that has received some attention is
the use of automated algorithms with
machine learning to make repair assignments.
• In any case, more studies with respect to the
appropriate criteria for selecting assignment
policy, reward mechanisms and management
goals need to be undertaken.
70
Conclusion
• The existing work certainly proves that the
field of software engineering is a fertile
ground for the application of machine learning
methods.
• It is clear that there is an increased interest in
the niche area of machine learning and
software engineering.
71
Conclusion (cont’d)
• The strength of machine learning methods lies
in the fact that they have sound mathematical
and logical justifications
• The power of machine learning methods does
not come from a particular induction method,
but instead from proper formulation of the
problems and from crafting the representation
to make learning tractable.
72
Conclusion (cont’d)
• Machine learning can play a useful role in the
different phases of software engineering: project
planning, requirements analysis, design,
implementation, testing, and even maintenance.
• Interest in applying machine learning to
software engineering tasks is expected to
increase significantly, especially with the
increased interest in empirical software
engineering.
73
Thank you very much
74

Robbie Edward Sayers
 
CFD Simulation of By-pass Flow in a HRSG module by R&R Consult.pptx
CFD Simulation of By-pass Flow in a HRSG module by R&R Consult.pptxCFD Simulation of By-pass Flow in a HRSG module by R&R Consult.pptx
CFD Simulation of By-pass Flow in a HRSG module by R&R Consult.pptx
R&R Consult
 
Nuclear Power Economics and Structuring 2024
Nuclear Power Economics and Structuring 2024Nuclear Power Economics and Structuring 2024
Nuclear Power Economics and Structuring 2024
Massimo Talia
 
Immunizing Image Classifiers Against Localized Adversary Attacks
Immunizing Image Classifiers Against Localized Adversary AttacksImmunizing Image Classifiers Against Localized Adversary Attacks
Immunizing Image Classifiers Against Localized Adversary Attacks
gerogepatton
 
The role of big data in decision making.
The role of big data in decision making.The role of big data in decision making.
The role of big data in decision making.
ankuprajapati0525
 
Hybrid optimization of pumped hydro system and solar- Engr. Abdul-Azeez.pdf
Hybrid optimization of pumped hydro system and solar- Engr. Abdul-Azeez.pdfHybrid optimization of pumped hydro system and solar- Engr. Abdul-Azeez.pdf
Hybrid optimization of pumped hydro system and solar- Engr. Abdul-Azeez.pdf
fxintegritypublishin
 
Gen AI Study Jams _ For the GDSC Leads in India.pdf
Gen AI Study Jams _ For the GDSC Leads in India.pdfGen AI Study Jams _ For the GDSC Leads in India.pdf
Gen AI Study Jams _ For the GDSC Leads in India.pdf
gdsczhcet
 
power quality voltage fluctuation UNIT - I.pptx
power quality voltage fluctuation UNIT - I.pptxpower quality voltage fluctuation UNIT - I.pptx
power quality voltage fluctuation UNIT - I.pptx
ViniHema
 
Architectural Portfolio Sean Lockwood
Architectural Portfolio Sean LockwoodArchitectural Portfolio Sean Lockwood
Architectural Portfolio Sean Lockwood
seandesed
 
Top 10 Oil and Gas Projects in Saudi Arabia 2024.pdf
Top 10 Oil and Gas Projects in Saudi Arabia 2024.pdfTop 10 Oil and Gas Projects in Saudi Arabia 2024.pdf
Top 10 Oil and Gas Projects in Saudi Arabia 2024.pdf
Teleport Manpower Consultant
 
Runway Orientation Based on the Wind Rose Diagram.pptx
Runway Orientation Based on the Wind Rose Diagram.pptxRunway Orientation Based on the Wind Rose Diagram.pptx
Runway Orientation Based on the Wind Rose Diagram.pptx
SupreethSP4
 
DESIGN A COTTON SEED SEPARATION MACHINE.docx
DESIGN A COTTON SEED SEPARATION MACHINE.docxDESIGN A COTTON SEED SEPARATION MACHINE.docx
DESIGN A COTTON SEED SEPARATION MACHINE.docx
FluxPrime1
 
J.Yang, ICLR 2024, MLILAB, KAIST AI.pdf
J.Yang,  ICLR 2024, MLILAB, KAIST AI.pdfJ.Yang,  ICLR 2024, MLILAB, KAIST AI.pdf
J.Yang, ICLR 2024, MLILAB, KAIST AI.pdf
MLILAB
 
H.Seo, ICLR 2024, MLILAB, KAIST AI.pdf
H.Seo,  ICLR 2024, MLILAB,  KAIST AI.pdfH.Seo,  ICLR 2024, MLILAB,  KAIST AI.pdf
H.Seo, ICLR 2024, MLILAB, KAIST AI.pdf
MLILAB
 
ML for identifying fraud using open blockchain data.pptx
ML for identifying fraud using open blockchain data.pptxML for identifying fraud using open blockchain data.pptx
ML for identifying fraud using open blockchain data.pptx
Vijay Dialani, PhD
 
Water Industry Process Automation and Control Monthly - May 2024.pdf
Water Industry Process Automation and Control Monthly - May 2024.pdfWater Industry Process Automation and Control Monthly - May 2024.pdf
Water Industry Process Automation and Control Monthly - May 2024.pdf
Water Industry Process Automation & Control
 
ethical hacking-mobile hacking methods.ppt
ethical hacking-mobile hacking methods.pptethical hacking-mobile hacking methods.ppt
ethical hacking-mobile hacking methods.ppt
Jayaprasanna4
 
Final project report on grocery store management system..pdf
Final project report on grocery store management system..pdfFinal project report on grocery store management system..pdf
Final project report on grocery store management system..pdf
Kamal Acharya
 

Recently uploaded (20)

Design and Analysis of Algorithms-DP,Backtracking,Graphs,B&B
Design and Analysis of Algorithms-DP,Backtracking,Graphs,B&BDesign and Analysis of Algorithms-DP,Backtracking,Graphs,B&B
Design and Analysis of Algorithms-DP,Backtracking,Graphs,B&B
 
English lab ppt no titlespecENG PPTt.pdf
English lab ppt no titlespecENG PPTt.pdfEnglish lab ppt no titlespecENG PPTt.pdf
English lab ppt no titlespecENG PPTt.pdf
 
HYDROPOWER - Hydroelectric power generation
HYDROPOWER - Hydroelectric power generationHYDROPOWER - Hydroelectric power generation
HYDROPOWER - Hydroelectric power generation
 
CFD Simulation of By-pass Flow in a HRSG module by R&R Consult.pptx
CFD Simulation of By-pass Flow in a HRSG module by R&R Consult.pptxCFD Simulation of By-pass Flow in a HRSG module by R&R Consult.pptx
CFD Simulation of By-pass Flow in a HRSG module by R&R Consult.pptx
 
Nuclear Power Economics and Structuring 2024
Nuclear Power Economics and Structuring 2024Nuclear Power Economics and Structuring 2024
Nuclear Power Economics and Structuring 2024
 
Immunizing Image Classifiers Against Localized Adversary Attacks
Immunizing Image Classifiers Against Localized Adversary AttacksImmunizing Image Classifiers Against Localized Adversary Attacks
Immunizing Image Classifiers Against Localized Adversary Attacks
 
The role of big data in decision making.
The role of big data in decision making.The role of big data in decision making.
The role of big data in decision making.
 
Hybrid optimization of pumped hydro system and solar- Engr. Abdul-Azeez.pdf
Hybrid optimization of pumped hydro system and solar- Engr. Abdul-Azeez.pdfHybrid optimization of pumped hydro system and solar- Engr. Abdul-Azeez.pdf
Hybrid optimization of pumped hydro system and solar- Engr. Abdul-Azeez.pdf
 
Gen AI Study Jams _ For the GDSC Leads in India.pdf
Gen AI Study Jams _ For the GDSC Leads in India.pdfGen AI Study Jams _ For the GDSC Leads in India.pdf
Gen AI Study Jams _ For the GDSC Leads in India.pdf
 
power quality voltage fluctuation UNIT - I.pptx
power quality voltage fluctuation UNIT - I.pptxpower quality voltage fluctuation UNIT - I.pptx
power quality voltage fluctuation UNIT - I.pptx
 
Architectural Portfolio Sean Lockwood
Architectural Portfolio Sean LockwoodArchitectural Portfolio Sean Lockwood
Architectural Portfolio Sean Lockwood
 
Top 10 Oil and Gas Projects in Saudi Arabia 2024.pdf
Top 10 Oil and Gas Projects in Saudi Arabia 2024.pdfTop 10 Oil and Gas Projects in Saudi Arabia 2024.pdf
Top 10 Oil and Gas Projects in Saudi Arabia 2024.pdf
 
Runway Orientation Based on the Wind Rose Diagram.pptx
Runway Orientation Based on the Wind Rose Diagram.pptxRunway Orientation Based on the Wind Rose Diagram.pptx
Runway Orientation Based on the Wind Rose Diagram.pptx
 
DESIGN A COTTON SEED SEPARATION MACHINE.docx
DESIGN A COTTON SEED SEPARATION MACHINE.docxDESIGN A COTTON SEED SEPARATION MACHINE.docx
DESIGN A COTTON SEED SEPARATION MACHINE.docx
 
J.Yang, ICLR 2024, MLILAB, KAIST AI.pdf
J.Yang,  ICLR 2024, MLILAB, KAIST AI.pdfJ.Yang,  ICLR 2024, MLILAB, KAIST AI.pdf
J.Yang, ICLR 2024, MLILAB, KAIST AI.pdf
 
H.Seo, ICLR 2024, MLILAB, KAIST AI.pdf
H.Seo,  ICLR 2024, MLILAB,  KAIST AI.pdfH.Seo,  ICLR 2024, MLILAB,  KAIST AI.pdf
H.Seo, ICLR 2024, MLILAB, KAIST AI.pdf
 
ML for identifying fraud using open blockchain data.pptx
ML for identifying fraud using open blockchain data.pptxML for identifying fraud using open blockchain data.pptx
ML for identifying fraud using open blockchain data.pptx
 
Water Industry Process Automation and Control Monthly - May 2024.pdf
Water Industry Process Automation and Control Monthly - May 2024.pdfWater Industry Process Automation and Control Monthly - May 2024.pdf
Water Industry Process Automation and Control Monthly - May 2024.pdf
 
ethical hacking-mobile hacking methods.ppt
ethical hacking-mobile hacking methods.pptethical hacking-mobile hacking methods.ppt
ethical hacking-mobile hacking methods.ppt
 
Final project report on grocery store management system..pdf
Final project report on grocery store management system..pdfFinal project report on grocery store management system..pdf
Final project report on grocery store management system..pdf
 

Machine Learning in Software Engineering

  • 1. NEW TRENDS IN LEARNING FOR SOFTWARE ENGINEERING Alaa Hamouda, Department of Computer Engineering, Engineering Faculty, Al-Azhar University, Egypt
  • 2. Agenda • Introduction • Software Engineering Phases • Machine Learning Overview • Applications of ML in SWE with each process: – Project Planning – Requirements – Design – Implementation – Testing – Maintenance • Conclusion
  • 3. Problem Definition • There is a need to meet the challenge of developing and maintaining large and complex software systems. • Machine learning methods play an increasingly important role in many software development and maintenance tasks.
  • 5. Overview of ML • Machine learning methods fall into two broad categories: supervised learning and unsupervised learning. Supervised learning learns a target function from labeled examples. Unsupervised learning attempts to learn patterns and associations from a set of objects that have no attached class labels. • Supervised learners can be divided into eager and lazy classifiers.
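The eager/lazy distinction can be illustrated with a minimal sketch. The two toy classifiers below are assumptions chosen for brevity (a one-feature threshold model and a 1-nearest-neighbour lookup); they are not taken from the slides. The eager learner builds its model at training time, while the lazy learner stores the labeled examples and defers all work to prediction time.

```python
def train_threshold(examples):
    """Eager: fit a one-feature threshold up front; the data is not needed afterwards.

    Assumes the positive class tends to have larger feature values.
    """
    pos = [x for x, label in examples if label]
    neg = [x for x, label in examples if not label]
    return (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2


def predict_threshold(threshold, x):
    """Classify with the model that was fitted in advance."""
    return x >= threshold


def predict_nearest(examples, x):
    """Lazy: keep every example and search for the closest one at query time."""
    _, label = min(examples, key=lambda e: abs(e[0] - x))
    return label


# Hypothetical labeled examples: (feature value, class label).
examples = [(1.0, False), (2.0, False), (8.0, True), (9.0, True)]
threshold = train_threshold(examples)
eager_answer = predict_threshold(threshold, 6.0)
lazy_answer = predict_nearest(examples, 6.0)
```

Both toy classifiers agree on this data; they differ in when the computation happens and in whether the training set must be kept.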
  • 8. The loan data (reproduced): approved or not
  • 9. A decision tree from the loan data: decision nodes and leaf nodes (classes)
  • 11. Project Planning • Reported statistics put the failure rate of software projects at 70%. • Cost overruns of 189% have been reported. • Research indicates that inaccurate estimation is the root cause of most software project failures.
  • 12. Size Estimation • Size → Effort → Cost • Twenty-eight of the sixty collected publications (almost 47%) deal with how to build models that predict or estimate some property of the software development process or its artifacts.
  • 13. Function Point • Internal Logical File (ILF): a file accessed and maintained by the application under development. • External Interface File (EIF): a file accessed by the processing logic but maintained by another application. • External Input (EI): an elementary process that processes data coming from outside the application boundary; it maintains an ILF. • External Output (EO): an elementary process that sends data outside the application boundary; it presents information to the user through processing logic in addition to data retrieval. • External Inquiry (EQ): an elementary process that sends data outside the application boundary; it presents information to the user through retrieval of data from an ILF or EIF, with no data manipulation or processing logic.
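As a rough illustration of how the five component types turn into a size figure, the sketch below sums component counts using the average-complexity weights of the standard IFPUG method. The weights are an assumption about the counting scheme (the slide does not state them), a real count rates each component as low, average, or high, and the component counts are hypothetical.

```python
# Average-complexity weights from the IFPUG counting practice (assumed here;
# the slide does not say which weights the author used).
AVERAGE_WEIGHTS = {"EI": 4, "EO": 5, "EQ": 4, "ILF": 10, "EIF": 7}


def unadjusted_function_points(counts):
    """Sum the component counts, each multiplied by its average weight."""
    return sum(AVERAGE_WEIGHTS[kind] * n for kind, n in counts.items())


# Hypothetical component counts for a small application.
example_counts = {"EI": 6, "EO": 4, "EQ": 3, "ILF": 2, "EIF": 1}
ufp = unadjusted_function_points(example_counts)  # 24 + 20 + 12 + 20 + 7 = 83
```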
  • 14. Size Estimation (Cont’d) Input: • Function points • Project domains • Number of components by type: – Number of menu components – Number of input components – Number of output components ML algorithm: • Neural network Output: • LOC, to be fed to the cost estimation stage
  • 16. Effort Estimation Input: • Lines of code (generated by the size estimation) • Scale factors • Cost drivers Algorithm: • Fuzzy inference engine Output: • Estimated effort (e.g., man-hours)
  • 17. Inputs (scale factors)
      Factor | Explanation
      Precedentedness (PREC) | Reflects the previous experience of the organization.
      Development Flexibility (FLEX) | Reflects the degree of flexibility in the development process.
      Risk Resolution (RESL) | Reflects the extent of risk analysis carried out.
      Team Cohesion (TEAM) | Reflects how well the members of the development team know each other and work together.
      Process Maturity (PMAT) | Reflects the process maturity of the organization.
      LOC | Lines of code
  • 18. Inputs (Cost Drivers)
      Attribute | Type | Description
      RELY | Product | Required system reliability
      CPLX | Product | Complexity of system modules
      DOCU | Product | Extent of documentation required
      DATA | Product | Size of database used
      RUSE | Product | Required percentage of reusable components
      TIME | Computer | Execution time constraint
      PVOL | Computer | Volatility of development platform
      STOR | Computer | Memory constraints
      ACAP | Personnel | Capability of project analysts
      PCON | Personnel | Personnel continuity
      PCAP | Personnel | Programmer capability
      PEXP | Personnel | Programmer experience in project domain
      AEXP | Personnel | Analyst experience in project domain
      LTEX | Personnel | Language and tool experience
      TOOL | Project | Use of software tools
      SCED | Project | Development schedule compression
      SITE | Project | Extent of multisite working and quality of intersite communications
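The scale factors and cost drivers above are the inputs of the COCOMO II model. The slides feed them to a fuzzy inference engine, which is not reproduced here; as a point of reference, the following sketch shows the crisp COCOMO II post-architecture formula, effort = A × KLOC^E × ∏EM with E = B + 0.01 × ΣSF, using the published COCOMO II.2000 constants. The rating values in the example are hypothetical.

```python
from math import prod

A = 2.94  # COCOMO II.2000 multiplicative constant
B = 0.91  # COCOMO II.2000 base exponent


def cocomo2_effort(kloc, scale_factors, effort_multipliers):
    """Return the estimated effort in person-months.

    kloc: size in thousands of lines of code.
    scale_factors: numeric ratings for PREC, FLEX, RESL, TEAM, PMAT.
    effort_multipliers: numeric ratings for the cost drivers (1.0 is nominal).
    """
    exponent = B + 0.01 * sum(scale_factors.values())
    return A * kloc ** exponent * prod(effort_multipliers.values())


# Hypothetical ratings, chosen so that the exponent comes out at 1.0.
scale_factors = {"PREC": 1.0, "FLEX": 2.0, "RESL": 2.0, "TEAM": 2.0, "PMAT": 2.0}
effort_multipliers = {"RELY": 1.0, "CPLX": 1.0, "TOOL": 1.0}
effort = cocomo2_effort(10.0, scale_factors, effort_multipliers)
```

A fuzzy variant would replace the crisp rating lookups with membership functions; that step is left out because the slides give no details of it.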
  • 20. Effort Estimation Directly from UCP In the previous method: • FP (size) → LOC (size) → Effort Another method: • UCP (size) → Effort (directly)
  • 22. Use Case Point Calculation
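The calculation itself is not preserved in this transcript. A minimal sketch of the standard use case point method (Karner's formulation, assumed here to be what the slide showed) is given below: UCP = (UUCW + UAW) × TCF × ECF, with TCF = 0.6 + 0.01 × TFactor and ECF = 1.4 − 0.03 × EFactor. All input values are hypothetical, and the 20 person-hours per UCP is Karner's suggested default rather than a figure from the slides.

```python
USE_CASE_WEIGHTS = {"simple": 5, "average": 10, "complex": 15}
ACTOR_WEIGHTS = {"simple": 1, "average": 2, "complex": 3}
HOURS_PER_UCP = 20  # Karner's suggested productivity factor (an assumption here)


def use_case_points(use_cases, actors, tfactor, efactor):
    """Compute use case points from counts per complexity class.

    tfactor / efactor: weighted sums of the technical and environmental ratings.
    """
    uucw = sum(USE_CASE_WEIGHTS[kind] * n for kind, n in use_cases.items())
    uaw = sum(ACTOR_WEIGHTS[kind] * n for kind, n in actors.items())
    tcf = 0.6 + 0.01 * tfactor
    ecf = 1.4 - 0.03 * efactor
    return (uucw + uaw) * tcf * ecf


# Hypothetical project: 6 use cases, 4 actors.
ucp = use_case_points(
    use_cases={"simple": 2, "average": 3, "complex": 1},
    actors={"simple": 1, "average": 1, "complex": 2},
    tfactor=40,
    efactor=10,
)
effort_hours = ucp * HOURS_PER_UCP
```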
  • 24. Project Complexity • Level 1: The project team is familiar with this type of project and has developed similar projects in the past. The number and type of interfaces are simple. The project will be installed under normal conditions, where high security or safety factors are not required. Moreover, in Level 1 projects around 20% of the design or implementation is reused from older, similar projects. • Level 2: Similar to Level 1, except that only about 10% of the project is reused.
  • 25. Project Complexity (Cont’d) • Level 3: The technology, interfaces, and installation conditions are normal, but no part of the project has been previously designed or implemented. • Level 4: The project must be installed on a complicated topology or architecture, such as a distributed system. Moreover, the number of variables and interfaces is large. • Level 5: Similar to Level 4 but with additional constraints, such as a special type of security or high safety factors.
  • 27. Effort Estimation (Cont’d) The results show that the proposed ANN model outperforms: • Regression models by 8% • UCP models by 50%
  • 32. Requirements Analysis
      Lexicons | Phase I | Phase II
      User | Noun | Actor
      fills | Verb | Action
      the | Article | -------
      form | Noun | Object
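The two phases in the table can be sketched as a lookup followed by a positional rule. The lexicon and the rule used below (a noun before the verb is the actor, a noun after it is the object) are assumptions made to reproduce the slide's example sentence; the slides do not describe the actual technique.

```python
# Phase I: a tiny hand-written lexicon (hypothetical; a real system would use
# a part-of-speech tagger).
LEXICON = {"user": "Noun", "fills": "Verb", "the": "Article", "form": "Noun"}


def tag_roles(sentence):
    """Return (word, lexical category, role) triples for a simple sentence."""
    triples = []
    seen_verb = False
    for word in sentence.split():
        category = LEXICON.get(word.lower(), "Unknown")
        if category == "Verb":
            role = "Action"
            seen_verb = True
        elif category == "Noun":
            # Phase II: nouns before the verb act, nouns after it are acted on.
            role = "Object" if seen_verb else "Actor"
        else:
            role = None
        triples.append((word, category, role))
    return triples


analysis = tag_roles("User fills the form")
```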
  • 33. Requirements • Reverse engineering applies when legacy systems that are critical to the operation of an organization must still be maintained. • Most legacy systems were developed before software engineering techniques were widely used, so they may be poorly structured and their documentation may be out of date or nonexistent. • To support maintenance of a legacy system, the first task is to recover its design or specification from its source or executable code.
  • 35. Design 1. Finding fault-prone components for reuse 2. UI design
  • 36. Components Re-use • Software quality classification models can be used to indicate which program modules are fault-prone (FP) and which are not fault-prone (NFP). • These models can be used to select the best candidate modules for reuse.
  • 37. Components Re-use
      Attribute | Description
      U_1 | Number of unique operators
      N_1 | Total number of operators
      U_2 | Number of unique operands
      N_2 | Total number of operands
      V(G) | McCabe’s cyclomatic complexity
      N_L | Number of logical operators
      LOC | Lines of code
      ELOC | Executable lines of code
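The first four attributes are the Halstead base counts. As a minimal sketch of how they are commonly combined into derived measures (the slides do not say whether the classifier used derived measures or the raw counts), the function below computes Halstead's vocabulary, length, volume, difficulty, and effort. The example counts are hypothetical.

```python
from math import log2


def halstead(u1, n1, u2, n2):
    """Derive Halstead measures from operator and operand counts.

    u1, n1: unique and total operators; u2, n2: unique and total operands.
    """
    vocabulary = u1 + u2
    length = n1 + n2
    volume = length * log2(vocabulary)
    difficulty = (u1 / 2) * (n2 / u2)
    return {
        "vocabulary": vocabulary,
        "length": length,
        "volume": volume,
        "difficulty": difficulty,
        "effort": difficulty * volume,
    }


# Hypothetical module: 4 unique / 10 total operators, 4 unique / 6 total operands.
measures = halstead(u1=4, n1=10, u2=4, n2=6)
```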
  • 38. User Interface Design • Learnability is an important aspect of usability. • Users lose up to 40% of their time to “frustrating experiences” with computers; among the most common causes are missing, hard-to-find, and unusable features of the software.
  • 39. User Interface Design • Nielsen characterizes a highly learnable system as “allowing users to reach a reasonable level of usage proficiency within a short time”. • A web usage map is mined using Label Sequential Rules.
  • 42. Implementation • Implementation is a core process in the software engineering life cycle. • One of the challenges in this phase is modularization (or remodularization). • Genetic algorithms have been used successfully to address this problem. • The objective is to improve the modularization quality (MQ). All versions of MQ combine cohesion and coupling into a single weighted fitness function.
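To make the fitness function concrete, here is a minimal sketch of one formulation of MQ, assumed here for illustration: each cluster contributes a cluster factor 2μ / (2μ + ε), where μ counts dependencies inside the cluster (cohesion) and ε counts dependencies crossing its boundary (coupling), and MQ is the sum over clusters. The slides do not state which version was used, and the dependency graph below is hypothetical.

```python
def modularization_quality(edges, cluster_of):
    """Score a clustering of a dependency graph; higher is better.

    edges: iterable of (source, target) module dependencies.
    cluster_of: mapping from module name to cluster identifier.
    """
    intra = {}
    inter = {}
    for src, dst in edges:
        a, b = cluster_of[src], cluster_of[dst]
        if a == b:
            intra[a] = intra.get(a, 0) + 1
        else:
            # A crossing edge counts against both clusters it touches.
            inter[a] = inter.get(a, 0) + 1
            inter[b] = inter.get(b, 0) + 1
    mq = 0.0
    for cluster in set(cluster_of.values()):
        mu = intra.get(cluster, 0)
        eps = inter.get(cluster, 0)
        if mu:  # clusters without internal edges contribute nothing
            mq += 2 * mu / (2 * mu + eps)
    return mq


# Hypothetical dependency graph with two candidate clusterings.
edges = [("a", "b"), ("b", "c"), ("d", "e"), ("c", "d")]
two_clusters = {"a": 1, "b": 1, "c": 1, "d": 2, "e": 2}
one_cluster = {name: 1 for name in "abcde"}
mq_two = modularization_quality(edges, two_clusters)
mq_one = modularization_quality(edges, one_cluster)
```

A genetic algorithm would mutate and recombine `cluster_of` assignments and keep the candidates with the highest score.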
  • 43. Implementation (Cont’d) • Clustering has also been applied to package coupling, to reduce overall package size, and to explore the relationship between design-level and code-level software structure. • Additional objectives might include closeness to the original module structure, business goals, technical constraints, testability, and other metrics that may be important in finding a good module structure.
  • 45. Implementation (Cont’d) • Refactoring means rewriting existing source code to improve its readability, reusability, or structure without changing its meaning or behavior. • Project managers want to know which locations are likely to need refactoring: refactoring improves the understandability of the code but costs development time.
  • 46. Implementation (Cont’d) • Researchers screen evolution data from the versioning systems of open source projects. • ArgoUML and the Spring framework are examples; both are developed in Java and consist of about 5,000 and 10,000 classes, respectively. • In Java each class is usually placed in a separate file, so the researchers treat files as equivalent to classes and focus their analysis on files.
  • 47. Implementation (Cont’d) The features used can be divided into different categories: • Size: This category contains size measures, such as lines of code, from an evolution perspective: linesAdded, linesModified, or linesDeleted relative to the total LOC (lines of code) of a file.
  • 48. Implementation (Cont’d) • Team: The number of authors of a file influences the way software is developed. The more authors work on the changes, the higher the expected probability of rework and mistakes. • Complexity of existing solution: According to the laws of software evolution, software continuously becomes more complex. Changes become harder to add because the software is harder to understand and the contracts between existing parts have to be retained. The researchers therefore investigate changeCount in relation to the number of changes during the entire history of each file.
  • 49. Implementation (Cont’d) • New requirements: In software development projects, new classes are usually added to object-oriented systems when new requirements have to be satisfied. The researchers use the information whether a file was newly introduced during the prediction period. • Relational aspects: Among the most important features in this category are couplings, such as the number of changes (revisions) in which the file was committed together with other files.
  • 50. Implementation (Cont’d) • With the described features, the number of refactorings is predicted.
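The feature categories above can be sketched as a small extraction step over a file's change records. Only the names linesAdded, linesModified, linesDeleted, and changeCount come from the slides; the record layout, the remaining feature names, and the example values are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Change:
    """One revision of a file (hypothetical record layout)."""
    author: str
    lines_added: int
    lines_modified: int
    lines_deleted: int
    co_changed_files: int  # other files committed in the same revision


def file_features(period_changes, total_loc, history_change_count, newly_introduced):
    """Build one feature vector for a file from its changes in the observed period."""
    return {
        # Size: line activity relative to the file's total LOC.
        "linesAdded": sum(c.lines_added for c in period_changes) / total_loc,
        "linesModified": sum(c.lines_modified for c in period_changes) / total_loc,
        "linesDeleted": sum(c.lines_deleted for c in period_changes) / total_loc,
        # Team: number of distinct authors.
        "authorCount": len({c.author for c in period_changes}),
        # Complexity: changes in the period relative to the whole history.
        "changeCount": len(period_changes) / history_change_count,
        # New requirements: was the file introduced during the period?
        "newFile": int(newly_introduced),
        # Relational aspects: how often the file changed together with others.
        "coChangedFiles": sum(c.co_changed_files for c in period_changes),
    }


changes = [
    Change("ann", lines_added=10, lines_modified=5, lines_deleted=0, co_changed_files=3),
    Change("bob", lines_added=20, lines_modified=5, lines_deleted=10, co_changed_files=1),
]
features = file_features(changes, total_loc=200, history_change_count=8, newly_introduced=False)
```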
  • 51. Implementation (Cont’d) • A decision tree and a neural network are used as classifiers. • The F-measure was about 65%. • Several features, such as the line activity rate and the number of lines altered per commit, provide much information for the assessment of refactorings. • The structure of the system is also crucial for refactorings: the number of co-changed files and the number of files introduced during maintenance are relevant features.
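For readers unfamiliar with the metric, the F-measure is the harmonic mean of precision and recall. The sketch below computes it from confusion counts; the counts in the example are hypothetical and are not the study's data.

```python
def f_measure(true_pos, false_pos, false_neg):
    """Return the harmonic mean of precision and recall (the F1 score)."""
    predicted = true_pos + false_pos
    actual = true_pos + false_neg
    precision = true_pos / predicted if predicted else 0.0
    recall = true_pos / actual if actual else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts: 13 refactorings found, 7 false alarms, 7 missed.
score = f_measure(true_pos=13, false_pos=7, false_neg=7)
```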
  • 53. Testing • Software quality models help ensure the reliability of the delivered products. • Early detection of fault-prone software components enables verification experts to concentrate their time and resources on the problem areas of the software system under development. • Accurate prediction of fault-prone modules enables verification and validation activities to focus on the critical software components.
  • 55. Testing (Cont’d) • Decision trees correctly predicted 79.3% of the fault-prone modules with high development effort (detection rate), while the trees generated from the best parameter combinations correctly identified 88.4% of those modules on average.
  • 57. Maintenance • Software maintenance is widely recognized to be the most expensive and time-consuming aspect of the software process. • A relevance relation maps a tuple of system elements to a value indicating how related they are. • Software change repositories reflect the history of the system, which includes actions that create new relationships and strengthen existing relationships in the software.
  • 59. Maintenance (Cont’d) • Software entities include documents, source files, routines, modules, variables, and even the entire software system. • A relevance relation is a predictor that maps tuples of two or more software entities to a value r quantifying how relevant (that is, connected or related) the entities are to each other. • r shows the strength of relevance among the entities.
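One simple way to obtain such a value r from a change repository is to count how often two files are committed together. The sketch below uses the co-change frequency of a file pair as r; this choice is an assumption for illustration (the slides do not specify how r is learned), and the commit history is hypothetical.

```python
from collections import Counter
from itertools import combinations


def co_change_relevance(commits):
    """Map each pair of files to the fraction of commits that touch both."""
    pair_counts = Counter()
    for files in commits:
        for pair in combinations(sorted(set(files)), 2):
            pair_counts[pair] += 1
    return {pair: count / len(commits) for pair, count in pair_counts.items()}


# Hypothetical history: each inner list is the set of files in one commit.
history = [
    ["a.c", "b.c"],
    ["a.c", "b.c", "c.c"],
    ["c.c"],
    ["a.c"],
]
relevance = co_change_relevance(history)
```

A learned relevance relation would typically replace the raw frequency with a classifier trained on such pairs plus further attributes of the files.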
  • 61. Maintenance Effort Prediction • If predictions are based on formal software development effort prediction models, such as the estimation part of Function Point Analysis, essential differences between software development and software maintenance are neglected. • The focus of software development is the creation of software, whereas the focus of software maintenance is the change of software. • The development of a software application is typically a one-of-a-kind project, whereas the maintenance activities on an application usually comprise a large number of tasks carried out over a long period of time in a relatively stable environment.
  • 62. Maintenance Effort Prediction • Researchers collected data on: – 109 randomly selected maintenance tasks – 70 applications – The size of the applications varied from a few thousand lines of code (LOC) to about 500,000 LOC – The age of the applications varied from less than a year to more than 20 years – The functions of the applications included payroll, order entry, billing and invoicing, inventory control, service management, and personnel administration.
  • 63. Maintenance Effort Prediction The following data was collected for each maintenance task: • Type of maintenance task, i.e., corrective or perfective. • Priority of the task, i.e., high, medium, or low. • Maintainer’s knowledge and confidence about how to solve the task immediately after reading or hearing the task specification. • Years of experience as a maintainer, and on the maintained application. • Education level of the maintainer. • Work-hours (effort) spent on the task. • Task size and programming language. • Age and size of the changed application.
  • 64. Maintenance Effort Prediction Most important features: • Cause: Corrective maintenance = 0, otherwise = 1 • Change: More than 50% of the effort is believed to be spent on updating code, as opposed to inserting and deleting code = 0, otherwise = 1 • Mode: More than 50% of the effort is believed to be spent on developing new modules (new module mode) = 0, otherwise (embedded mode) = 1 • Confidence: The maintainer believes he knows how to solve the task when the task specification is read or heard for the first time = 0 (high confidence), otherwise = 1 (medium or low confidence).
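The binary coding described above can be written down directly. In this sketch the function and parameter names are hypothetical; only the four feature names and their 0/1 meanings follow the slide.

```python
def encode_task(is_corrective, mostly_updating_code, mostly_new_modules, high_confidence):
    """Encode one maintenance task as the four binary features from the slide.

    Each flag maps to 0 when the stated condition holds and to 1 otherwise.
    """
    return {
        "Cause": 0 if is_corrective else 1,
        "Change": 0 if mostly_updating_code else 1,
        "Mode": 0 if mostly_new_modules else 1,
        "Confidence": 0 if high_confidence else 1,
    }


# Hypothetical task: a perfective change that mostly updates existing code
# inside existing modules, by a maintainer who is confident of the solution.
task = encode_task(
    is_corrective=False,
    mostly_updating_code=True,
    mostly_new_modules=False,
    high_confidence=True,
)
```

The resulting vector is the kind of input a regression model or neural network for effort prediction would consume.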
  • 65. Maintenance Effort Prediction Features with less effect: • Type of language • Maintainer experience • Task priority • Application age • Application size
  • 66. Maintenance Effort Prediction • A neural network and regression were used as approaches for effort prediction. • The prediction accuracy was considered acceptable (error of 60%). • A recommended use of an effort prediction model is, therefore, to support expert predictions. • Another important use of a formal prediction model may be to support the collection and analysis of maintenance data in order to improve the maintenance process and product.
  • 67. Open Problems • Most of the presented work is immature, and many related issues are still open. • Machine learning can help in the requirements engineering phase by developing knowledge-based systems and ontologies to manage requirements and model problem domains.
  • 68. Open Problems (Cont’d) • One of the most difficult problems is transforming requirements into architectures. Much research is needed in this area to address the ever-increasing complexity of functional and non-functional requirements.
  • 69. Open Problems (Cont’d) • One area that has received some attention is the use of automated algorithms with machine learning to make repair assignments. • In any case, more studies are needed on the appropriate criteria for selecting assignment policies, reward mechanisms, and management goals.
  • 70. Conclusion • The existing work shows that the field of software engineering is fertile ground for the application of machine learning methods. • There is clearly increased interest in the niche area of machine learning and software engineering.
  • 71. Conclusion (Cont’d) • The strength of machine learning methods lies in the fact that they have sound mathematical and logical justifications. • The power of machine learning methods does not come from a particular induction method, but from proper formulation of the problems and from crafting the representation to make learning tractable.
  • 72. Conclusion (Cont’d) • Machine learning can play a useful role in the different phases of software engineering: project planning, requirements analysis, design, implementation, testing, and even maintenance. • Interest in applying machine learning to software engineering tasks is expected to increase significantly, especially with the increasing interest in empirical software engineering.
  • 73. Thank you very much