SlideShare a Scribd company logo
Mining Features from the Object-Oriented Source
Code of a Collection of Software Variants Using
Formal Concept Analysis and Latent Semantic
Indexing
SEKE 2013, Boston, 29 june 1
Outline
• The context and the issue
• Our goal and the main hypotheses
• Our approach : The main ideas
• The process : step by step
• Experimentation and results
• Perspectives
SEKE 2013, Boston, 29 june 2
The context (1/4)
Variant AVariant A
Variant CVariant C
Variant BVariant B
Variant EVariant E
variantvariant
Variant 1Variant 1
Variant2Variant2
VariantGVariantG
VariantF6VariantF6
Variant3Variant3
Variant
AB
Variant
AB
VariantC
F3
VariantC
F3SEKE 2013, Boston, 29 june 3
The context (2/4)
• Software product Line
– SPL supports efficient development of related software
products
– Manages common and optional features
• A Feature is a system property relevant to some stakeholder
used to capture commonalities or variations among systems in a
family
– Promotes systematic software reuse from SPL’s core
assets (such as features, code, documentation and etc.)
SEKE 2013, Boston, 29 june 4
The context (3/4)
Image from : http://poltman.com/pm-en/img/TechnicalInformation/SoftwareModernization/ProductLines/ProductLines-01.jpg
 Software Product Line
 Domain Engineering : development for reuse
 Application Engineering : development by reuse
SEKE 2013, Boston, 29 june 5
The context (4/4)
• Software Product Line
– Feature model (FM)
• Is a tree-like graph of features and relationships among
them
• Used to represent commonality and variability of SPL
members at different levels of abstraction
SEKE 2013, Boston, 29 june 6
Issue
• Software variants
– Difficulties for :
• Reuse
• Maintenance
• Comprehension
• Impact analysis
• Software Product Line
– Design from scratch is a hard task (domain engineering)
AA
CC
BB
33
ABAB
SEKE 2013, Boston, 29 june 7
Our Goal (1/2)
• Reengineering existing software variants into a software
product line
– Benefits
• Software variants will be managed as a product line
• Software product line will be engineered started from existing products
(not from scratch)
– Strategy
• Feature model mining (reverse engineering step)
– Mining features
– Mining feature model structure (group of features)
– Mining feature constraints
– Mining feature relationships
• Source code Framework generation (reengineering step)
AA
CC
BB
33
ABAB
33 BB
CC
AA
SEKE 2013, Boston, 29 june 8
Our Goal (2/2)
AA
CC
BB
33
ABAB
33 BB
CC
AA
From existing software variants To new product generation
SEKE 2013, Boston, 29 june 9
To domain features From domain features
Our main hypotheses
1. Mining feature From object oriented source code
1. Focus on functional features
– Functional features express the behavior or the way users may
interact with a product
1. Focus on feature implemented at the programming level
– The elements of the source code reflect these features
– Feature are implemented as package, class, attribute, method, local variable, attribute access,
method invocation, etc.
1. A Feature has the same implementation in all product
variants where it is present (we not consider evolution)
SEKE 2013, Boston, 29 june 10
The main ideas(1/4)
1. The initial search space
is composed of all OO
elements of software
variants
1. Characterizing OO
elements implementing
features to cluster
theme together
–Are similar elements :
lexical similarity
–Are dependent elements :
structural dependencies
SEKE 2013, Boston, 29 june 11
Identifying clusters
composed of the most
similar OO elements
based on LSI
The main ideas (2/4)
3. Optional
features are
implemented as
variation in
source code
Mining variations
(= search space)
Package
Variation
Package Set
Variation
Package Content
VariationClass
Variation
Class Content
Variation
Class Signature
Variation
Attributes Set
Variation
Methods Set
Variation
Method
Variation
Signature Body
Attribute
Variation
( Access Level, Data Type)
1
:
2
:
3
:
4
:
(Name)
(Name)
Relation
ship
Public ,
Private, ...
Access
Level
Access
Level
Returned Data
Type
Parameters List order &
data type
Local
variables
Method
Invocation
Attribute
Access
Inheritance (Superclass),
Interface
SEKE 2013, Boston, 29 june 12
The main ideas(3/4)
4. Reducing the search space by
Isolating groups of variations
corresponding to some
related features
From one large search space to
many sub search spaces
SEKE 2013, Boston, 29 june 13
Identifying all groups of OO elements
representing
differences or intersections between variants
based on FCA
The main ideas (4/4)
Common Block (CB)
Common Atomic Block (CAB)
Block of Variation (BV)
Atomic Block of Variation (AB)
Software_1
Software_2Software_3
1 2
3
1
SEKE 2013, Boston, 29 june 14
Used techniques : FCA and LSI (1/3)
• Formal Concept Analysis (FCA)
– Is a technique for data analysis and knowledge
representation based on lattice theory
– It identifies meaningful groups of objects that share
common attributes
– It provides a theoretical model to analyze hierarchies of
these groups
– In order to apply FCA based on the definition of a formal
context or incidence table of objects and their attributes
SEKE 2013, Boston, 29 june 15
Used techniques : FCA and LSI (2/3)
SEKE 2013, Boston, 29 june 16
jungle water forest fish plant mamal
lion x x
carp x x
dolphin x x
bear x x
zebra x x
pine x x
Extracted from : http://code.google.com/p/erca/wiki/FcaIntroduction
Used techniques : FCA and LSI (3/3)
• Latent Semantic Indexing (LSI)
– Compute textual similarity among different documents
• Based on the occurrences of terms in documents
– If two documents share a large number of terms, those documents are
considered to be similar
– Three steps
• A corpus of documents is built after pre-processing such as stop word
removal and stemming performing
• A term-by-document matrix is built, where each column represents a
document and each row is a term. The values in the matrix indicate the
frequency of the term occurring in the document
• The similarity among documents is calculated using cosine similarity
SEKE 2013, Boston, 29 june 17
The mining process : step by step
P1
P2
Software Product Variants
Mandatory
Feature
Pn
Static Analysis
Lexical Similarity
Computation
Clustering
Implementation Space
Feature Space
)1
Commonalities and
Variabilities Computation
Clustering
)2
Optional
Feature
Clustering
OBEs
Similarity
Matrix
Lexical Similarity
Computation
Similarity
Matrix
Common Block
Common OBEs
(Block of Variation)n
Variable OBEs
Common
Atomic Block
Atomic Block
of Variation
Features
FCA
LSI
LSI
FCA
FCA
SEKE 2013, Boston, 29 june
18
Identifying the Common Block and
Blocks of Variation (1/3)
• Two steps
1. A formal context, where objects are product variants and
attributes are OBEs is defined
1. Calculate corresponding AOC-poset
• The intent of each concept represents OBEs common to two or more
products
– The intent of the most general (i.e., top) concept gathers OBEs that are
common to all products. They constitute the CB
– The intents of all remaining concepts are BVs
» They gather sets of OBEs common to a subset of products and
correspond to the implementation of one or more features
• The extent of each of these concepts is the set of products having
these OBEs in commonSEKE 2013, Boston, 29 june 19
Identifying the Common Block and
Blocks of Variation(2/3)
• Example
SEKE 2013, Boston, 29 june 20
The formel context
Identifying the Common Block and
Blocks of Variation(3/3)
• Example
– The AOC-poset
Common Block
Block of
Variation
Block of
Variation
SEKE 2013, Boston, 29 june 21
Identifying Atomic Blocks (1/5)
• Three steps
– Exploring the BV’s AOC-poset to Identify Atomic Blocks of Variation
– Measuring OBEs’ Similarity Based on LSI
– Identifying Atomic Blocks Using FCA
• Exploring the BV’s AOC-poset to Identify Atomic Blocks of Variation
– Exploring the AOC-poset from the smallest (bottom) to the highest
(top) block
– If a group of OBEs is identified as an ABV, this group is considered as
such when exploring the following BV
– For Common Atomic Blocks (CAB), there is no such need to explore
the AOC-poset as there is a unique CB.
SEKE 2013, Boston, 29 june 22
Identifying Atomic Blocks (2/5)
• Measuring OBEs’ Similarity Based on LSI
– Building the LSI corpus
– Building the term- document matrix and the term-
query matrix for each BV and for the CB
– Building the cosine similarity matrix
SEKE 2013, Boston, 29 june 23
Identifying Atomic Blocks (3/5)
• Example of cosine similarity matrix
SEKE 2013, Boston, 29 june 24
Identifying Atomic Blocks (4/5)
• Identifying Atomic Blocks Using FCA
– Transforming the (numerical) similarity matrices of previous step into
(binary) formal contexts
• Only pairs of OBEs having a calculated similarity greater than or equal to 0.70 are
considered similar
– Example
SEKE 2013, Boston, 29 june 25
Identifying Atomic Blocks (5/5)
• Identifying Atomic Blocks Using FCA
SEKE 2013, Boston, 29 june 26
Experimentation and results (1/5)
• Case studies : Two Java open-source software: Mobile Media and
ArgoUML
SEKE 2013, Boston, 29 june 27
Experimentation and results (2/5)
SEKE 2013, Boston, 29 june 28
Experimentation and results (3/5)
• The effectiveness of IR methods is measured by their RECALL,
PRECISION and F-MEASURE
 Recall is the percentage of correctly
retrieved links (OBEs) to the total
number of relevant links (OBEs) .
 Precision is the percentage of
correctly retrieved links (OBEs) to the
total number of retrieved links (OBEs) .
 F-measure is a balanced measure that
takes into account both precision and
recall.
SEKE 2013, Boston, 29 june 29
Experimentation and results (4/5)
• Precision
– For optional features appears to be high
• This means that all mined OBEs grouped as features are relevant
• Mainly due to search space reduction. In most cases, each BV
corresponds to one and only one feature
– For common features, precision is also quite high
• Thanks to our clustering technique that identifies ABVs based on
FCA and LSI
• Is smaller than the one obtained for optional features
–This deterioration can be explained by the fact that we do not perform
search space reduction for the CB
SEKE 2013, Boston, 29 june 30
Experimentation and results (5/5)
• Recall
– Its average value is 66% for Mobile Media and 67%
for ArgoUML
• This means most OBEs that compose features are mined
• Non-mined OBEs used different vocabularies compared to
the mined ones
– This is a known limitation of LSI which is based on lexical similarity
SEKE 2013, Boston, 29 june 31
Perspectives
• Enhance the quality of the mining
– Combine both textual and structural similarity measures
– Identify junctions between features
– More reducing of the search space
– Etc.
• Feature model mining
– Mining features
– Mining feature model structure (group of features)
– Mining features constraints
– Mining feature relationships
SEKE 2013, Boston, 29 june 32
Mining Features from the Object-Oriented Source
Code of a Collection of Software Variants Using
Formal Concept Analysis and Latent Semantic
Indexing
SEKE 2013, Boston, 29 june 33
Object-to-feature mapping model
SEKE 2013, Boston, 29 june 34

More Related Content

Viewers also liked

AutoCardSorter - Designing the Information Architecture of a web site using L...
AutoCardSorter - Designing the Information Architecture of a web site using L...AutoCardSorter - Designing the Information Architecture of a web site using L...
AutoCardSorter - Designing the Information Architecture of a web site using L...Christos Katsanos
 
Recommending Tags with a Model of Human Categorization
Recommending Tags with a Model of Human CategorizationRecommending Tags with a Model of Human Categorization
Recommending Tags with a Model of Human CategorizationChristoph Trattner
 
Latent Semantic Indexing and Search Engines Optimimization (SEO)
Latent Semantic Indexing and Search Engines Optimimization (SEO)Latent Semantic Indexing and Search Engines Optimimization (SEO)
Latent Semantic Indexing and Search Engines Optimimization (SEO)muzzy4friends
 
SNAPP - Learning Analytics and Knowledge Conference 2011
SNAPP - Learning Analytics and Knowledge Conference 2011SNAPP - Learning Analytics and Knowledge Conference 2011
SNAPP - Learning Analytics and Knowledge Conference 2011aneeshabakharia
 
A Semantics-based Approach to Machine Perception
A Semantics-based Approach to Machine PerceptionA Semantics-based Approach to Machine Perception
A Semantics-based Approach to Machine PerceptionCory Andrew Henson
 
Latent Semantic Transliteration using Dirichlet Mixture
Latent Semantic Transliteration using Dirichlet MixtureLatent Semantic Transliteration using Dirichlet Mixture
Latent Semantic Transliteration using Dirichlet MixtureRakuten Group, Inc.
 
BigML Summer 2016 Release
BigML Summer 2016 ReleaseBigML Summer 2016 Release
BigML Summer 2016 ReleaseBigML, Inc
 
7 - Architetture Software - Software product line
7 - Architetture Software - Software product line7 - Architetture Software - Software product line
7 - Architetture Software - Software product lineMajong DevJfu
 
An approach to source code plagiarism
An approach to source code plagiarismAn approach to source code plagiarism
An approach to source code plagiarismvarsha_bhat
 
Bayesian Nonparametric Topic Modeling Hierarchical Dirichlet Processes
Bayesian Nonparametric Topic Modeling Hierarchical Dirichlet ProcessesBayesian Nonparametric Topic Modeling Hierarchical Dirichlet Processes
Bayesian Nonparametric Topic Modeling Hierarchical Dirichlet ProcessesJinYeong Bak
 
Blei ngjordan2003
Blei ngjordan2003Blei ngjordan2003
Blei ngjordan2003Ajay Ohri
 
How to use Latent Semantic Analysis to Glean Real Insight - Franco Amalfi
How to use Latent Semantic Analysis to Glean Real Insight - Franco AmalfiHow to use Latent Semantic Analysis to Glean Real Insight - Franco Amalfi
How to use Latent Semantic Analysis to Glean Real Insight - Franco AmalfiSocial Media Camp
 
Latent Topic-semantic Indexing based Automatic Text Summarization
Latent Topic-semantic Indexing based Automatic Text SummarizationLatent Topic-semantic Indexing based Automatic Text Summarization
Latent Topic-semantic Indexing based Automatic Text SummarizationElaheh Barati
 
EM algorithm and its application in probabilistic latent semantic analysis
EM algorithm and its application in probabilistic latent semantic analysisEM algorithm and its application in probabilistic latent semantic analysis
EM algorithm and its application in probabilistic latent semantic analysiszukun
 
Topic Models, LDA and all that
Topic Models, LDA and all thatTopic Models, LDA and all that
Topic Models, LDA and all thatZhibo Xiao
 

Viewers also liked (19)

AutoCardSorter - Designing the Information Architecture of a web site using L...
AutoCardSorter - Designing the Information Architecture of a web site using L...AutoCardSorter - Designing the Information Architecture of a web site using L...
AutoCardSorter - Designing the Information Architecture of a web site using L...
 
Recommending Tags with a Model of Human Categorization
Recommending Tags with a Model of Human CategorizationRecommending Tags with a Model of Human Categorization
Recommending Tags with a Model of Human Categorization
 
Latent Semantic Indexing and Search Engines Optimimization (SEO)
Latent Semantic Indexing and Search Engines Optimimization (SEO)Latent Semantic Indexing and Search Engines Optimimization (SEO)
Latent Semantic Indexing and Search Engines Optimimization (SEO)
 
Practical Machine Learning
Practical Machine Learning Practical Machine Learning
Practical Machine Learning
 
SNAPP - Learning Analytics and Knowledge Conference 2011
SNAPP - Learning Analytics and Knowledge Conference 2011SNAPP - Learning Analytics and Knowledge Conference 2011
SNAPP - Learning Analytics and Knowledge Conference 2011
 
A Semantics-based Approach to Machine Perception
A Semantics-based Approach to Machine PerceptionA Semantics-based Approach to Machine Perception
A Semantics-based Approach to Machine Perception
 
Latent Semantic Transliteration using Dirichlet Mixture
Latent Semantic Transliteration using Dirichlet MixtureLatent Semantic Transliteration using Dirichlet Mixture
Latent Semantic Transliteration using Dirichlet Mixture
 
BigML Summer 2016 Release
BigML Summer 2016 ReleaseBigML Summer 2016 Release
BigML Summer 2016 Release
 
7 - Architetture Software - Software product line
7 - Architetture Software - Software product line7 - Architetture Software - Software product line
7 - Architetture Software - Software product line
 
An approach to source code plagiarism
An approach to source code plagiarismAn approach to source code plagiarism
An approach to source code plagiarism
 
Bayesian Nonparametric Topic Modeling Hierarchical Dirichlet Processes
Bayesian Nonparametric Topic Modeling Hierarchical Dirichlet ProcessesBayesian Nonparametric Topic Modeling Hierarchical Dirichlet Processes
Bayesian Nonparametric Topic Modeling Hierarchical Dirichlet Processes
 
Blei ngjordan2003
Blei ngjordan2003Blei ngjordan2003
Blei ngjordan2003
 
How to use Latent Semantic Analysis to Glean Real Insight - Franco Amalfi
How to use Latent Semantic Analysis to Glean Real Insight - Franco AmalfiHow to use Latent Semantic Analysis to Glean Real Insight - Franco Amalfi
How to use Latent Semantic Analysis to Glean Real Insight - Franco Amalfi
 
Latent Topic-semantic Indexing based Automatic Text Summarization
Latent Topic-semantic Indexing based Automatic Text SummarizationLatent Topic-semantic Indexing based Automatic Text Summarization
Latent Topic-semantic Indexing based Automatic Text Summarization
 
Naive Bayes | Statistics
Naive Bayes | StatisticsNaive Bayes | Statistics
Naive Bayes | Statistics
 
EM algorithm and its application in probabilistic latent semantic analysis
EM algorithm and its application in probabilistic latent semantic analysisEM algorithm and its application in probabilistic latent semantic analysis
EM algorithm and its application in probabilistic latent semantic analysis
 
Topic Models, LDA and all that
Topic Models, LDA and all thatTopic Models, LDA and all that
Topic Models, LDA and all that
 
C4.5
C4.5C4.5
C4.5
 
NLP and LSA getting started
NLP and LSA getting startedNLP and LSA getting started
NLP and LSA getting started
 

Similar to Mining Features from the Object-Oriented Source Code of a Collection of Software Variants Using Formal Concept Analysis and Latent Semantic Indexing

Object Oriented Design
Object Oriented DesignObject Oriented Design
Object Oriented DesignAMITJain879
 
Mining Features from the Object-Oriented Source Code of Software Variants by ...
Mining Features from the Object-Oriented Source Code of Software Variants by ...Mining Features from the Object-Oriented Source Code of Software Variants by ...
Mining Features from the Object-Oriented Source Code of Software Variants by ...Ra'Fat Al-Msie'deen
 
UNIT V TESTING.pptx
UNIT V TESTING.pptxUNIT V TESTING.pptx
UNIT V TESTING.pptxanguraju1
 
Concept lattices: a representation space to structure software variability
Concept lattices: a representation space to structure software variabilityConcept lattices: a representation space to structure software variability
Concept lattices: a representation space to structure software variabilityRa'Fat Al-Msie'deen
 
CS8592 Object Oriented Analysis & Design - UNIT V
CS8592 Object Oriented Analysis & Design - UNIT V CS8592 Object Oriented Analysis & Design - UNIT V
CS8592 Object Oriented Analysis & Design - UNIT V pkaviya
 
Some SBML-related resources at SBML.org
Some SBML-related resources at SBML.orgSome SBML-related resources at SBML.org
Some SBML-related resources at SBML.orgMike Hucka
 
Extending the Product Selection Language of the Abstract Behavioral Specifica...
Extending the Product Selection Language of the Abstract Behavioral Specifica...Extending the Product Selection Language of the Abstract Behavioral Specifica...
Extending the Product Selection Language of the Abstract Behavioral Specifica...Triando Triando
 
Object oriented analysis
Object oriented analysisObject oriented analysis
Object oriented analysisMahesh Bhalerao
 
04 distance learning standards-scorm specification
04 distance learning standards-scorm specification04 distance learning standards-scorm specification
04 distance learning standards-scorm specification宥均 林
 
Fairport domain specific metadata using w3 c dcat & skos w ontology views
Fairport domain specific metadata using w3 c dcat & skos w ontology viewsFairport domain specific metadata using w3 c dcat & skos w ontology views
Fairport domain specific metadata using w3 c dcat & skos w ontology viewsTim Clark
 
unit2 object oriented Methodologies ppt.pptx
unit2 object oriented  Methodologies ppt.pptxunit2 object oriented  Methodologies ppt.pptx
unit2 object oriented Methodologies ppt.pptxSARANYAM124686
 
A DATA EXTRACTION ALGORITHM FROM OPEN SOURCE SOFTWARE PROJECT REPOSITORIES FO...
A DATA EXTRACTION ALGORITHM FROM OPEN SOURCE SOFTWARE PROJECT REPOSITORIES FO...A DATA EXTRACTION ALGORITHM FROM OPEN SOURCE SOFTWARE PROJECT REPOSITORIES FO...
A DATA EXTRACTION ALGORITHM FROM OPEN SOURCE SOFTWARE PROJECT REPOSITORIES FO...ijseajournal
 
A DATA EXTRACTION ALGORITHM FROM OPEN SOURCE SOFTWARE PROJECT REPOSITORIES FO...
A DATA EXTRACTION ALGORITHM FROM OPEN SOURCE SOFTWARE PROJECT REPOSITORIES FO...A DATA EXTRACTION ALGORITHM FROM OPEN SOURCE SOFTWARE PROJECT REPOSITORIES FO...
A DATA EXTRACTION ALGORITHM FROM OPEN SOURCE SOFTWARE PROJECT REPOSITORIES FO...ijseajournal
 
empirical-SLR.pptx
empirical-SLR.pptxempirical-SLR.pptx
empirical-SLR.pptxJitha Kannan
 
TOWARDS CROSS-BROWSER INCOMPATIBILITIES DETECTION: ASYSTEMATIC LITERATURE REVIEW
TOWARDS CROSS-BROWSER INCOMPATIBILITIES DETECTION: ASYSTEMATIC LITERATURE REVIEWTOWARDS CROSS-BROWSER INCOMPATIBILITIES DETECTION: ASYSTEMATIC LITERATURE REVIEW
TOWARDS CROSS-BROWSER INCOMPATIBILITIES DETECTION: ASYSTEMATIC LITERATURE REVIEWijseajournal
 

Similar to Mining Features from the Object-Oriented Source Code of a Collection of Software Variants Using Formal Concept Analysis and Latent Semantic Indexing (20)

Object Oriented Design
Object Oriented DesignObject Oriented Design
Object Oriented Design
 
Unit 5
Unit 5Unit 5
Unit 5
 
Mining Features from the Object-Oriented Source Code of Software Variants by ...
Mining Features from the Object-Oriented Source Code of Software Variants by ...Mining Features from the Object-Oriented Source Code of Software Variants by ...
Mining Features from the Object-Oriented Source Code of Software Variants by ...
 
UNIT V TESTING.pptx
UNIT V TESTING.pptxUNIT V TESTING.pptx
UNIT V TESTING.pptx
 
Concept lattices: a representation space to structure software variability
Concept lattices: a representation space to structure software variabilityConcept lattices: a representation space to structure software variability
Concept lattices: a representation space to structure software variability
 
CS8592 Object Oriented Analysis & Design - UNIT V
CS8592 Object Oriented Analysis & Design - UNIT V CS8592 Object Oriented Analysis & Design - UNIT V
CS8592 Object Oriented Analysis & Design - UNIT V
 
How long can you afford to Stop The World?
How long can you afford to Stop The World?How long can you afford to Stop The World?
How long can you afford to Stop The World?
 
Some SBML-related resources at SBML.org
Some SBML-related resources at SBML.orgSome SBML-related resources at SBML.org
Some SBML-related resources at SBML.org
 
Extending the Product Selection Language of the Abstract Behavioral Specifica...
Extending the Product Selection Language of the Abstract Behavioral Specifica...Extending the Product Selection Language of the Abstract Behavioral Specifica...
Extending the Product Selection Language of the Abstract Behavioral Specifica...
 
Object oriented framework
Object oriented frameworkObject oriented framework
Object oriented framework
 
CS8592-OOAD Lecture Notes Unit-5
CS8592-OOAD Lecture Notes Unit-5 CS8592-OOAD Lecture Notes Unit-5
CS8592-OOAD Lecture Notes Unit-5
 
Object oriented analysis
Object oriented analysisObject oriented analysis
Object oriented analysis
 
04 distance learning standards-scorm specification
04 distance learning standards-scorm specification04 distance learning standards-scorm specification
04 distance learning standards-scorm specification
 
Fairport domain specific metadata using w3 c dcat & skos w ontology views
Fairport domain specific metadata using w3 c dcat & skos w ontology viewsFairport domain specific metadata using w3 c dcat & skos w ontology views
Fairport domain specific metadata using w3 c dcat & skos w ontology views
 
unit2 object oriented Methodologies ppt.pptx
unit2 object oriented  Methodologies ppt.pptxunit2 object oriented  Methodologies ppt.pptx
unit2 object oriented Methodologies ppt.pptx
 
A DATA EXTRACTION ALGORITHM FROM OPEN SOURCE SOFTWARE PROJECT REPOSITORIES FO...
A DATA EXTRACTION ALGORITHM FROM OPEN SOURCE SOFTWARE PROJECT REPOSITORIES FO...A DATA EXTRACTION ALGORITHM FROM OPEN SOURCE SOFTWARE PROJECT REPOSITORIES FO...
A DATA EXTRACTION ALGORITHM FROM OPEN SOURCE SOFTWARE PROJECT REPOSITORIES FO...
 
A DATA EXTRACTION ALGORITHM FROM OPEN SOURCE SOFTWARE PROJECT REPOSITORIES FO...
A DATA EXTRACTION ALGORITHM FROM OPEN SOURCE SOFTWARE PROJECT REPOSITORIES FO...A DATA EXTRACTION ALGORITHM FROM OPEN SOURCE SOFTWARE PROJECT REPOSITORIES FO...
A DATA EXTRACTION ALGORITHM FROM OPEN SOURCE SOFTWARE PROJECT REPOSITORIES FO...
 
empirical-SLR.pptx
empirical-SLR.pptxempirical-SLR.pptx
empirical-SLR.pptx
 
TOWARDS CROSS-BROWSER INCOMPATIBILITIES DETECTION: ASYSTEMATIC LITERATURE REVIEW
TOWARDS CROSS-BROWSER INCOMPATIBILITIES DETECTION: ASYSTEMATIC LITERATURE REVIEWTOWARDS CROSS-BROWSER INCOMPATIBILITIES DETECTION: ASYSTEMATIC LITERATURE REVIEW
TOWARDS CROSS-BROWSER INCOMPATIBILITIES DETECTION: ASYSTEMATIC LITERATURE REVIEW
 
Sub1583
Sub1583Sub1583
Sub1583
 

More from Ra'Fat Al-Msie'deen

SoftCloud: A Tool for Visualizing Software Artifacts as Tag Clouds.pdf
SoftCloud: A Tool for Visualizing Software Artifacts as Tag Clouds.pdfSoftCloud: A Tool for Visualizing Software Artifacts as Tag Clouds.pdf
SoftCloud: A Tool for Visualizing Software Artifacts as Tag Clouds.pdfRa'Fat Al-Msie'deen
 
BushraDBR: An Automatic Approach to Retrieving Duplicate Bug Reports.pdf
BushraDBR: An Automatic Approach to Retrieving Duplicate Bug Reports.pdfBushraDBR: An Automatic Approach to Retrieving Duplicate Bug Reports.pdf
BushraDBR: An Automatic Approach to Retrieving Duplicate Bug Reports.pdfRa'Fat Al-Msie'deen
 
Software evolution understanding: Automatic extraction of software identifier...
Software evolution understanding: Automatic extraction of software identifier...Software evolution understanding: Automatic extraction of software identifier...
Software evolution understanding: Automatic extraction of software identifier...Ra'Fat Al-Msie'deen
 
BushraDBR: An Automatic Approach to Retrieving Duplicate Bug Reports
BushraDBR: An Automatic Approach to Retrieving Duplicate Bug ReportsBushraDBR: An Automatic Approach to Retrieving Duplicate Bug Reports
BushraDBR: An Automatic Approach to Retrieving Duplicate Bug ReportsRa'Fat Al-Msie'deen
 
FeatureClouds: Naming the Identified Feature Implementation Blocks from Softw...
FeatureClouds: Naming the Identified Feature Implementation Blocks from Softw...FeatureClouds: Naming the Identified Feature Implementation Blocks from Softw...
FeatureClouds: Naming the Identified Feature Implementation Blocks from Softw...Ra'Fat Al-Msie'deen
 
YamenTrace: Requirements Traceability - Recovering and Visualizing Traceabili...
YamenTrace: Requirements Traceability - Recovering and Visualizing Traceabili...YamenTrace: Requirements Traceability - Recovering and Visualizing Traceabili...
YamenTrace: Requirements Traceability - Recovering and Visualizing Traceabili...Ra'Fat Al-Msie'deen
 
Requirements Traceability: Recovering and Visualizing Traceability Links Betw...
Requirements Traceability: Recovering and Visualizing Traceability Links Betw...Requirements Traceability: Recovering and Visualizing Traceability Links Betw...
Requirements Traceability: Recovering and Visualizing Traceability Links Betw...Ra'Fat Al-Msie'deen
 
Automatic Labeling of the Object-oriented Source Code: The Lotus Approach
Automatic Labeling of the Object-oriented Source Code: The Lotus ApproachAutomatic Labeling of the Object-oriented Source Code: The Lotus Approach
Automatic Labeling of the Object-oriented Source Code: The Lotus ApproachRa'Fat Al-Msie'deen
 
Constructing a software requirements specification and design for electronic ...
Constructing a software requirements specification and design for electronic ...Constructing a software requirements specification and design for electronic ...
Constructing a software requirements specification and design for electronic ...Ra'Fat Al-Msie'deen
 
Detecting commonality and variability in use-case diagram variants
Detecting commonality and variability in use-case diagram variantsDetecting commonality and variability in use-case diagram variants
Detecting commonality and variability in use-case diagram variantsRa'Fat Al-Msie'deen
 
Naming the Identified Feature Implementation Blocks from Software Source Code
Naming the Identified Feature Implementation Blocks from Software Source CodeNaming the Identified Feature Implementation Blocks from Software Source Code
Naming the Identified Feature Implementation Blocks from Software Source CodeRa'Fat Al-Msie'deen
 
Application architectures - Software Architecture and Design
Application architectures - Software Architecture and DesignApplication architectures - Software Architecture and Design
Application architectures - Software Architecture and DesignRa'Fat Al-Msie'deen
 
Planning and writing your documents - Software documentation
Planning and writing your documents - Software documentationPlanning and writing your documents - Software documentation
Planning and writing your documents - Software documentationRa'Fat Al-Msie'deen
 
Requirements management planning & Requirements change management
Requirements management planning & Requirements change managementRequirements management planning & Requirements change management
Requirements management planning & Requirements change managementRa'Fat Al-Msie'deen
 
Requirements change - requirements engineering
Requirements change - requirements engineeringRequirements change - requirements engineering
Requirements change - requirements engineeringRa'Fat Al-Msie'deen
 
Requirements validation - requirements engineering
Requirements validation - requirements engineeringRequirements validation - requirements engineering
Requirements validation - requirements engineeringRa'Fat Al-Msie'deen
 
Software Documentation - writing to support - references
Software Documentation - writing to support - referencesSoftware Documentation - writing to support - references
Software Documentation - writing to support - referencesRa'Fat Al-Msie'deen
 

More from Ra'Fat Al-Msie'deen (20)

SoftCloud: A Tool for Visualizing Software Artifacts as Tag Clouds.pdf
SoftCloud: A Tool for Visualizing Software Artifacts as Tag Clouds.pdfSoftCloud: A Tool for Visualizing Software Artifacts as Tag Clouds.pdf
SoftCloud: A Tool for Visualizing Software Artifacts as Tag Clouds.pdf
 
BushraDBR: An Automatic Approach to Retrieving Duplicate Bug Reports.pdf
BushraDBR: An Automatic Approach to Retrieving Duplicate Bug Reports.pdfBushraDBR: An Automatic Approach to Retrieving Duplicate Bug Reports.pdf
BushraDBR: An Automatic Approach to Retrieving Duplicate Bug Reports.pdf
 
Software evolution understanding: Automatic extraction of software identifier...
Software evolution understanding: Automatic extraction of software identifier...Software evolution understanding: Automatic extraction of software identifier...
Software evolution understanding: Automatic extraction of software identifier...
 
BushraDBR: An Automatic Approach to Retrieving Duplicate Bug Reports
BushraDBR: An Automatic Approach to Retrieving Duplicate Bug ReportsBushraDBR: An Automatic Approach to Retrieving Duplicate Bug Reports
BushraDBR: An Automatic Approach to Retrieving Duplicate Bug Reports
 
Source Code Summarization
Source Code SummarizationSource Code Summarization
Source Code Summarization
 
FeatureClouds: Naming the Identified Feature Implementation Blocks from Softw...
FeatureClouds: Naming the Identified Feature Implementation Blocks from Softw...FeatureClouds: Naming the Identified Feature Implementation Blocks from Softw...
FeatureClouds: Naming the Identified Feature Implementation Blocks from Softw...
 
YamenTrace: Requirements Traceability - Recovering and Visualizing Traceabili...
YamenTrace: Requirements Traceability - Recovering and Visualizing Traceabili...YamenTrace: Requirements Traceability - Recovering and Visualizing Traceabili...
YamenTrace: Requirements Traceability - Recovering and Visualizing Traceabili...
 
Requirements Traceability: Recovering and Visualizing Traceability Links Betw...
Requirements Traceability: Recovering and Visualizing Traceability Links Betw...Requirements Traceability: Recovering and Visualizing Traceability Links Betw...
Requirements Traceability: Recovering and Visualizing Traceability Links Betw...
 
Automatic Labeling of the Object-oriented Source Code: The Lotus Approach
Automatic Labeling of the Object-oriented Source Code: The Lotus ApproachAutomatic Labeling of the Object-oriented Source Code: The Lotus Approach
Automatic Labeling of the Object-oriented Source Code: The Lotus Approach
 
Constructing a software requirements specification and design for electronic ...
Constructing a software requirements specification and design for electronic ...Constructing a software requirements specification and design for electronic ...
Constructing a software requirements specification and design for electronic ...
 
Detecting commonality and variability in use-case diagram variants
Detecting commonality and variability in use-case diagram variantsDetecting commonality and variability in use-case diagram variants
Detecting commonality and variability in use-case diagram variants
 
Naming the Identified Feature Implementation Blocks from Software Source Code
Naming the Identified Feature Implementation Blocks from Software Source CodeNaming the Identified Feature Implementation Blocks from Software Source Code
Naming the Identified Feature Implementation Blocks from Software Source Code
 
Application architectures - Software Architecture and Design
Application architectures - Software Architecture and DesignApplication architectures - Software Architecture and Design
Application architectures - Software Architecture and Design
 
Planning and writing your documents - Software documentation
Planning and writing your documents - Software documentationPlanning and writing your documents - Software documentation
Planning and writing your documents - Software documentation
 
Requirements management planning & Requirements change management
Requirements management planning & Requirements change managementRequirements management planning & Requirements change management
Requirements management planning & Requirements change management
 
Requirements change - requirements engineering
Requirements change - requirements engineeringRequirements change - requirements engineering
Requirements change - requirements engineering
 
Requirements validation - requirements engineering
Requirements validation - requirements engineeringRequirements validation - requirements engineering
Requirements validation - requirements engineering
 
Software Documentation - writing to support - references
Software Documentation - writing to support - referencesSoftware Documentation - writing to support - references
Software Documentation - writing to support - references
 
Algorithms - "heap sort"
Algorithms - "heap sort"Algorithms - "heap sort"
Algorithms - "heap sort"
 
Algorithms - "quicksort"
Algorithms - "quicksort"Algorithms - "quicksort"
Algorithms - "quicksort"
 

Recently uploaded

AI/ML Infra Meetup | Improve Speed and GPU Utilization for Model Training & S...
AI/ML Infra Meetup | Improve Speed and GPU Utilization for Model Training & S...AI/ML Infra Meetup | Improve Speed and GPU Utilization for Model Training & S...
AI/ML Infra Meetup | Improve Speed and GPU Utilization for Model Training & S...Alluxio, Inc.
 
Globus Connect Server Deep Dive - GlobusWorld 2024
Globus Connect Server Deep Dive - GlobusWorld 2024Globus Connect Server Deep Dive - GlobusWorld 2024
Globus Connect Server Deep Dive - GlobusWorld 2024Globus
 
Globus Compute wth IRI Workflows - GlobusWorld 2024
Globus Compute wth IRI Workflows - GlobusWorld 2024Globus Compute wth IRI Workflows - GlobusWorld 2024
Globus Compute wth IRI Workflows - GlobusWorld 2024Globus
 
Innovating Inference - Remote Triggering of Large Language Models on HPC Clus...
Innovating Inference - Remote Triggering of Large Language Models on HPC Clus...Innovating Inference - Remote Triggering of Large Language Models on HPC Clus...
Innovating Inference - Remote Triggering of Large Language Models on HPC Clus...Globus
 
BoxLang: Review our Visionary Licenses of 2024
BoxLang: Review our Visionary Licenses of 2024BoxLang: Review our Visionary Licenses of 2024
BoxLang: Review our Visionary Licenses of 2024Ortus Solutions, Corp
 
Advanced Flow Concepts Every Developer Should Know
Advanced Flow Concepts Every Developer Should KnowAdvanced Flow Concepts Every Developer Should Know
Advanced Flow Concepts Every Developer Should KnowPeter Caitens
 
Designing for Privacy in Amazon Web Services
Designing for Privacy in Amazon Web ServicesDesigning for Privacy in Amazon Web Services
Designing for Privacy in Amazon Web ServicesKrzysztofKkol1
 
AI/ML Infra Meetup | Perspective on Deep Learning Framework
AI/ML Infra Meetup | Perspective on Deep Learning FrameworkAI/ML Infra Meetup | Perspective on Deep Learning Framework
AI/ML Infra Meetup | Perspective on Deep Learning FrameworkAlluxio, Inc.
 
Accelerate Enterprise Software Engineering with Platformless
Accelerate Enterprise Software Engineering with PlatformlessAccelerate Enterprise Software Engineering with Platformless
Accelerate Enterprise Software Engineering with PlatformlessWSO2
 
How Does XfilesPro Ensure Security While Sharing Documents in Salesforce?
How Does XfilesPro Ensure Security While Sharing Documents in Salesforce?How Does XfilesPro Ensure Security While Sharing Documents in Salesforce?
How Does XfilesPro Ensure Security While Sharing Documents in Salesforce?XfilesPro
 
AI/ML Infra Meetup | Reducing Prefill for LLM Serving in RAG
AI/ML Infra Meetup | Reducing Prefill for LLM Serving in RAGAI/ML Infra Meetup | Reducing Prefill for LLM Serving in RAG
AI/ML Infra Meetup | Reducing Prefill for LLM Serving in RAGAlluxio, Inc.
 
OpenFOAM solver for Helmholtz equation, helmholtzFoam / helmholtzBubbleFoam
OpenFOAM solver for Helmholtz equation, helmholtzFoam / helmholtzBubbleFoamOpenFOAM solver for Helmholtz equation, helmholtzFoam / helmholtzBubbleFoam
OpenFOAM solver for Helmholtz equation, helmholtzFoam / helmholtzBubbleFoamtakuyayamamoto1800
 
Using IESVE for Room Loads Analysis - Australia & New Zealand
Using IESVE for Room Loads Analysis - Australia & New ZealandUsing IESVE for Room Loads Analysis - Australia & New Zealand
Using IESVE for Room Loads Analysis - Australia & New ZealandIES VE
 
top nidhi software solution freedownload
top nidhi software solution freedownloadtop nidhi software solution freedownload
top nidhi software solution freedownloadvrstrong314
 
In 2015, I used to write extensions for Joomla, WordPress, phpBB3, etc and I ...
In 2015, I used to write extensions for Joomla, WordPress, phpBB3, etc and I ...In 2015, I used to write extensions for Joomla, WordPress, phpBB3, etc and I ...
In 2015, I used to write extensions for Joomla, WordPress, phpBB3, etc and I ...Juraj Vysvader
 
Vitthal Shirke Microservices Resume Montevideo
Vitthal Shirke Microservices Resume MontevideoVitthal Shirke Microservices Resume Montevideo
Vitthal Shirke Microservices Resume MontevideoVitthal Shirke
 
2024 RoOUG Security model for the cloud.pptx
2024 RoOUG Security model for the cloud.pptx2024 RoOUG Security model for the cloud.pptx
2024 RoOUG Security model for the cloud.pptxGeorgi Kodinov
 
First Steps with Globus Compute Multi-User Endpoints
First Steps with Globus Compute Multi-User EndpointsFirst Steps with Globus Compute Multi-User Endpoints
First Steps with Globus Compute Multi-User EndpointsGlobus
 
Exploring Innovations in Data Repository Solutions - Insights from the U.S. G...
Exploring Innovations in Data Repository Solutions - Insights from the U.S. G...Exploring Innovations in Data Repository Solutions - Insights from the U.S. G...
Exploring Innovations in Data Repository Solutions - Insights from the U.S. G...Globus
 
Abortion ^Clinic ^%[+971588192166''] Abortion Pill Al Ain (?@?) Abortion Pill...
Abortion ^Clinic ^%[+971588192166''] Abortion Pill Al Ain (?@?) Abortion Pill...Abortion ^Clinic ^%[+971588192166''] Abortion Pill Al Ain (?@?) Abortion Pill...
Abortion ^Clinic ^%[+971588192166''] Abortion Pill Al Ain (?@?) Abortion Pill...Abortion Clinic
 

Recently uploaded (20)

AI/ML Infra Meetup | Improve Speed and GPU Utilization for Model Training & S...
AI/ML Infra Meetup | Improve Speed and GPU Utilization for Model Training & S...AI/ML Infra Meetup | Improve Speed and GPU Utilization for Model Training & S...
AI/ML Infra Meetup | Improve Speed and GPU Utilization for Model Training & S...
 
Globus Connect Server Deep Dive - GlobusWorld 2024
Globus Connect Server Deep Dive - GlobusWorld 2024Globus Connect Server Deep Dive - GlobusWorld 2024
Globus Connect Server Deep Dive - GlobusWorld 2024
 
Globus Compute wth IRI Workflows - GlobusWorld 2024
Globus Compute wth IRI Workflows - GlobusWorld 2024Globus Compute wth IRI Workflows - GlobusWorld 2024
Globus Compute wth IRI Workflows - GlobusWorld 2024
 
Innovating Inference - Remote Triggering of Large Language Models on HPC Clus...
Innovating Inference - Remote Triggering of Large Language Models on HPC Clus...Innovating Inference - Remote Triggering of Large Language Models on HPC Clus...
Innovating Inference - Remote Triggering of Large Language Models on HPC Clus...
 
BoxLang: Review our Visionary Licenses of 2024
BoxLang: Review our Visionary Licenses of 2024BoxLang: Review our Visionary Licenses of 2024
BoxLang: Review our Visionary Licenses of 2024
 
Advanced Flow Concepts Every Developer Should Know
Advanced Flow Concepts Every Developer Should KnowAdvanced Flow Concepts Every Developer Should Know
Advanced Flow Concepts Every Developer Should Know
 
Designing for Privacy in Amazon Web Services
Designing for Privacy in Amazon Web ServicesDesigning for Privacy in Amazon Web Services
Designing for Privacy in Amazon Web Services
 
AI/ML Infra Meetup | Perspective on Deep Learning Framework
AI/ML Infra Meetup | Perspective on Deep Learning FrameworkAI/ML Infra Meetup | Perspective on Deep Learning Framework
AI/ML Infra Meetup | Perspective on Deep Learning Framework
 
Accelerate Enterprise Software Engineering with Platformless
Accelerate Enterprise Software Engineering with PlatformlessAccelerate Enterprise Software Engineering with Platformless
Accelerate Enterprise Software Engineering with Platformless
 
How Does XfilesPro Ensure Security While Sharing Documents in Salesforce?
How Does XfilesPro Ensure Security While Sharing Documents in Salesforce?How Does XfilesPro Ensure Security While Sharing Documents in Salesforce?
How Does XfilesPro Ensure Security While Sharing Documents in Salesforce?
 
AI/ML Infra Meetup | Reducing Prefill for LLM Serving in RAG
AI/ML Infra Meetup | Reducing Prefill for LLM Serving in RAGAI/ML Infra Meetup | Reducing Prefill for LLM Serving in RAG
AI/ML Infra Meetup | Reducing Prefill for LLM Serving in RAG
 
OpenFOAM solver for Helmholtz equation, helmholtzFoam / helmholtzBubbleFoam
OpenFOAM solver for Helmholtz equation, helmholtzFoam / helmholtzBubbleFoamOpenFOAM solver for Helmholtz equation, helmholtzFoam / helmholtzBubbleFoam
OpenFOAM solver for Helmholtz equation, helmholtzFoam / helmholtzBubbleFoam
 
Using IESVE for Room Loads Analysis - Australia & New Zealand
Using IESVE for Room Loads Analysis - Australia & New ZealandUsing IESVE for Room Loads Analysis - Australia & New Zealand
Using IESVE for Room Loads Analysis - Australia & New Zealand
 
top nidhi software solution freedownload
top nidhi software solution freedownloadtop nidhi software solution freedownload
top nidhi software solution freedownload
 
In 2015, I used to write extensions for Joomla, WordPress, phpBB3, etc and I ...
In 2015, I used to write extensions for Joomla, WordPress, phpBB3, etc and I ...In 2015, I used to write extensions for Joomla, WordPress, phpBB3, etc and I ...
In 2015, I used to write extensions for Joomla, WordPress, phpBB3, etc and I ...
 
Vitthal Shirke Microservices Resume Montevideo
Vitthal Shirke Microservices Resume MontevideoVitthal Shirke Microservices Resume Montevideo
Vitthal Shirke Microservices Resume Montevideo
 
2024 RoOUG Security model for the cloud.pptx
2024 RoOUG Security model for the cloud.pptx2024 RoOUG Security model for the cloud.pptx
2024 RoOUG Security model for the cloud.pptx
 
First Steps with Globus Compute Multi-User Endpoints
First Steps with Globus Compute Multi-User EndpointsFirst Steps with Globus Compute Multi-User Endpoints
First Steps with Globus Compute Multi-User Endpoints
 
Exploring Innovations in Data Repository Solutions - Insights from the U.S. G...
Exploring Innovations in Data Repository Solutions - Insights from the U.S. G...Exploring Innovations in Data Repository Solutions - Insights from the U.S. G...
Exploring Innovations in Data Repository Solutions - Insights from the U.S. G...
 
Abortion ^Clinic ^%[+971588192166''] Abortion Pill Al Ain (?@?) Abortion Pill...
Abortion ^Clinic ^%[+971588192166''] Abortion Pill Al Ain (?@?) Abortion Pill...Abortion ^Clinic ^%[+971588192166''] Abortion Pill Al Ain (?@?) Abortion Pill...
Abortion ^Clinic ^%[+971588192166''] Abortion Pill Al Ain (?@?) Abortion Pill...
 

Mining Features from the Object-Oriented Source Code of a Collection of Software Variants Using Formal Concept Analysis and Latent Semantic Indexing

  • 1. Mining Features from the Object-Oriented Source Code of a Collection of Software Variants Using Formal Concept Analysis and Latent Semantic Indexing SEKE 2013, Boston, 29 june 1
  • 2. Outline • The context and the issue • Our goal and the main hypotheses • Our approach : The main ideas • The process : step by step • Experimentation and results • Perspectives SEKE 2013, Boston, 29 june 2
  • 3. The context (1/4) Variant AVariant A Variant CVariant C Variant BVariant B Variant EVariant E variantvariant Variant 1Variant 1 Variant2Variant2 VariantGVariantG VariantF6VariantF6 Variant3Variant3 Variant AB Variant AB VariantC F3 VariantC F3SEKE 2013, Boston, 29 june 3
  • 4. The context (2/4) • Software product Line – SPL supports efficient development of related software products – Manages common and optional features • A Feature is a system property relevant to some stakeholder used to capture commonalities or variations among systems in a family – Promotes systematic software reuse from SPL’s core assets (such as features, code, documentation and etc.) SEKE 2013, Boston, 29 june 4
  • 5. The context (3/4) Image from : http://poltman.com/pm-en/img/TechnicalInformation/SoftwareModernization/ProductLines/ProductLines-01.jpg  Software Product Line  Domain Engineering : development for reuse  Application Engineering : development by reuse SEKE 2013, Boston, 29 june 5
  • 6. The context (4/4) • Software Product Line – Feature model (FM) • Is a tree-like graph of features and relationships among them • Used to represent commonality and variability of SPL members at different levels of abstraction SEKE 2013, Boston, 29 june 6
  • 7. Issue • Software variants – Difficulties for : • Reuse • Maintenance • Comprehension • Impact analysis • Software Product Line – Design from scratch is a hard task (domain engineering) AA CC BB 33 ABAB SEKE 2013, Boston, 29 june 7
  • 8. Our Goal (1/2) • Reengineering existing software variants into a software product line – Benefits • Software variants will be managed as a product line • Software product line will be engineered started from existing products (not from scratch) – Strategy • Feature model mining (reverse engineering step) – Mining features – Mining feature model structure (group of features) – Mining feature constraints – Mining feature relationships • Source code Framework generation (reengineering step) AA CC BB 33 ABAB 33 BB CC AA SEKE 2013, Boston, 29 june 8
  • 9. Our Goal (2/2) AA CC BB 33 ABAB 33 BB CC AA From existing software variants To new product generation SEKE 2013, Boston, 29 june 9 To domain features From domain features
  • 10. Our main hypotheses 1. Mining feature From object oriented source code 1. Focus on functional features – Functional features express the behavior or the way users may interact with a product 1. Focus on feature implemented at the programming level – The elements of the source code reflect these features – Feature are implemented as package, class, attribute, method, local variable, attribute access, method invocation, etc. 1. A Feature has the same implementation in all product variants where it is present (we not consider evolution) SEKE 2013, Boston, 29 june 10
  • 11. The main ideas(1/4) 1. The initial search space is composed of all OO elements of software variants 1. Characterizing OO elements implementing features to cluster theme together –Are similar elements : lexical similarity –Are dependent elements : structural dependencies SEKE 2013, Boston, 29 june 11 Identifying clusters composed of the most similar OO elements based on LSI
  • 12. The main ideas (2/4) 3. Optional features are implemented as variation in source code Mining variations (= search space) Package Variation Package Set Variation Package Content VariationClass Variation Class Content Variation Class Signature Variation Attributes Set Variation Methods Set Variation Method Variation Signature Body Attribute Variation ( Access Level, Data Type) 1 : 2 : 3 : 4 : (Name) (Name) Relation ship Public , Private, ... Access Level Access Level Returned Data Type Parameters List order & data type Local variables Method Invocation Attribute Access Inheritance (Superclass), Interface SEKE 2013, Boston, 29 june 12
  • 13. The main ideas(3/4) 4. Reducing the search space by Isolating groups of variations corresponding to some related features From one large search space to many sub search spaces SEKE 2013, Boston, 29 june 13 Identifying all groups of OO elements representing differences or intersections between variants based on FCA
  • 14. The main ideas (4/4) Common Block (CB) Common Atomic Block (CAB) Block of Variation (BV) Atomic Block of Variation (AB) Software_1 Software_2Software_3 1 2 3 1 SEKE 2013, Boston, 29 june 14
  • 15. Used techniques : FCA and LSI (1/3) • Formal Concept Analysis (FCA) – Is a technique for data analysis and knowledge representation based on lattice theory – It identifies meaningful groups of objects that share common attributes – It provides a theoretical model to analyze hierarchies of these groups – In order to apply FCA based on the definition of a formal context or incidence table of objects and their attributes SEKE 2013, Boston, 29 june 15
  • 16. Used techniques : FCA and LSI (2/3) SEKE 2013, Boston, 29 june 16 jungle water forest fish plant mamal lion x x carp x x dolphin x x bear x x zebra x x pine x x Extracted from : http://code.google.com/p/erca/wiki/FcaIntroduction
  • 17. Used techniques : FCA and LSI (3/3) • Latent Semantic Indexing (LSI) – Compute textual similarity among different documents • Based on the occurrences of terms in documents – If two documents share a large number of terms, those documents are considered to be similar – Three steps • A corpus of documents is built after pre-processing such as stop word removal and stemming performing • A term-by-document matrix is built, where each column represents a document and each row is a term. The values in the matrix indicate the frequency of the term occurring in the document • The similarity among documents is calculated using cosine similarity SEKE 2013, Boston, 29 june 17
  • 18. The mining process : step by step P1 P2 Software Product Variants Mandatory Feature Pn Static Analysis Lexical Similarity Computation Clustering Implementation Space Feature Space )1 Commonalities and Variabilities Computation Clustering )2 Optional Feature Clustering OBEs Similarity Matrix Lexical Similarity Computation Similarity Matrix Common Block Common OBEs (Block of Variation)n Variable OBEs Common Atomic Block Atomic Block of Variation Features FCA LSI LSI FCA FCA SEKE 2013, Boston, 29 june 18
  • 19. Identifying the Common Block and Blocks of Variation (1/3) • Two steps 1. A formal context, where objects are product variants and attributes are OBEs is defined 1. Calculate corresponding AOC-poset • The intent of each concept represents OBEs common to two or more products – The intent of the most general (i.e., top) concept gathers OBEs that are common to all products. They constitute the CB – The intents of all remaining concepts are BVs » They gather sets of OBEs common to a subset of products and correspond to the implementation of one or more features • The extent of each of these concepts is the set of products having these OBEs in commonSEKE 2013, Boston, 29 june 19
  • 20. Identifying the Common Block and Blocks of Variation(2/3) • Example SEKE 2013, Boston, 29 june 20 The formel context
  • 21. Identifying the Common Block and Blocks of Variation(3/3) • Example – The AOC-poset Common Block Block of Variation Block of Variation SEKE 2013, Boston, 29 june 21
  • 22. Identifying Atomic Blocks (1/5) • Three steps – Exploring the BV’s AOC-poset to Identify Atomic Blocks of Variation – Measuring OBEs’ Similarity Based on LSI – Identifying Atomic Blocks Using FCA • Exploring the BV’s AOC-poset to Identify Atomic Blocks of Variation – Exploring the AOC-poset from the smallest (bottom) to the highest (top) block – If a group of OBEs is identified as an ABV, this group is considered as such when exploring the following BV – For Common Atomic Blocks (CAB), there is no such need to explore the AOC-poset as there is a unique CB. SEKE 2013, Boston, 29 june 22
  • 23. Identifying Atomic Blocks (2/5) • Measuring OBEs’ Similarity Based on LSI – Building the LSI corpus – Building the term- document matrix and the term- query matrix for each BV and for the CB – Building the cosine similarity matrix SEKE 2013, Boston, 29 june 23
  • 24. Identifying Atomic Blocks (3/5) • Example of cosine similarity matrix SEKE 2013, Boston, 29 june 24
  • 25. Identifying Atomic Blocks (4/5) • Identifying Atomic Blocks Using FCA – Transforming the (numerical) similarity matrices of previous step into (binary) formal contexts • Only pairs of OBEs having a calculated similarity greater than or equal to 0.70 are considered similar – Example SEKE 2013, Boston, 29 june 25
  • 26. Identifying Atomic Blocks (5/5) • Identifying Atomic Blocks Using FCA SEKE 2013, Boston, 29 june 26
  • 27. Experimentation and results (1/5) • Case studies : Two Java open-source software: Mobile Media and ArgoUML SEKE 2013, Boston, 29 june 27
  • 28. Experimentation and results (2/5) SEKE 2013, Boston, 29 june 28
  • 29. Experimentation and results (3/5) • The effectiveness of IR methods is measured by their RECALL, PRECISION and F-MEASURE  Recall is the percentage of correctly retrieved links (OBEs) to the total number of relevant links (OBEs) .  Precision is the percentage of correctly retrieved links (OBEs) to the total number of retrieved links (OBEs) .  F-measure is a balanced measure that takes into account both precision and recall. SEKE 2013, Boston, 29 june 29
  • 30. Experimentation and results (4/5) • Precision – For optional features appears to be high • This means that all mined OBEs grouped as features are relevant • Mainly due to search space reduction. In most cases, each BV corresponds to one and only one feature – For common features, precision is also quite high • Thanks to our clustering technique that identifies ABVs based on FCA and LSI • Is smaller than the one obtained for optional features –This deterioration can be explained by the fact that we do not perform search space reduction for the CB SEKE 2013, Boston, 29 june 30
  • 31. Experimentation and results (5/5) • Recall – Its average value is 66% for Mobile Media and 67% for ArgoUML • This means most OBEs that compose features are mined • Non-mined OBEs used different vocabularies compared to the mined ones – This is a known limitation of LSI which is based on lexical similarity SEKE 2013, Boston, 29 june 31
  • 32. Perspectives • Enhance the quality of the mining – Combine both textual and structural similarity measures – Identify junctions between features – More reducing of the search space – Etc. • Feature model mining – Mining features – Mining feature model structure (group of features) – Mining features constraints – Mining feature relationships SEKE 2013, Boston, 29 june 32
  • 33. Mining Features from the Object-Oriented Source Code of a Collection of Software Variants Using Formal Concept Analysis and Latent Semantic Indexing SEKE 2013, Boston, 29 june 33
  • 34. Object-to-feature mapping model SEKE 2013, Boston, 29 june 34

Editor's Notes

  1. Manages common and optional features and promotes systematic software reuse from SPL’s core assets (such as features, code, documentation and etc.).
  2. Domain Engineering - or development for reuse. It is the development of core assets through domain analysis, design and implementation. Application Engineering : development by reuse. It is the development of final products, with the use of core assets and specific requirements expressed by customers.