A Conceptual Dependency Graph Based
Keyword Extraction Model for Source Code to
API Documentation Mapping
Prepared By
Nakul Sharma
Under Guidance of
Dr. Prasanth Yalla
Professor, Department of Computer Science and
Engineering.
Koneru Lakshmaiah Education Foundation.
Vijayawada, Andhra Pradesh
India
Table of Contents
• Introduction
• Background
• Mathematical Foundations
• Genesis of Research
• Proposed Methodology
• Results and Discussion
• Future Scope and Conclusion
• References
Introduction
Traditional key feature extraction techniques
• use terms or sentences from the project source code to form a unique code structure.
Almost all traditional document key-phrase extraction techniques
• represent a document collection as a phrase or sentence matrix, in which each row denotes a phrase or sentence ID and the corresponding column holds its frequency.
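The matrix representation described above can be illustrated with a minimal Python sketch. It assumes single words stand in for phrases and that splitting on whitespace is an adequate tokenizer; the function name `phrase_frequency_matrix` and the sample documents are hypothetical and not taken from the presented work.

```python
from collections import Counter


def phrase_frequency_matrix(documents):
    """Build a term-by-document frequency matrix.

    Assumption: each whitespace-separated, lower-cased word is one "phrase".
    Returns (vocabulary, matrix), where matrix[r][c] is the number of times
    vocabulary[r] occurs in documents[c].
    """
    counts = [Counter(doc.lower().split()) for doc in documents]
    vocabulary = sorted(set().union(*counts))
    matrix = [[doc_counts[term] for doc_counts in counts] for term in vocabulary]
    return vocabulary, matrix


# Hypothetical two-document collection.
docs = ["parse source code", "source code metrics source"]
vocab, matrix = phrase_frequency_matrix(docs)
for term, row in zip(vocab, matrix):
    print(term, row)
```

Each row records only how often a term occurs in each document, which is why such a representation carries no contextual information.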
Introduction (Continued)
The main problem with existing systems is that they ignore context-based textual information.
Contextual information is especially relevant when undertaking a software change that affects not just the current phase of the project but also the previous and subsequent phases.
Source code analysis also aids in checking the effect of a change on the code.
In the proposed model, a weighted dependency graph model is used to filter the candidate sets among the vertices for contextual similarity computation.
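As a rough illustration of filtering candidate vertex pairs on a weighted graph before computing a similarity, consider the following Python sketch. The slides do not specify the edge weights, the filtering rule, or the similarity formula, so everything here is an assumption: a simple weight threshold is used as the filter, and Jaccard overlap of the terms attached to each vertex stands in for the contextual similarity. All names and values are hypothetical.

```python
def filter_candidates(weighted_edges, min_weight):
    """Keep only vertex pairs whose edge weight reaches the threshold."""
    return [pair for pair, weight in weighted_edges.items() if weight >= min_weight]


def jaccard_similarity(terms_a, terms_b):
    """Stand-in similarity: shared terms divided by all terms of both vertices."""
    union = terms_a | terms_b
    if not union:
        return 0.0
    return len(terms_a & terms_b) / len(union)


def contextual_similarity(weighted_edges, vertex_terms, min_weight):
    """Score only the candidate pairs that survive the weight filter."""
    scores = {}
    for u, v in filter_candidates(weighted_edges, min_weight):
        scores[(u, v)] = jaccard_similarity(vertex_terms[u], vertex_terms[v])
    return scores


# Hypothetical graph: vertices are classes, weights are dependency strengths.
edges = {("A", "B"): 0.8, ("A", "C"): 0.2}
terms = {"A": {"parse", "code"}, "B": {"code", "graph"}, "C": {"doc"}}
print(contextual_similarity(edges, terms, min_weight=0.5))
```

Filtering first means the similarity is computed only for strongly connected pairs, which keeps the number of comparisons small on large graphs.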
Background
• Source Code Analysis
• Text Mining
• Document Representation
• Clustering
• NLP/CL
Mathematical Framework
• Centrality Measures
• Document Clustering
• Document Metrics
• Source Code Metrics
Genesis of Research
Work Done in Text Mining and its related fields
Research conducted by various authors
Related Work
Sr. No. | Name of Authors | Work Done in Brief
1 | S. Mohammadi et al. | Presented a new approach to extract knowledge of the dependencies between artifacts in the source code.
2 | V. U. Gómez et al. | Proposed a semantic model for visually characterizing source code modifications.
3 | S. L. Abebe et al. | Introduced a new extraction scheme that is sufficiently effective to extract domain concepts from the source code.
4 | S. Bajracharya et al. | Developed a new SCA framework to collect and analyze open source code on a large scale.
5 | A. S. Yumaganov | Proposed to compare different search models for similarity with limitations on the source code.
Related Work
Sr. No. | Name of Authors | Work Done in Brief
1 | A. Dimitriou et al. | Introduced a new top-k-size keyword search on tree-structured data.
2 | W. Ding | Presented a review of knowledge-based techniques for the software documentation process.
3 | Hussain et al. | Proposed a new software design pattern classification and selection scheme.
4 | Ibrahim et al. | Presented a scientometric re-ranking technique.
5 | L. H. Lee et al. | Used Bayesian text classification to introduce a high-relevance keyword extraction process.
Related Work (Related to Software
Metrics)
Sr. No. | Name of Authors | Work Done in Brief
1 | A. Dimitriou et al. | Introduced a new top-k-size keyword search on tree-structured data.
2 | W. Ding | Presented a review of knowledge-based techniques for the software documentation process.
3 | Hussain et al. | Proposed a new software design pattern classification and selection scheme.
4 | Ibrahim et al. | Presented a scientometric re-ranking technique.
5 | L. H. Lee et al. | Used Bayesian text classification to introduce a high-relevance keyword extraction process.
Observations on Related Work
Large open source projects are not considered in the SCA systems and tools developed so far.
Existing systems also do not take contextual keyphrases into consideration when providing traceability links.
The current work proposes an alternative: contextual dependency graph based software metrics in the form of contextual similarity.
Proposed Methodology
Figure 1: Module-1 (block diagram). Blocks shown: Project source codes; Class parsing; Code dependency graph; Project API documentation; Text pre-processing; Filtered API documents; Proposed contextual dependency graph similarity.
Pre-processing of API Documents
Proposed Methodology (Continued)
Phase 1: Source Code and API documents Pre-processing
Step 1: Read project source codes S.
Step 2: Read project API documents D.
Step 3: For each code Ci in S[]
Do
    Parse source code Ci with methods M and fields F.
    Mi = ExtractMethods(Ci)
    Fi = ExtractFields(Ci)
    Map (Mi, Fi) to Ci:
        C1 → (M1, F1)
        C2 → (M2, F2)
        …     …
        Cn → (Mn, Fn)
Done
Step 4: // Remove the duplicate methods and fields in each class
For each code Ci
Do
    $M_i = \mathrm{Prob}(M_i \cap M_j \mid C), \quad i \neq j$
    $F_i = \mathrm{Prob}(F_i \cap F_j \mid C), \quad i \neq j$
If( Mi!=0 AND Fi!=0)
Then
Remove Mi in Ci or Cj
Remove Fi in Ci or Cj
End if
Done
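Phase 1 can be sketched in Python as follows. This is only an illustration under stated assumptions: the regular expressions are a simplified stand-in for the `ExtractMethods` and `ExtractFields` steps on Java-like source (a real class parser would be more robust), and duplicates are resolved by keeping the first class in which a name appears, which is one possible reading of "Remove Mi in Ci or Cj". All function names and sample classes are hypothetical.

```python
import re

# Simplified, hypothetical patterns for Java-like declarations that start
# with an access modifier. They do not handle "static", annotations, etc.
METHOD_RE = re.compile(r"\b(?:public|private|protected)\s+[\w<>\[\]]+\s+(\w+)\s*\(")
FIELD_RE = re.compile(r"\b(?:public|private|protected)\s+[\w<>\[\]]+\s+(\w+)\s*(?:=[^;]*)?;")


def extract_methods(code):
    """Return the set of method names found in one class source."""
    return set(METHOD_RE.findall(code))


def extract_fields(code):
    """Return the set of field names found in one class source."""
    return set(FIELD_RE.findall(code))


def preprocess(sources):
    """Map each class to (methods, fields), dropping names already seen.

    sources: dict of class name -> source text, processed in insertion order.
    Assumption: a name shared by two classes is kept only in the earlier one.
    """
    seen_methods, seen_fields = set(), set()
    mapping = {}
    for name, code in sources.items():
        methods = extract_methods(code) - seen_methods
        fields = extract_fields(code) - seen_fields
        seen_methods |= methods
        seen_fields |= fields
        mapping[name] = (methods, fields)
    return mapping


# Hypothetical two-class project.
sources = {
    "A": "class A { private int count; public int size() { return count; } }",
    "B": "class B { private int count; private int total; "
         "public int size() { return total; } public void reset() { total = 0; } }",
}
for cls, (methods, fields) in preprocess(sources).items():
    print(cls, sorted(methods), sorted(fields))
```

A non-empty intersection between two classes corresponds to the non-zero overlap tested in Step 4; the set difference then removes the shared names from the later class.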
Results and Discussion
Project | LDA | ONTOSE | Proposed Method
Apache Pluto | 0.846 | 0.835 | 0.9436
Apache Commons Collections | 0.736 | 0.753 | 0.879
JEuclid | 0.794 | 0.825 | 0.962
JFreeChart | 0.773 | 0.874 | 0.921
Kyro | 0.874 | 0.915 | 0.948
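To compare the three columns at a glance, the scores reported above can be summarized with a short Python sketch. The values are copied from the results as given; the slide does not name the evaluation metric, so the code treats them only as scores where higher is better, which is an assumption. The helper names are hypothetical.

```python
# Scores as reported in the results; the metric itself is not named in the slides.
RESULTS = {
    "Apache Pluto": {"LDA": 0.846, "ONTOSE": 0.835, "Proposed": 0.9436},
    "Apache Commons Collections": {"LDA": 0.736, "ONTOSE": 0.753, "Proposed": 0.879},
    "JEuclid": {"LDA": 0.794, "ONTOSE": 0.825, "Proposed": 0.962},
    "JFreeChart": {"LDA": 0.773, "ONTOSE": 0.874, "Proposed": 0.921},
    "Kyro": {"LDA": 0.874, "ONTOSE": 0.915, "Proposed": 0.948},
}


def mean_scores(results):
    """Average each method's score over all projects."""
    methods = next(iter(results.values())).keys()
    return {m: sum(row[m] for row in results.values()) / len(results) for m in methods}


def margin_over_best_baseline(results):
    """Per project: proposed score minus the better of the two baselines."""
    return {
        project: row["Proposed"] - max(row["LDA"], row["ONTOSE"])
        for project, row in results.items()
    }


for method, mean in mean_scores(RESULTS).items():
    print(f"{method}: {mean:.4f}")
for project, margin in margin_over_best_baseline(RESULTS).items():
    print(f"{project}: {margin:+.4f}")
```

On these five projects the proposed method scores higher than both baselines in every row, with the smallest margin on Kyro and the largest on JEuclid.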
Future Scope and Conclusion
The current paper proposed a novel approach to find the relationship between source code and API documents using the contextual dependency graph. A two-pronged approach is used in the proposed method. On the one hand, the project source code is scanned for the relevant metrics. On the other hand, the necessary information is extracted from the API documentation. The dependency graph is then used to compute the contextual similarity between the source code metrics and the corresponding API documents.
References
Amir Hossein Rasekh, Amir Hossein Arshia, "Mining and discovery of hidden relationships between software source codes and related textual documents", Digital Scholarship in the Humanities, Oxford University Press on behalf of EADH, doi:10.1093/llc/fqx052.
Chun Yong Chong, Sai Peck Lee, "Automatic Clustering Constraints Derivation from Object-Oriented Software Using Weighted Complex Network with Graph Theory Analysis", The Journal of Systems & Software (2017), doi:10.1016/j.jss.2017.08.017.
Anh Tuan Nguyen, Tien N. Nguyen, "Graph-based Statistical Language Model for Code", 2015 IEEE/ACM 37th IEEE International Conference on Software Engineering (ICSE), 2015, Florence, Italy, pp. 858-862.
Lars Ackermann, Bernhard Volz, "model[NL]generation: Natural Language Model Extraction", DCM'13: Proceedings of the 2013 workshop on Domain Specific Modeling, ACM, New York, USA.
F. Meziane, N. Athanasakis, S. Ananiadou, "Generating Natural Language Specifications from UML Class Diagrams", Requirement Engineering Journal, 13(1):1-18, Springer-Verlag, London.
Fabian Friedrich, Jan Mendling, Frank Puhlmann, "Process Model Generation from Natural Language Text", in Advanced Information Systems Engineering, Lecture Notes in Computer Science, Springer Berlin Heidelberg, Berlin, Heidelberg, pp. 482-496.

Call Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service NashikCall Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service Nashik
Call Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service Nashik
 
The Most Attractive Pune Call Girls Manchar 8250192130 Will You Miss This Cha...
The Most Attractive Pune Call Girls Manchar 8250192130 Will You Miss This Cha...The Most Attractive Pune Call Girls Manchar 8250192130 Will You Miss This Cha...
The Most Attractive Pune Call Girls Manchar 8250192130 Will You Miss This Cha...
 
Introduction to Multiple Access Protocol.pptx
Introduction to Multiple Access Protocol.pptxIntroduction to Multiple Access Protocol.pptx
Introduction to Multiple Access Protocol.pptx
 
UNIT - IV - Air Compressors and its Performance
UNIT - IV - Air Compressors and its PerformanceUNIT - IV - Air Compressors and its Performance
UNIT - IV - Air Compressors and its Performance
 
Coefficient of Thermal Expansion and their Importance.pptx
Coefficient of Thermal Expansion and their Importance.pptxCoefficient of Thermal Expansion and their Importance.pptx
Coefficient of Thermal Expansion and their Importance.pptx
 
Call Girls in Nagpur Suman Call 7001035870 Meet With Nagpur Escorts
Call Girls in Nagpur Suman Call 7001035870 Meet With Nagpur EscortsCall Girls in Nagpur Suman Call 7001035870 Meet With Nagpur Escorts
Call Girls in Nagpur Suman Call 7001035870 Meet With Nagpur Escorts
 
Introduction and different types of Ethernet.pptx
Introduction and different types of Ethernet.pptxIntroduction and different types of Ethernet.pptx
Introduction and different types of Ethernet.pptx
 
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...
 
CCS335 _ Neural Networks and Deep Learning Laboratory_Lab Complete Record
CCS335 _ Neural Networks and Deep Learning Laboratory_Lab Complete RecordCCS335 _ Neural Networks and Deep Learning Laboratory_Lab Complete Record
CCS335 _ Neural Networks and Deep Learning Laboratory_Lab Complete Record
 
MANUFACTURING PROCESS-II UNIT-1 THEORY OF METAL CUTTING
MANUFACTURING PROCESS-II UNIT-1 THEORY OF METAL CUTTINGMANUFACTURING PROCESS-II UNIT-1 THEORY OF METAL CUTTING
MANUFACTURING PROCESS-II UNIT-1 THEORY OF METAL CUTTING
 
MANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLS
MANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLSMANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLS
MANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLS
 
(INDIRA) Call Girl Aurangabad Call Now 8617697112 Aurangabad Escorts 24x7
(INDIRA) Call Girl Aurangabad Call Now 8617697112 Aurangabad Escorts 24x7(INDIRA) Call Girl Aurangabad Call Now 8617697112 Aurangabad Escorts 24x7
(INDIRA) Call Girl Aurangabad Call Now 8617697112 Aurangabad Escorts 24x7
 
(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
 
ONLINE FOOD ORDER SYSTEM PROJECT REPORT.pdf
ONLINE FOOD ORDER SYSTEM PROJECT REPORT.pdfONLINE FOOD ORDER SYSTEM PROJECT REPORT.pdf
ONLINE FOOD ORDER SYSTEM PROJECT REPORT.pdf
 
Roadmap to Membership of RICS - Pathways and Routes
Roadmap to Membership of RICS - Pathways and RoutesRoadmap to Membership of RICS - Pathways and Routes
Roadmap to Membership of RICS - Pathways and Routes
 
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
 
AKTU Computer Networks notes --- Unit 3.pdf
AKTU Computer Networks notes ---  Unit 3.pdfAKTU Computer Networks notes ---  Unit 3.pdf
AKTU Computer Networks notes --- Unit 3.pdf
 

A Conceptual Dependency Graph Based Keyword Extraction Model for Source Code to API Documentation Mapping

  • 1. A Conceptual Dependency Graph Based Keyword Extraction Model for Source Code to API Documentation Mapping. Prepared by Nakul Sharma, under the guidance of Dr. Prasanth Yalla, Professor, Department of Computer Science and Engineering, Koneru Laxmiah Education Foundation, Vijayawada, Andhra Pradesh, India.
  • 2. Table of Contents: Introduction; Background; Mathematical Foundations; Genesis of Research; Proposed Methodology; Results and Discussion; Future Scope and Conclusion; References.
  • 3. Introduction. Traditional key feature extraction techniques use terms or sentences from the project source code to form a unique code structure. Almost all traditional document key phrase extraction techniques represent a document collection as a phrase or sentence matrix, in which each row denotes a phrase or sentence ID and the corresponding column holds its frequency.
  • 4. Introduction (Continued). The main problem with existing systems is that they ignore context-based textual information. Contextual information is especially relevant when undertaking a software change that affects not just the current phase of the project but also the previous and subsequent phases. Source code analysis also aids in checking the effect of a change on the code. In the proposed model, a weighted dependency graph is used to filter the candidate sets among the vertices for contextual similarity computation.
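The idea of filtering candidates over a weighted dependency graph can be illustrated with a minimal sketch. The slides do not specify how edge weights are defined or how candidates are filtered, so everything here is an assumption made for illustration: the edge weight is taken to be the number of references between two classes, and the hypothetical `min_weight` threshold decides which neighbours remain candidates.

```python
from collections import defaultdict


def build_dependency_graph(references):
    """Build an undirected weighted graph from (source, target) class references.

    Assumption: the weight of an edge is simply the number of times the two
    classes reference each other; the slides do not define the weighting.
    """
    graph = defaultdict(lambda: defaultdict(int))
    for src, dst in references:
        if src == dst:
            continue  # ignore self-references
        graph[src][dst] += 1
        graph[dst][src] += 1
    return graph


def candidate_neighbours(graph, vertex, min_weight):
    """Return the neighbours of `vertex` whose edge weight is at least `min_weight`."""
    return sorted(n for n, w in graph[vertex].items() if w >= min_weight)


# Hypothetical class names, used only to exercise the sketch.
references = [
    ("Parser", "Lexer"),
    ("Parser", "Lexer"),
    ("Parser", "Logger"),
    ("Lexer", "Logger"),
    ("Parser", "Parser"),
]
graph = build_dependency_graph(references)
strong = candidate_neighbours(graph, "Parser", 2)
weak = candidate_neighbours(graph, "Parser", 1)
```

Raising the threshold narrows the candidate set to the most strongly coupled classes, which keeps the later similarity computation small.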
  • 5. Background • Source Code Analysis • Text Mining • Document Representation • Clustering • NLP/CL
  • 6. Mathematical Framework • Centrality Measures • Document Clustering • Document Metrics • Source Code Metrics
  • 7. Genesis of Research. Work done in text mining and its related fields; research conducted by various authors.
  • 8. Related Work
      Sr. No. | Name of Authors | Work Done in Brief
      1 | S. Mohammadi et al. | Presented a new approach to extract knowledge of the dependencies between artifacts in the source code.
      2 | V. U. Gómez et al. | Proposed a semantic model for visually characterizing source code modifications.
      3 | S. L. Abebe et al. | Introduced a new extraction scheme that is sufficiently effective to extract domain concepts from the source code.
      4 | S. Bajracharya et al. | Developed a new SCA framework to collect and analyze open source code on a large scale.
      5 | A. S. Yumaganov | Proposed comparing different search models for similarity, with limitations on the source code.
  • 9. Related Work
      Sr. No. | Name of Authors | Work Done in Brief
      1 | A. Dimitriou et al. | Introduced a new top-k-size keyword search on tree-structured data.
      2 | W. Ding | Reviewed knowledge-based techniques for the software documentation process.
      3 | Hussain et al. | Proposed a new software design pattern classification and selection scheme.
      4 | Ibrahim et al. | Presented a scientometric re-ranking technique.
      5 | L. H. Lee et al. | Used Bayesian text classification to introduce a high-relevance keyword extraction process.
  • 10. Related Work (Related to Software Metrics)
      Sr. No. | Name of Authors | Work Done in Brief
      1 | A. Dimitriou et al. | Introduced a new top-k-size keyword search on tree-structured data.
      2 | W. Ding | Reviewed knowledge-based techniques for the software documentation process.
      3 | Hussain et al. | Proposed a new software design pattern classification and selection scheme.
      4 | Ibrahim et al. | Presented a scientometric re-ranking technique.
      5 | L. H. Lee et al. | Used Bayesian text classification to introduce a high-relevance keyword extraction process.
  • 11. Observations on Related Work. Large open source projects are not considered by the SCA systems and tools developed so far. Existing systems also do not take contextual keyphrases into consideration when providing traceability links. The current work proposes an alternative software metric, based on a contextual dependency graph, in the form of contextual similarity.
  • 12. Proposed Methodology. Figure 1: Module-1 (flowchart blocks: Project source codes; Class parsing; Project API documentation; Text pre-processing; Filtered API documents; Code dependency graph; Proposed contextual dependency graph similarity).
  • 14. Proposed Methodology. Phase 1: Source Code and API Documents Pre-processing
      Step 1: Read the project source codes S.
      Step 2: Read the project API documents D.
      Step 3: For each code Ci in S[], do
          Parse source code Ci with methods M and fields F.
          Mi = ExtractMethods(Ci)
          Fi = ExtractFields(Ci)
          Map (Mi, Fi) to Ci: C1 (M1, F1), C2 (M2, F2), …, Cn (Mn, Fn)
      Done
      Step 4: // Remove the duplicate methods and fields in each class
      For each code Ci, do
          $M_i = \mathrm{Prob}(M_i \cap M_j / C),\ i \neq j$
          $F_i = \mathrm{Prob}(F_i \cap F_j / C),\ i \neq j$
          If (Mi != 0 AND Fi != 0) then
              Remove Mi in Ci or Cj
              Remove Fi in Ci or Cj
          End if
      Done
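Phase 1 can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: `extract_methods` and `extract_fields` stand in for ExtractMethods and ExtractFields and use deliberately naive regular expressions for Java-like code, and duplicates are resolved by keeping a method or field in the first class where it appears (the slides say only that it is removed from Ci or Cj).

```python
import re

# Assumption: naive patterns for Java-like declarations that start with an
# access modifier. A real implementation would use a proper parser.
METHOD_RE = re.compile(
    r"\b(?:public|private|protected)\s+(?:static\s+)?[\w<>\[\]]+\s+(\w+)\s*\("
)
FIELD_RE = re.compile(
    r"\b(?:public|private|protected)\s+(?:static\s+)?(?:final\s+)?"
    r"[\w<>\[\]]+\s+(\w+)\s*(?:=|;)"
)


def extract_methods(code):
    """Return the set of method names declared in `code` (Step 3)."""
    return set(METHOD_RE.findall(code))


def extract_fields(code):
    """Return the set of field names declared in `code` (Step 3)."""
    return set(FIELD_RE.findall(code))


def preprocess(sources):
    """Map each class to (methods, fields), then drop duplicates (Step 4).

    `sources` maps a class name to its source text. A name already seen in an
    earlier class is removed from every later class.
    """
    mapping = {
        name: (extract_methods(code), extract_fields(code))
        for name, code in sources.items()
    }
    seen_methods, seen_fields = set(), set()
    for name in mapping:
        methods, fields = mapping[name]
        mapping[name] = (methods - seen_methods, fields - seen_fields)
        seen_methods |= methods
        seen_fields |= fields
    return mapping


# Hypothetical two-class project used only to exercise the sketch.
sources = {
    "A": "public class A { private int count; public int size() { return count; } }",
    "B": (
        "public class B { private int count; private String name; "
        "public int size() { return 0; } "
        "public String getName() { return name; } }"
    ),
}
result = preprocess(sources)
```

Here `size` and `count` appear in both classes, so they are kept for class A and removed from class B, leaving B with `getName` and `name` only.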
  • 15. Results and Discussion
      Project | LDA | ONTOSE | Proposed Method
      Apache Pluto | 0.846 | 0.835 | 0.9436
      Apache Commons Collections | 0.736 | 0.753 | 0.879
      JEuclid | 0.794 | 0.825 | 0.962
      JFreeChart | 0.773 | 0.874 | 0.921
      Kyro | 0.874 | 0.915 | 0.948
  • 16. Future Scope and Conclusion. The current paper proposes a novel approach to find the relationship between source code and API documents using a contextual dependency graph. The proposed method uses a two-pronged approach: the project source code is scanned for the relevant metrics, and the necessary information is extracted from the API documentation. The dependency graph is then used to compute the contextual similarity between the source code metrics and the corresponding API documents.
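As a rough illustration of comparing code-side terms with API documentation text, the sketch below uses plain cosine similarity over term-frequency vectors. The slides do not give the formula for contextual similarity, so cosine similarity is swapped in here as a stand-in, and the `tokenize` helper (which splits camelCase identifiers) is a hypothetical simplification.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Split text into lowercase word tokens, breaking camelCase identifiers."""
    return [part.lower() for part in re.findall(r"[A-Za-z][a-z]*", text)]


def cosine_similarity(tokens_a, tokens_b):
    """Cosine similarity between the term-frequency vectors of two token lists."""
    freq_a, freq_b = Counter(tokens_a), Counter(tokens_b)
    dot = sum(freq_a[term] * freq_b[term] for term in freq_a)
    norm_a = math.sqrt(sum(v * v for v in freq_a.values()))
    norm_b = math.sqrt(sum(v * v for v in freq_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty side has no similarity to anything
    return dot / (norm_a * norm_b)


# Hypothetical method name and API sentence, used only to exercise the sketch.
code_tokens = tokenize("getUserName")
doc_tokens = tokenize("Returns the user name")
score = cosine_similarity(code_tokens, doc_tokens)
```

In a fuller pipeline, the tokens on the code side would come from the methods and fields that survive pre-processing, restricted to the candidate vertices selected from the dependency graph.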
  • 17. References
      Amir Hossein Rasekh, Amir Hossein Arshia, "Mining and discovery of hidden relationships between software source codes and related textual documents", Digital Scholarship in the Humanities, Oxford University Press on behalf of EADH, doi:10.1093/llc/fqx052.
      Chun Yong Chong, Sai Peck Lee, "Automatic Clustering Constraints Derivation from Object-Oriented Software Using Weighted Complex Network with Graph Theory Analysis", The Journal of Systems & Software (2017), doi:10.1016/j.jss.2017.08.017.
      Anh Tuan Nguyen, Tien N. Nguyen, "Graph-based Statistical Language Model for Code", 2015 IEEE/ACM 37th IEEE International Conference on Software Engineering (ICSE), 2015, Florence, Italy, pp. 858-862.
      Lars Ackermann, Bernhard Volz, "model[NL]generation: Natural Language Model Extraction", DCM’13: Proceedings of the 2013 Workshop on Domain Specific Modeling, ACM, New York, USA.
      F. Meziane, N. Athanasakis, S. Ananiadou, "Generating Natural Language Specifications from UML Class Diagrams", Requirement Engineering Journal, 13(1):1-18, Springer-Verlag, London.
      Fabian Friedrich, Jan Mendling, Frank Puhlmann, "Process Model Generation from Natural Language Text", in Advanced Information Systems Engineering, Lecture Notes in Computer Science, Springer Berlin Heidelberg, pp. 482-496.