SlideShare a Scribd company logo
Improving Code Readability
Models with Textual Features
Simone Scalabrino, Mario Linares-Vásquez, Denys Poshyvanyk, Rocco Oliveto
Software maintenance
accounts for
70%of the costs of a project
for (int i=1;i<=100;i++) {
int a=((528>>i%15-1)&1)*4;
int b=((-2128340926>>(i%15)*2)&3)*4;
System.out.println("FizzBuzz".substring(a,b)+(a==b?i:""));
}
What does it do?
Not really clear...
for (int i=1;i<=100;i++) {
String fizzBuzz = “”;
if (i % 3 == 0)
fizzBuzz += “Fizz”;
if (i % 5 == 0)
fizzBuzz += “Buzz”;
if (fizzBuzz.isEmpty())
fizzBuzz += i;
System.out.println(fizzBuzz);
}
What about this one?
Why is it easier
to understand?
Code readability
Code readability prediction
Source code
Source code
Source codeSource code
Features
Source code
Source codeSource code
Features
Training set Machine learner
Source code
Source codeSource code
Features
Readable Unreadable
Training set Machine learner
Code readability prediction
Code readability prediction
Code readability prediction
Code readability prediction
Structural features
Code readability prediction
Structural features
Visual features
Code readability prediction
Structural features
Visual features
Two datasets
Something is missing...
Code is text!
New features
Comments readability
Comments readability
Flesch-Kincaid index
Comments and
identifiers consistency
Comments and
identifiers consistency
Overlap between terms in comments
and terms in identifiers
Identifier terms
in dictionary
Identifier terms
in dictionary
Percentage of correct terms in identifiers
Narrow meaning
identifiers
Narrow meaning
identifiers
“Sum” more specific than “computation”
Number of meanings
Number of meanings
“Present” more ambiguous than “Attendee”
Textual coherence
Textual coherence
Consistency between pairs of syntactic blocks
Textual features
Source code
Source codeSource code
Features
Readable Unreadable
Training set Logistic regression
Case study
New datasetNew dataset
200
Java snippets
9
annotators
Do textual features complement
the others proposed
in the literature?
Overlap metrics
Textual Features vs Buse’s
Textual Features vs Posnett’s
Textual Features vs Dorn’s
Textual Features vs Dorn’s
Readability of 12%-21% of snippets can
be explained only using textual features
What is the accuracy of a
readability model based
on structural and
textual features?
79%
81%
78%
80%
74%
Dataset by Buse and Weimer
ALL-F: All features
T-F: Textual features
D-F: Dorn’s features
P-F: Posnett’s features
BW-F: Buse’s features
84%
79%
73%
80%
77%
Dataset by Dorn
ALL-F: All features
T-F: Textual features
D-F: Dorn’s features
P-F: Posnett’s features
BW-F: Buse’s features
80%
71%
66%
76%
68%
New dataset
ALL-F: All features
T-F: Textual features
D-F: Dorn’s features
P-F: Posnett’s features
BW-F: Buse’s features
80%
71%
66%
76%
68%
New dataset
ALL-F: All features
T-F: Textual features
D-F: Dorn’s features
P-F: Posnett’s features
BW-F: Buse’s features
A model which includes all features
achieves an higher accuracy on 2 datasets
In summary...
Code readability for
defect prediction
Thanks.

More Related Content

Similar to Improving code readability models with textual features

Tips And Tricks For Bioinformatics Software Engineering
Tips And Tricks For Bioinformatics Software EngineeringTips And Tricks For Bioinformatics Software Engineering
Tips And Tricks For Bioinformatics Software Engineering
jtdudley
 
Why Rust? - Matthias Endler - Codemotion Amsterdam 2016
Why Rust? - Matthias Endler - Codemotion Amsterdam 2016Why Rust? - Matthias Endler - Codemotion Amsterdam 2016
Why Rust? - Matthias Endler - Codemotion Amsterdam 2016
Codemotion
 
Measuring Your Code
Measuring Your CodeMeasuring Your Code
Measuring Your Code
Nate Abele
 
Measuring Your Code 2.0
Measuring Your Code 2.0Measuring Your Code 2.0
Measuring Your Code 2.0
Nate Abele
 
200612_BioPackathon_ss
200612_BioPackathon_ss200612_BioPackathon_ss
200612_BioPackathon_ss
Satoshi Kume
 
VerticaPy_original - Anritsu.pdf
VerticaPy_original - Anritsu.pdfVerticaPy_original - Anritsu.pdf
VerticaPy_original - Anritsu.pdf
Amzath3
 
Dependability Benchmarking by Injecting Software Bugs
Dependability Benchmarking by Injecting Software BugsDependability Benchmarking by Injecting Software Bugs
Dependability Benchmarking by Injecting Software Bugs
Roberto Natella
 
Visual Studio .NET2010
Visual Studio .NET2010Visual Studio .NET2010
Visual Studio .NET2010Satish Verma
 
Review unknown code with static analysis Zend con 2017
Review unknown code with static analysis  Zend con 2017Review unknown code with static analysis  Zend con 2017
Review unknown code with static analysis Zend con 2017
Damien Seguy
 
MSDN Presents: Visual Studio 2010, .NET 4, SharePoint 2010 for Developers
MSDN Presents: Visual Studio 2010, .NET 4, SharePoint 2010 for DevelopersMSDN Presents: Visual Studio 2010, .NET 4, SharePoint 2010 for Developers
MSDN Presents: Visual Studio 2010, .NET 4, SharePoint 2010 for Developers
Dave Bost
 
Ruby on Rails 3.1: Let's bring the fun back into web programing
Ruby on Rails 3.1: Let's bring the fun back into web programingRuby on Rails 3.1: Let's bring the fun back into web programing
Ruby on Rails 3.1: Let's bring the fun back into web programing
Bozhidar Batsov
 
An Overview Of Python With Functional Programming
An Overview Of Python With Functional ProgrammingAn Overview Of Python With Functional Programming
An Overview Of Python With Functional ProgrammingAdam Getchell
 
Swifty Methods, Swifty APIs
Swifty Methods, Swifty APIsSwifty Methods, Swifty APIs
Swifty Methods, Swifty APIs
radexp
 
Changes and Bugs: Mining and Predicting Development Activities
Changes and Bugs: Mining and Predicting Development ActivitiesChanges and Bugs: Mining and Predicting Development Activities
Changes and Bugs: Mining and Predicting Development Activities
Thomas Zimmermann
 
Software Language Design & Engineering: Mobl & Spoofax
Software Language Design & Engineering: Mobl & SpoofaxSoftware Language Design & Engineering: Mobl & Spoofax
Software Language Design & Engineering: Mobl & Spoofax
Eelco Visser
 
BPjs deep dive 2019
BPjs deep dive 2019BPjs deep dive 2019
BPjs deep dive 2019
Michael Bar-Sinai
 
CS4200 2019 | Lecture 2 | syntax-definition
CS4200 2019 | Lecture 2 | syntax-definitionCS4200 2019 | Lecture 2 | syntax-definition
CS4200 2019 | Lecture 2 | syntax-definition
Eelco Visser
 
A Brief Overview of (Static) Program Query Languages
A Brief Overview of (Static) Program Query LanguagesA Brief Overview of (Static) Program Query Languages
A Brief Overview of (Static) Program Query Languages
Kim Mens
 
Porting VisualWorks code to Pharo
Porting VisualWorks code to PharoPorting VisualWorks code to Pharo
Porting VisualWorks code to Pharo
ESUG
 

Similar to Improving code readability models with textual features (20)

Tips And Tricks For Bioinformatics Software Engineering
Tips And Tricks For Bioinformatics Software EngineeringTips And Tricks For Bioinformatics Software Engineering
Tips And Tricks For Bioinformatics Software Engineering
 
Why Rust? - Matthias Endler - Codemotion Amsterdam 2016
Why Rust? - Matthias Endler - Codemotion Amsterdam 2016Why Rust? - Matthias Endler - Codemotion Amsterdam 2016
Why Rust? - Matthias Endler - Codemotion Amsterdam 2016
 
Measuring Your Code
Measuring Your CodeMeasuring Your Code
Measuring Your Code
 
Measuring Your Code 2.0
Measuring Your Code 2.0Measuring Your Code 2.0
Measuring Your Code 2.0
 
200612_BioPackathon_ss
200612_BioPackathon_ss200612_BioPackathon_ss
200612_BioPackathon_ss
 
VerticaPy_original - Anritsu.pdf
VerticaPy_original - Anritsu.pdfVerticaPy_original - Anritsu.pdf
VerticaPy_original - Anritsu.pdf
 
Dependability Benchmarking by Injecting Software Bugs
Dependability Benchmarking by Injecting Software BugsDependability Benchmarking by Injecting Software Bugs
Dependability Benchmarking by Injecting Software Bugs
 
Visual Studio .NET2010
Visual Studio .NET2010Visual Studio .NET2010
Visual Studio .NET2010
 
Review unknown code with static analysis Zend con 2017
Review unknown code with static analysis  Zend con 2017Review unknown code with static analysis  Zend con 2017
Review unknown code with static analysis Zend con 2017
 
MSDN Presents: Visual Studio 2010, .NET 4, SharePoint 2010 for Developers
MSDN Presents: Visual Studio 2010, .NET 4, SharePoint 2010 for DevelopersMSDN Presents: Visual Studio 2010, .NET 4, SharePoint 2010 for Developers
MSDN Presents: Visual Studio 2010, .NET 4, SharePoint 2010 for Developers
 
Ruby on Rails 3.1: Let's bring the fun back into web programing
Ruby on Rails 3.1: Let's bring the fun back into web programingRuby on Rails 3.1: Let's bring the fun back into web programing
Ruby on Rails 3.1: Let's bring the fun back into web programing
 
An Overview Of Python With Functional Programming
An Overview Of Python With Functional ProgrammingAn Overview Of Python With Functional Programming
An Overview Of Python With Functional Programming
 
Swifty Methods, Swifty APIs
Swifty Methods, Swifty APIsSwifty Methods, Swifty APIs
Swifty Methods, Swifty APIs
 
Go之道
Go之道Go之道
Go之道
 
Changes and Bugs: Mining and Predicting Development Activities
Changes and Bugs: Mining and Predicting Development ActivitiesChanges and Bugs: Mining and Predicting Development Activities
Changes and Bugs: Mining and Predicting Development Activities
 
Software Language Design & Engineering: Mobl & Spoofax
Software Language Design & Engineering: Mobl & SpoofaxSoftware Language Design & Engineering: Mobl & Spoofax
Software Language Design & Engineering: Mobl & Spoofax
 
BPjs deep dive 2019
BPjs deep dive 2019BPjs deep dive 2019
BPjs deep dive 2019
 
CS4200 2019 | Lecture 2 | syntax-definition
CS4200 2019 | Lecture 2 | syntax-definitionCS4200 2019 | Lecture 2 | syntax-definition
CS4200 2019 | Lecture 2 | syntax-definition
 
A Brief Overview of (Static) Program Query Languages
A Brief Overview of (Static) Program Query LanguagesA Brief Overview of (Static) Program Query Languages
A Brief Overview of (Static) Program Query Languages
 
Porting VisualWorks code to Pharo
Porting VisualWorks code to PharoPorting VisualWorks code to Pharo
Porting VisualWorks code to Pharo
 

Recently uploaded

basic-wireline-operations-course-mahmoud-f-radwan.pdf
basic-wireline-operations-course-mahmoud-f-radwan.pdfbasic-wireline-operations-course-mahmoud-f-radwan.pdf
basic-wireline-operations-course-mahmoud-f-radwan.pdf
NidhalKahouli2
 
spirit beverages ppt without graphics.pptx
spirit beverages ppt without graphics.pptxspirit beverages ppt without graphics.pptx
spirit beverages ppt without graphics.pptx
Madan Karki
 
Heap Sort (SS).ppt FOR ENGINEERING GRADUATES, BCA, MCA, MTECH, BSC STUDENTS
Heap Sort (SS).ppt FOR ENGINEERING GRADUATES, BCA, MCA, MTECH, BSC STUDENTSHeap Sort (SS).ppt FOR ENGINEERING GRADUATES, BCA, MCA, MTECH, BSC STUDENTS
Heap Sort (SS).ppt FOR ENGINEERING GRADUATES, BCA, MCA, MTECH, BSC STUDENTS
Soumen Santra
 
哪里办理(csu毕业证书)查尔斯特大学毕业证硕士学历原版一模一样
哪里办理(csu毕业证书)查尔斯特大学毕业证硕士学历原版一模一样哪里办理(csu毕业证书)查尔斯特大学毕业证硕士学历原版一模一样
哪里办理(csu毕业证书)查尔斯特大学毕业证硕士学历原版一模一样
insn4465
 
Fundamentals of Induction Motor Drives.pptx
Fundamentals of Induction Motor Drives.pptxFundamentals of Induction Motor Drives.pptx
Fundamentals of Induction Motor Drives.pptx
manasideore6
 
一比一原版(UMich毕业证)密歇根大学|安娜堡分校毕业证成绩单专业办理
一比一原版(UMich毕业证)密歇根大学|安娜堡分校毕业证成绩单专业办理一比一原版(UMich毕业证)密歇根大学|安娜堡分校毕业证成绩单专业办理
一比一原版(UMich毕业证)密歇根大学|安娜堡分校毕业证成绩单专业办理
zwunae
 
bank management system in java and mysql report1.pdf
bank management system in java and mysql report1.pdfbank management system in java and mysql report1.pdf
bank management system in java and mysql report1.pdf
Divyam548318
 
AIR POLLUTION lecture EnE203 updated.pdf
AIR POLLUTION lecture EnE203 updated.pdfAIR POLLUTION lecture EnE203 updated.pdf
AIR POLLUTION lecture EnE203 updated.pdf
RicletoEspinosa1
 
Unbalanced Three Phase Systems and circuits.pptx
Unbalanced Three Phase Systems and circuits.pptxUnbalanced Three Phase Systems and circuits.pptx
Unbalanced Three Phase Systems and circuits.pptx
ChristineTorrepenida1
 
PPT on GRP pipes manufacturing and testing
PPT on GRP pipes manufacturing and testingPPT on GRP pipes manufacturing and testing
PPT on GRP pipes manufacturing and testing
anoopmanoharan2
 
原版制作(unimelb毕业证书)墨尔本大学毕业证Offer一模一样
原版制作(unimelb毕业证书)墨尔本大学毕业证Offer一模一样原版制作(unimelb毕业证书)墨尔本大学毕业证Offer一模一样
原版制作(unimelb毕业证书)墨尔本大学毕业证Offer一模一样
obonagu
 
KuberTENes Birthday Bash Guadalajara - K8sGPT first impressions
KuberTENes Birthday Bash Guadalajara - K8sGPT first impressionsKuberTENes Birthday Bash Guadalajara - K8sGPT first impressions
KuberTENes Birthday Bash Guadalajara - K8sGPT first impressions
Victor Morales
 
sieving analysis and results interpretation
sieving analysis and results interpretationsieving analysis and results interpretation
sieving analysis and results interpretation
ssuser36d3051
 
Swimming pool mechanical components design.pptx
Swimming pool  mechanical components design.pptxSwimming pool  mechanical components design.pptx
Swimming pool mechanical components design.pptx
yokeleetan1
 
Nuclear Power Economics and Structuring 2024
Nuclear Power Economics and Structuring 2024Nuclear Power Economics and Structuring 2024
Nuclear Power Economics and Structuring 2024
Massimo Talia
 
一比一原版(SFU毕业证)西蒙菲莎大学毕业证成绩单如何办理
一比一原版(SFU毕业证)西蒙菲莎大学毕业证成绩单如何办理一比一原版(SFU毕业证)西蒙菲莎大学毕业证成绩单如何办理
一比一原版(SFU毕业证)西蒙菲莎大学毕业证成绩单如何办理
bakpo1
 
TOP 10 B TECH COLLEGES IN JAIPUR 2024.pptx
TOP 10 B TECH COLLEGES IN JAIPUR 2024.pptxTOP 10 B TECH COLLEGES IN JAIPUR 2024.pptx
TOP 10 B TECH COLLEGES IN JAIPUR 2024.pptx
nikitacareer3
 
Water billing management system project report.pdf
Water billing management system project report.pdfWater billing management system project report.pdf
Water billing management system project report.pdf
Kamal Acharya
 
ACEP Magazine edition 4th launched on 05.06.2024
ACEP Magazine edition 4th launched on 05.06.2024ACEP Magazine edition 4th launched on 05.06.2024
ACEP Magazine edition 4th launched on 05.06.2024
Rahul
 
在线办理(ANU毕业证书)澳洲国立大学毕业证录取通知书一模一样
在线办理(ANU毕业证书)澳洲国立大学毕业证录取通知书一模一样在线办理(ANU毕业证书)澳洲国立大学毕业证录取通知书一模一样
在线办理(ANU毕业证书)澳洲国立大学毕业证录取通知书一模一样
obonagu
 

Recently uploaded (20)

basic-wireline-operations-course-mahmoud-f-radwan.pdf
basic-wireline-operations-course-mahmoud-f-radwan.pdfbasic-wireline-operations-course-mahmoud-f-radwan.pdf
basic-wireline-operations-course-mahmoud-f-radwan.pdf
 
spirit beverages ppt without graphics.pptx
spirit beverages ppt without graphics.pptxspirit beverages ppt without graphics.pptx
spirit beverages ppt without graphics.pptx
 
Heap Sort (SS).ppt FOR ENGINEERING GRADUATES, BCA, MCA, MTECH, BSC STUDENTS
Heap Sort (SS).ppt FOR ENGINEERING GRADUATES, BCA, MCA, MTECH, BSC STUDENTSHeap Sort (SS).ppt FOR ENGINEERING GRADUATES, BCA, MCA, MTECH, BSC STUDENTS
Heap Sort (SS).ppt FOR ENGINEERING GRADUATES, BCA, MCA, MTECH, BSC STUDENTS
 
哪里办理(csu毕业证书)查尔斯特大学毕业证硕士学历原版一模一样
哪里办理(csu毕业证书)查尔斯特大学毕业证硕士学历原版一模一样哪里办理(csu毕业证书)查尔斯特大学毕业证硕士学历原版一模一样
哪里办理(csu毕业证书)查尔斯特大学毕业证硕士学历原版一模一样
 
Fundamentals of Induction Motor Drives.pptx
Fundamentals of Induction Motor Drives.pptxFundamentals of Induction Motor Drives.pptx
Fundamentals of Induction Motor Drives.pptx
 
一比一原版(UMich毕业证)密歇根大学|安娜堡分校毕业证成绩单专业办理
一比一原版(UMich毕业证)密歇根大学|安娜堡分校毕业证成绩单专业办理一比一原版(UMich毕业证)密歇根大学|安娜堡分校毕业证成绩单专业办理
一比一原版(UMich毕业证)密歇根大学|安娜堡分校毕业证成绩单专业办理
 
bank management system in java and mysql report1.pdf
bank management system in java and mysql report1.pdfbank management system in java and mysql report1.pdf
bank management system in java and mysql report1.pdf
 
AIR POLLUTION lecture EnE203 updated.pdf
AIR POLLUTION lecture EnE203 updated.pdfAIR POLLUTION lecture EnE203 updated.pdf
AIR POLLUTION lecture EnE203 updated.pdf
 
Unbalanced Three Phase Systems and circuits.pptx
Unbalanced Three Phase Systems and circuits.pptxUnbalanced Three Phase Systems and circuits.pptx
Unbalanced Three Phase Systems and circuits.pptx
 
PPT on GRP pipes manufacturing and testing
PPT on GRP pipes manufacturing and testingPPT on GRP pipes manufacturing and testing
PPT on GRP pipes manufacturing and testing
 
原版制作(unimelb毕业证书)墨尔本大学毕业证Offer一模一样
原版制作(unimelb毕业证书)墨尔本大学毕业证Offer一模一样原版制作(unimelb毕业证书)墨尔本大学毕业证Offer一模一样
原版制作(unimelb毕业证书)墨尔本大学毕业证Offer一模一样
 
KuberTENes Birthday Bash Guadalajara - K8sGPT first impressions
KuberTENes Birthday Bash Guadalajara - K8sGPT first impressionsKuberTENes Birthday Bash Guadalajara - K8sGPT first impressions
KuberTENes Birthday Bash Guadalajara - K8sGPT first impressions
 
sieving analysis and results interpretation
sieving analysis and results interpretationsieving analysis and results interpretation
sieving analysis and results interpretation
 
Swimming pool mechanical components design.pptx
Swimming pool  mechanical components design.pptxSwimming pool  mechanical components design.pptx
Swimming pool mechanical components design.pptx
 
Nuclear Power Economics and Structuring 2024
Nuclear Power Economics and Structuring 2024Nuclear Power Economics and Structuring 2024
Nuclear Power Economics and Structuring 2024
 
一比一原版(SFU毕业证)西蒙菲莎大学毕业证成绩单如何办理
一比一原版(SFU毕业证)西蒙菲莎大学毕业证成绩单如何办理一比一原版(SFU毕业证)西蒙菲莎大学毕业证成绩单如何办理
一比一原版(SFU毕业证)西蒙菲莎大学毕业证成绩单如何办理
 
TOP 10 B TECH COLLEGES IN JAIPUR 2024.pptx
TOP 10 B TECH COLLEGES IN JAIPUR 2024.pptxTOP 10 B TECH COLLEGES IN JAIPUR 2024.pptx
TOP 10 B TECH COLLEGES IN JAIPUR 2024.pptx
 
Water billing management system project report.pdf
Water billing management system project report.pdfWater billing management system project report.pdf
Water billing management system project report.pdf
 
ACEP Magazine edition 4th launched on 05.06.2024
ACEP Magazine edition 4th launched on 05.06.2024ACEP Magazine edition 4th launched on 05.06.2024
ACEP Magazine edition 4th launched on 05.06.2024
 
在线办理(ANU毕业证书)澳洲国立大学毕业证录取通知书一模一样
在线办理(ANU毕业证书)澳洲国立大学毕业证录取通知书一模一样在线办理(ANU毕业证书)澳洲国立大学毕业证录取通知书一模一样
在线办理(ANU毕业证书)澳洲国立大学毕业证录取通知书一模一样
 

Improving code readability models with textual features