SlideShare a Scribd company logo
The 1st International Workshop on Natural Language-based Software Engineering (NLBSE ‘22)
Understanding Digits in Identifier Names
An Exploratory Study
Anthony Peruma and Christian D. Newman
Source Code Analysis and Natural Language Lab
SUMMARY
We explore the presence and
purpose of digits in identifier
names through an empirical study
of 800 open-source Java systems
01
BACKGROUND
Identifier names help developers understand the purpose of
the identifier
Names must be unambiguous and intent revealing in
communicating the purpose and behavior of the code
Developers can craft names using a variety of terms making
name consistency challenging
Prior studies focused on the words that make up identifiers,
not digits, such as abbreviations, acronyms, and naming styles.
02
OUR
GOAL
Understand the part played by digits in
identifier names by examining the
structure of names containing digits and
the semantics expressed by the digits.
03
IMPACT
Findings from our study facilitate research
and development of tools to aid in name
recommendation and appraisal.
04
RESEARCH
QUESTIONS
02
How does identifier renaming
operations in the source code
impact the existence of digits in an
identifier's name?
• Volume and characteristics
• Digit preservation
How do developers utilize digits in an
identifier's name to convey meaning?
• Qualitative examination of names
• Taxonomy for the presence of digits
01
05
OUR CONTRIBUTIONS
Taxonomy for
the Presence of
Digits
Patterns &
Trends of Digits
in Names
Discussion of
Research
Challenges
06
STUDY DESIGN
07
Dataset of rename refactorings
from 800 Java projects
(28,079 rename operations)
Extract
Identifiers
With Digits
Identifier
Name Splitting
Rename
Exclusion
Quantitative Analysis Qualitative Analysis
15,424 rename operations
EXPERIMENT RESULTS
08
RQ 1: The treatment of names with digits over time
09
Approach:
● Automated examination of the presence and/or absence of digits in an identifier’s
name before and after a rename operation
Findings:
● Digits are frequently preserved when renamed (e.g., node2 → node3)
○ 43.56% instances preserve digits
○ 33.29% instances remove digits
○ 23.15% instances add digits
● Digit preservation:
○ Most names contain only a single digit in the old & new name
■ 79.93% instances
○ Equal number of digits in the old & new name
■ 91.35% instances
○ The position of the digit is mostly preserved
■ 2nd position in name – 28.73% (e.g., shade2 → shade2Figure)
2 3
RQ 2: The meaning conveyed by digits in a name
10
Approach:
● Manual examination of 375 rename instances by the authors (stratified statistically
significant sample)
○ Includes reviewing the surrounding code
○ Snowballing to locate examples of additional instances in the original dataset
Findings:
● Taxonomy of 6 categories showing how digits convey meaning in a name:
○ Auto-Generated
○ Distinguisher
○ Synonym
○ Version Number
○ Specification
○ Domain/Technology
RQ 2: The meaning conveyed by digits in a name
10
Auto-Generated
• Created by a code generation tool,
or IDE; not easily comprehensible
• Numbers may have a meaning
based on the generation technique
• E.g., LA18_6
Distinguisher
• Usually the last token in the name
• At least two identifiers having a
lexically identical name
• Avoid name collision at compilation
• E.g., auditLog3
Synonym
• At least one digit utilized in place of
a word
• The numbers 2 and 4 are very
common example
• E.g., convert2RList
Domain/Technology
• The digit that is part of the name of
a domain term or technology
• Digits themselves have no
individual meaning
• E.g., slf4jLogLevel
Specification
• Represents a specification
• Acts as a way to uniquely identify
concepts, behaviors, or
characteristics.
• E.g., arialRegular9Dark
Version Number
• Digit used to signify a version
number
• Indicates significant capabilities and
limitations or the identifier
• E.g., V1DozerTransformModel
DISCUSSION &
CONCLUSION
08
KEY CHALLENGES
● Auto-generated code can skew findings;
○ Most likely you will need to run multiple iterations of your data
collection/extraction process to isolate auto-generated identifiers
● The volume of auto-generated names can hinder data sampling activities as they
may comprise of the majority of identifiers in the code
○ This can vary depending on the type of auto-generated code the project utilizes
● Automatically detecting auto-generated identifiers is not straightforward
○ Heuristics can help, but only partly and is human dependent
● Numbers can have different interpretations; with limited research in this specific
area, we don't know if numbers hinder or help comprehension
KEY TAKEAWAYS
12
Digits Are Preserved
Post-Rename
01
The Digits Found in
Identifier Names Are
Meaningful
02
Improve identifier name appraisals and
recommendations when developers
perform rename operations
Utilize static analysis to determine if the
digit is related to the code
Build a catalog of technologies,
standards, or domain terms in the project
IDENTIFIER NAMING STRUCTURE CATALOGUE
A resource about what is scientifically known about naming identifiers
13
Part-of-Speech Tagset Linguistic Terminology
Linguistic Antipatterns Common Naming Structures
Naming Styles
Available at: h t t p s : / / w w w . s c a n l . o r g
THANK
YOU!
For more of what we do, visit:
https://www.scanl.org/

More Related Content

Similar to Understanding Digits in Identifier Names: An Exploratory Study

Naming Things Book : Simple Book Review!
Naming Things Book : Simple Book Review!Naming Things Book : Simple Book Review!
Naming Things Book : Simple Book Review!
Diego Pacheco
 
Unit 4 plsql
Unit 4  plsqlUnit 4  plsql
Unit 4 plsql
DrkhanchanaR
 
Introduction to C3.net Architecture unit
Introduction to C3.net Architecture unitIntroduction to C3.net Architecture unit
Introduction to C3.net Architecture unit
Kotresh Munavallimatt
 
CSE 1201: Structured Programming Language
CSE 1201: Structured Programming LanguageCSE 1201: Structured Programming Language
CSE 1201: Structured Programming Language
Zubayer Farazi
 
Clean code: meaningful Name
Clean code: meaningful NameClean code: meaningful Name
Clean code: meaningful Namenahid035
 
Data Type in C Programming
Data Type in C ProgrammingData Type in C Programming
Data Type in C Programming
Qazi Shahzad Ali
 
Cp presentation
Cp presentationCp presentation
Cp presentation
MeetaPrajapati
 
Keep your repo clean
Keep your repo cleanKeep your repo clean
Keep your repo clean
Hector Canto
 
Natural Language Query to SQL conversion using Machine Learning Approach
Natural Language Query to SQL conversion using Machine Learning ApproachNatural Language Query to SQL conversion using Machine Learning Approach
Natural Language Query to SQL conversion using Machine Learning Approach
Minhazul Arefin
 
OOP lesson1 and Variables.pdf
OOP lesson1 and Variables.pdfOOP lesson1 and Variables.pdf
OOP lesson1 and Variables.pdf
HouseMusica
 
INTRODUCTION TO CODING-CLASS VI LEVEL-DESCRIPTION ABOUT SYNTAX LANGUAGE
INTRODUCTION TO CODING-CLASS VI LEVEL-DESCRIPTION ABOUT SYNTAX LANGUAGEINTRODUCTION TO CODING-CLASS VI LEVEL-DESCRIPTION ABOUT SYNTAX LANGUAGE
INTRODUCTION TO CODING-CLASS VI LEVEL-DESCRIPTION ABOUT SYNTAX LANGUAGE
RathnaM16
 
classVI_Coding_Teacher_Presentation.pptx
classVI_Coding_Teacher_Presentation.pptxclassVI_Coding_Teacher_Presentation.pptx
classVI_Coding_Teacher_Presentation.pptx
MaaReddySanjiv
 
classVI_Coding_Teacher_Presentation.pptx
classVI_Coding_Teacher_Presentation.pptxclassVI_Coding_Teacher_Presentation.pptx
classVI_Coding_Teacher_Presentation.pptx
ssusere336f4
 
SE-IT JAVA LAB OOP CONCEPT
SE-IT JAVA LAB OOP CONCEPTSE-IT JAVA LAB OOP CONCEPT
SE-IT JAVA LAB OOP CONCEPT
nikshaikh786
 
Javascript Programming according to Industry Standards.pptx
Javascript Programming according to Industry Standards.pptxJavascript Programming according to Industry Standards.pptx
Javascript Programming according to Industry Standards.pptx
MukundSonaiya1
 
The art of readable code (ch1~ch4)
The art of readable code (ch1~ch4)The art of readable code (ch1~ch4)
The art of readable code (ch1~ch4)
Ki Sung Bae
 
The art of readable code (ch1~ch4)
The art of readable code (ch1~ch4)The art of readable code (ch1~ch4)
The art of readable code (ch1~ch4)
Ki Sung Bae
 
Algorithms-Flowcharts-Data-Types-and-Pseudocodes.pptx
Algorithms-Flowcharts-Data-Types-and-Pseudocodes.pptxAlgorithms-Flowcharts-Data-Types-and-Pseudocodes.pptx
Algorithms-Flowcharts-Data-Types-and-Pseudocodes.pptx
RobertCarreonBula
 
Discovering Emerging Tech through Graph Analysis - Henry Hwangbo @ GraphConne...
Discovering Emerging Tech through Graph Analysis - Henry Hwangbo @ GraphConne...Discovering Emerging Tech through Graph Analysis - Henry Hwangbo @ GraphConne...
Discovering Emerging Tech through Graph Analysis - Henry Hwangbo @ GraphConne...
Neo4j
 
MachinaFiesta: A Vision into Machine Learning 🚀
MachinaFiesta: A Vision into Machine Learning 🚀MachinaFiesta: A Vision into Machine Learning 🚀
MachinaFiesta: A Vision into Machine Learning 🚀
GDSCNiT
 

Similar to Understanding Digits in Identifier Names: An Exploratory Study (20)

Naming Things Book : Simple Book Review!
Naming Things Book : Simple Book Review!Naming Things Book : Simple Book Review!
Naming Things Book : Simple Book Review!
 
Unit 4 plsql
Unit 4  plsqlUnit 4  plsql
Unit 4 plsql
 
Introduction to C3.net Architecture unit
Introduction to C3.net Architecture unitIntroduction to C3.net Architecture unit
Introduction to C3.net Architecture unit
 
CSE 1201: Structured Programming Language
CSE 1201: Structured Programming LanguageCSE 1201: Structured Programming Language
CSE 1201: Structured Programming Language
 
Clean code: meaningful Name
Clean code: meaningful NameClean code: meaningful Name
Clean code: meaningful Name
 
Data Type in C Programming
Data Type in C ProgrammingData Type in C Programming
Data Type in C Programming
 
Cp presentation
Cp presentationCp presentation
Cp presentation
 
Keep your repo clean
Keep your repo cleanKeep your repo clean
Keep your repo clean
 
Natural Language Query to SQL conversion using Machine Learning Approach
Natural Language Query to SQL conversion using Machine Learning ApproachNatural Language Query to SQL conversion using Machine Learning Approach
Natural Language Query to SQL conversion using Machine Learning Approach
 
OOP lesson1 and Variables.pdf
OOP lesson1 and Variables.pdfOOP lesson1 and Variables.pdf
OOP lesson1 and Variables.pdf
 
INTRODUCTION TO CODING-CLASS VI LEVEL-DESCRIPTION ABOUT SYNTAX LANGUAGE
INTRODUCTION TO CODING-CLASS VI LEVEL-DESCRIPTION ABOUT SYNTAX LANGUAGEINTRODUCTION TO CODING-CLASS VI LEVEL-DESCRIPTION ABOUT SYNTAX LANGUAGE
INTRODUCTION TO CODING-CLASS VI LEVEL-DESCRIPTION ABOUT SYNTAX LANGUAGE
 
classVI_Coding_Teacher_Presentation.pptx
classVI_Coding_Teacher_Presentation.pptxclassVI_Coding_Teacher_Presentation.pptx
classVI_Coding_Teacher_Presentation.pptx
 
classVI_Coding_Teacher_Presentation.pptx
classVI_Coding_Teacher_Presentation.pptxclassVI_Coding_Teacher_Presentation.pptx
classVI_Coding_Teacher_Presentation.pptx
 
SE-IT JAVA LAB OOP CONCEPT
SE-IT JAVA LAB OOP CONCEPTSE-IT JAVA LAB OOP CONCEPT
SE-IT JAVA LAB OOP CONCEPT
 
Javascript Programming according to Industry Standards.pptx
Javascript Programming according to Industry Standards.pptxJavascript Programming according to Industry Standards.pptx
Javascript Programming according to Industry Standards.pptx
 
The art of readable code (ch1~ch4)
The art of readable code (ch1~ch4)The art of readable code (ch1~ch4)
The art of readable code (ch1~ch4)
 
The art of readable code (ch1~ch4)
The art of readable code (ch1~ch4)The art of readable code (ch1~ch4)
The art of readable code (ch1~ch4)
 
Algorithms-Flowcharts-Data-Types-and-Pseudocodes.pptx
Algorithms-Flowcharts-Data-Types-and-Pseudocodes.pptxAlgorithms-Flowcharts-Data-Types-and-Pseudocodes.pptx
Algorithms-Flowcharts-Data-Types-and-Pseudocodes.pptx
 
Discovering Emerging Tech through Graph Analysis - Henry Hwangbo @ GraphConne...
Discovering Emerging Tech through Graph Analysis - Henry Hwangbo @ GraphConne...Discovering Emerging Tech through Graph Analysis - Henry Hwangbo @ GraphConne...
Discovering Emerging Tech through Graph Analysis - Henry Hwangbo @ GraphConne...
 
MachinaFiesta: A Vision into Machine Learning 🚀
MachinaFiesta: A Vision into Machine Learning 🚀MachinaFiesta: A Vision into Machine Learning 🚀
MachinaFiesta: A Vision into Machine Learning 🚀
 

More from University of Hawai‘i at Mānoa

Rename Chains: An Exploratory Study on the Occurrence and Characteristics of ...
Rename Chains: An Exploratory Study on the Occurrence and Characteristics of ...Rename Chains: An Exploratory Study on the Occurrence and Characteristics of ...
Rename Chains: An Exploratory Study on the Occurrence and Characteristics of ...
University of Hawai‘i at Mānoa
 
A Primer on High-Quality Identifier Naming [ASE 2022]
A Primer on High-Quality Identifier Naming [ASE 2022]A Primer on High-Quality Identifier Naming [ASE 2022]
A Primer on High-Quality Identifier Naming [ASE 2022]
University of Hawai‘i at Mānoa
 
Preparing for the Academic Job Market: Experience and Tips from a Recent F...
Preparing for the  Academic Job Market:  Experience and Tips from  a Recent F...Preparing for the  Academic Job Market:  Experience and Tips from  a Recent F...
Preparing for the Academic Job Market: Experience and Tips from a Recent F...
University of Hawai‘i at Mānoa
 
Refactoring Debt: Myth or Reality? An Exploratory Study on the Relationship B...
Refactoring Debt: Myth or Reality? An Exploratory Study on the Relationship B...Refactoring Debt: Myth or Reality? An Exploratory Study on the Relationship B...
Refactoring Debt: Myth or Reality? An Exploratory Study on the Relationship B...
University of Hawai‘i at Mānoa
 
A Primer on High-Quality Identifier Naming
A Primer on High-Quality Identifier NamingA Primer on High-Quality Identifier Naming
A Primer on High-Quality Identifier Naming
University of Hawai‘i at Mānoa
 
Test Anti-Patterns: From Definition to Detection
Test Anti-Patterns: From Definition to DetectionTest Anti-Patterns: From Definition to Detection
Test Anti-Patterns: From Definition to Detection
University of Hawai‘i at Mānoa
 
Refactoring Debt: Myth or Reality? An Exploratory Study on the Relationship B...
Refactoring Debt: Myth or Reality? An Exploratory Study on the Relationship B...Refactoring Debt: Myth or Reality? An Exploratory Study on the Relationship B...
Refactoring Debt: Myth or Reality? An Exploratory Study on the Relationship B...
University of Hawai‘i at Mānoa
 
How Do I Refactor This? An Empirical Study on Refactoring Trends and Topics i...
How Do I Refactor This? An Empirical Study on Refactoring Trends and Topics i...How Do I Refactor This? An Empirical Study on Refactoring Trends and Topics i...
How Do I Refactor This? An Empirical Study on Refactoring Trends and Topics i...
University of Hawai‘i at Mānoa
 
IDEAL: An Open-Source Identifier Name Appraisal Tool
IDEAL: An Open-Source Identifier Name Appraisal ToolIDEAL: An Open-Source Identifier Name Appraisal Tool
IDEAL: An Open-Source Identifier Name Appraisal Tool
University of Hawai‘i at Mānoa
 
Using Grammar Patterns to Interpret Test Method Name Evolution
Using Grammar Patterns to Interpret Test Method Name EvolutionUsing Grammar Patterns to Interpret Test Method Name Evolution
Using Grammar Patterns to Interpret Test Method Name Evolution
University of Hawai‘i at Mānoa
 
On the Distribution of "Simple Stupid Bugs" in Unit Test Files: An Explorator...
On the Distribution of "Simple Stupid Bugs" in Unit Test Files: An Explorator...On the Distribution of "Simple Stupid Bugs" in Unit Test Files: An Explorator...
On the Distribution of "Simple Stupid Bugs" in Unit Test Files: An Explorator...
University of Hawai‘i at Mānoa
 
An Exploratory Study on the Refactoring of Unit Test Files in Android Applica...
An Exploratory Study on the Refactoring of Unit Test Files in Android Applica...An Exploratory Study on the Refactoring of Unit Test Files in Android Applica...
An Exploratory Study on the Refactoring of Unit Test Files in Android Applica...
University of Hawai‘i at Mānoa
 
On the Distribution of Test Smells in Open Source Android Applications: An Ex...
On the Distribution of Test Smells in Open Source Android Applications: An Ex...On the Distribution of Test Smells in Open Source Android Applications: An Ex...
On the Distribution of Test Smells in Open Source Android Applications: An Ex...
University of Hawai‘i at Mānoa
 
A Preliminary Study of Android Refactorings
A Preliminary Study of Android RefactoringsA Preliminary Study of Android Refactorings
A Preliminary Study of Android Refactorings
University of Hawai‘i at Mānoa
 
Permission Issues in Open-Source Android Apps: An Exploratory Study
Permission Issues in Open-Source Android Apps: An Exploratory StudyPermission Issues in Open-Source Android Apps: An Exploratory Study
Permission Issues in Open-Source Android Apps: An Exploratory Study
University of Hawai‘i at Mānoa
 
A Career In IT
A Career In ITA Career In IT
Web Content Management - Introduction
Web Content Management - IntroductionWeb Content Management - Introduction
Web Content Management - Introduction
University of Hawai‘i at Mānoa
 
Introduction to SignalR
Introduction to SignalRIntroduction to SignalR
Introduction to SignalR
University of Hawai‘i at Mānoa
 
SharePoint 2013 - Search Driven Publishing
SharePoint 2013 - Search Driven PublishingSharePoint 2013 - Search Driven Publishing
SharePoint 2013 - Search Driven Publishing
University of Hawai‘i at Mānoa
 

More from University of Hawai‘i at Mānoa (19)

Rename Chains: An Exploratory Study on the Occurrence and Characteristics of ...
Rename Chains: An Exploratory Study on the Occurrence and Characteristics of ...Rename Chains: An Exploratory Study on the Occurrence and Characteristics of ...
Rename Chains: An Exploratory Study on the Occurrence and Characteristics of ...
 
A Primer on High-Quality Identifier Naming [ASE 2022]
A Primer on High-Quality Identifier Naming [ASE 2022]A Primer on High-Quality Identifier Naming [ASE 2022]
A Primer on High-Quality Identifier Naming [ASE 2022]
 
Preparing for the Academic Job Market: Experience and Tips from a Recent F...
Preparing for the  Academic Job Market:  Experience and Tips from  a Recent F...Preparing for the  Academic Job Market:  Experience and Tips from  a Recent F...
Preparing for the Academic Job Market: Experience and Tips from a Recent F...
 
Refactoring Debt: Myth or Reality? An Exploratory Study on the Relationship B...
Refactoring Debt: Myth or Reality? An Exploratory Study on the Relationship B...Refactoring Debt: Myth or Reality? An Exploratory Study on the Relationship B...
Refactoring Debt: Myth or Reality? An Exploratory Study on the Relationship B...
 
A Primer on High-Quality Identifier Naming
A Primer on High-Quality Identifier NamingA Primer on High-Quality Identifier Naming
A Primer on High-Quality Identifier Naming
 
Test Anti-Patterns: From Definition to Detection
Test Anti-Patterns: From Definition to DetectionTest Anti-Patterns: From Definition to Detection
Test Anti-Patterns: From Definition to Detection
 
Refactoring Debt: Myth or Reality? An Exploratory Study on the Relationship B...
Refactoring Debt: Myth or Reality? An Exploratory Study on the Relationship B...Refactoring Debt: Myth or Reality? An Exploratory Study on the Relationship B...
Refactoring Debt: Myth or Reality? An Exploratory Study on the Relationship B...
 
How Do I Refactor This? An Empirical Study on Refactoring Trends and Topics i...
How Do I Refactor This? An Empirical Study on Refactoring Trends and Topics i...How Do I Refactor This? An Empirical Study on Refactoring Trends and Topics i...
How Do I Refactor This? An Empirical Study on Refactoring Trends and Topics i...
 
IDEAL: An Open-Source Identifier Name Appraisal Tool
IDEAL: An Open-Source Identifier Name Appraisal ToolIDEAL: An Open-Source Identifier Name Appraisal Tool
IDEAL: An Open-Source Identifier Name Appraisal Tool
 
Using Grammar Patterns to Interpret Test Method Name Evolution
Using Grammar Patterns to Interpret Test Method Name EvolutionUsing Grammar Patterns to Interpret Test Method Name Evolution
Using Grammar Patterns to Interpret Test Method Name Evolution
 
On the Distribution of "Simple Stupid Bugs" in Unit Test Files: An Explorator...
On the Distribution of "Simple Stupid Bugs" in Unit Test Files: An Explorator...On the Distribution of "Simple Stupid Bugs" in Unit Test Files: An Explorator...
On the Distribution of "Simple Stupid Bugs" in Unit Test Files: An Explorator...
 
An Exploratory Study on the Refactoring of Unit Test Files in Android Applica...
An Exploratory Study on the Refactoring of Unit Test Files in Android Applica...An Exploratory Study on the Refactoring of Unit Test Files in Android Applica...
An Exploratory Study on the Refactoring of Unit Test Files in Android Applica...
 
On the Distribution of Test Smells in Open Source Android Applications: An Ex...
On the Distribution of Test Smells in Open Source Android Applications: An Ex...On the Distribution of Test Smells in Open Source Android Applications: An Ex...
On the Distribution of Test Smells in Open Source Android Applications: An Ex...
 
A Preliminary Study of Android Refactorings
A Preliminary Study of Android RefactoringsA Preliminary Study of Android Refactorings
A Preliminary Study of Android Refactorings
 
Permission Issues in Open-Source Android Apps: An Exploratory Study
Permission Issues in Open-Source Android Apps: An Exploratory StudyPermission Issues in Open-Source Android Apps: An Exploratory Study
Permission Issues in Open-Source Android Apps: An Exploratory Study
 
A Career In IT
A Career In ITA Career In IT
A Career In IT
 
Web Content Management - Introduction
Web Content Management - IntroductionWeb Content Management - Introduction
Web Content Management - Introduction
 
Introduction to SignalR
Introduction to SignalRIntroduction to SignalR
Introduction to SignalR
 
SharePoint 2013 - Search Driven Publishing
SharePoint 2013 - Search Driven PublishingSharePoint 2013 - Search Driven Publishing
SharePoint 2013 - Search Driven Publishing
 

Recently uploaded

SWEBOK and Education at FUSE Okinawa 2024
SWEBOK and Education at FUSE Okinawa 2024SWEBOK and Education at FUSE Okinawa 2024
SWEBOK and Education at FUSE Okinawa 2024
Hironori Washizaki
 
Why Mobile App Regression Testing is Critical for Sustained Success_ A Detail...
Why Mobile App Regression Testing is Critical for Sustained Success_ A Detail...Why Mobile App Regression Testing is Critical for Sustained Success_ A Detail...
Why Mobile App Regression Testing is Critical for Sustained Success_ A Detail...
kalichargn70th171
 
Need for Speed: Removing speed bumps from your Symfony projects ⚡️
Need for Speed: Removing speed bumps from your Symfony projects ⚡️Need for Speed: Removing speed bumps from your Symfony projects ⚡️
Need for Speed: Removing speed bumps from your Symfony projects ⚡️
Łukasz Chruściel
 
Introducing Crescat - Event Management Software for Venues, Festivals and Eve...
Introducing Crescat - Event Management Software for Venues, Festivals and Eve...Introducing Crescat - Event Management Software for Venues, Festivals and Eve...
Introducing Crescat - Event Management Software for Venues, Festivals and Eve...
Crescat
 
GOING AOT WITH GRAALVM FOR SPRING BOOT (SPRING IO)
GOING AOT WITH GRAALVM FOR  SPRING BOOT (SPRING IO)GOING AOT WITH GRAALVM FOR  SPRING BOOT (SPRING IO)
GOING AOT WITH GRAALVM FOR SPRING BOOT (SPRING IO)
Alina Yurenko
 
Graspan: A Big Data System for Big Code Analysis
Graspan: A Big Data System for Big Code AnalysisGraspan: A Big Data System for Big Code Analysis
Graspan: A Big Data System for Big Code Analysis
Aftab Hussain
 
Fundamentals of Programming and Language Processors
Fundamentals of Programming and Language ProcessorsFundamentals of Programming and Language Processors
Fundamentals of Programming and Language Processors
Rakesh Kumar R
 
Hand Rolled Applicative User Validation Code Kata
Hand Rolled Applicative User ValidationCode KataHand Rolled Applicative User ValidationCode Kata
Hand Rolled Applicative User Validation Code Kata
Philip Schwarz
 
APIs for Browser Automation (MoT Meetup 2024)
APIs for Browser Automation (MoT Meetup 2024)APIs for Browser Automation (MoT Meetup 2024)
APIs for Browser Automation (MoT Meetup 2024)
Boni García
 
在线购买加拿大英属哥伦比亚大学毕业证本科学位证书原版一模一样
在线购买加拿大英属哥伦比亚大学毕业证本科学位证书原版一模一样在线购买加拿大英属哥伦比亚大学毕业证本科学位证书原版一模一样
在线购买加拿大英属哥伦比亚大学毕业证本科学位证书原版一模一样
mz5nrf0n
 
Transform Your Communication with Cloud-Based IVR Solutions
Transform Your Communication with Cloud-Based IVR SolutionsTransform Your Communication with Cloud-Based IVR Solutions
Transform Your Communication with Cloud-Based IVR Solutions
TheSMSPoint
 
Artificia Intellicence and XPath Extension Functions
Artificia Intellicence and XPath Extension FunctionsArtificia Intellicence and XPath Extension Functions
Artificia Intellicence and XPath Extension Functions
Octavian Nadolu
 
Utilocate provides Smarter, Better, Faster, Safer Locate Ticket Management
Utilocate provides Smarter, Better, Faster, Safer Locate Ticket ManagementUtilocate provides Smarter, Better, Faster, Safer Locate Ticket Management
Utilocate provides Smarter, Better, Faster, Safer Locate Ticket Management
Utilocate
 
openEuler Case Study - The Journey to Supply Chain Security
openEuler Case Study - The Journey to Supply Chain SecurityopenEuler Case Study - The Journey to Supply Chain Security
openEuler Case Study - The Journey to Supply Chain Security
Shane Coughlan
 
Mobile App Development Company In Noida | Drona Infotech
Mobile App Development Company In Noida | Drona InfotechMobile App Development Company In Noida | Drona Infotech
Mobile App Development Company In Noida | Drona Infotech
Drona Infotech
 
May Marketo Masterclass, London MUG May 22 2024.pdf
May Marketo Masterclass, London MUG May 22 2024.pdfMay Marketo Masterclass, London MUG May 22 2024.pdf
May Marketo Masterclass, London MUG May 22 2024.pdf
Adele Miller
 
2024 eCommerceDays Toulouse - Sylius 2.0.pdf
2024 eCommerceDays Toulouse - Sylius 2.0.pdf2024 eCommerceDays Toulouse - Sylius 2.0.pdf
2024 eCommerceDays Toulouse - Sylius 2.0.pdf
Łukasz Chruściel
 
A Study of Variable-Role-based Feature Enrichment in Neural Models of Code
A Study of Variable-Role-based Feature Enrichment in Neural Models of CodeA Study of Variable-Role-based Feature Enrichment in Neural Models of Code
A Study of Variable-Role-based Feature Enrichment in Neural Models of Code
Aftab Hussain
 
A Sighting of filterA in Typelevel Rite of Passage
A Sighting of filterA in Typelevel Rite of PassageA Sighting of filterA in Typelevel Rite of Passage
A Sighting of filterA in Typelevel Rite of Passage
Philip Schwarz
 
Top Features to Include in Your Winzo Clone App for Business Growth (4).pptx
Top Features to Include in Your Winzo Clone App for Business Growth (4).pptxTop Features to Include in Your Winzo Clone App for Business Growth (4).pptx
Top Features to Include in Your Winzo Clone App for Business Growth (4).pptx
rickgrimesss22
 

Recently uploaded (20)

SWEBOK and Education at FUSE Okinawa 2024
SWEBOK and Education at FUSE Okinawa 2024SWEBOK and Education at FUSE Okinawa 2024
SWEBOK and Education at FUSE Okinawa 2024
 
Why Mobile App Regression Testing is Critical for Sustained Success_ A Detail...
Why Mobile App Regression Testing is Critical for Sustained Success_ A Detail...Why Mobile App Regression Testing is Critical for Sustained Success_ A Detail...
Why Mobile App Regression Testing is Critical for Sustained Success_ A Detail...
 
Need for Speed: Removing speed bumps from your Symfony projects ⚡️
Need for Speed: Removing speed bumps from your Symfony projects ⚡️Need for Speed: Removing speed bumps from your Symfony projects ⚡️
Need for Speed: Removing speed bumps from your Symfony projects ⚡️
 
Introducing Crescat - Event Management Software for Venues, Festivals and Eve...
Introducing Crescat - Event Management Software for Venues, Festivals and Eve...Introducing Crescat - Event Management Software for Venues, Festivals and Eve...
Introducing Crescat - Event Management Software for Venues, Festivals and Eve...
 
GOING AOT WITH GRAALVM FOR SPRING BOOT (SPRING IO)
GOING AOT WITH GRAALVM FOR  SPRING BOOT (SPRING IO)GOING AOT WITH GRAALVM FOR  SPRING BOOT (SPRING IO)
GOING AOT WITH GRAALVM FOR SPRING BOOT (SPRING IO)
 
Graspan: A Big Data System for Big Code Analysis
Graspan: A Big Data System for Big Code AnalysisGraspan: A Big Data System for Big Code Analysis
Graspan: A Big Data System for Big Code Analysis
 
Fundamentals of Programming and Language Processors
Fundamentals of Programming and Language ProcessorsFundamentals of Programming and Language Processors
Fundamentals of Programming and Language Processors
 
Hand Rolled Applicative User Validation Code Kata
Hand Rolled Applicative User ValidationCode KataHand Rolled Applicative User ValidationCode Kata
Hand Rolled Applicative User Validation Code Kata
 
APIs for Browser Automation (MoT Meetup 2024)
APIs for Browser Automation (MoT Meetup 2024)APIs for Browser Automation (MoT Meetup 2024)
APIs for Browser Automation (MoT Meetup 2024)
 
在线购买加拿大英属哥伦比亚大学毕业证本科学位证书原版一模一样
在线购买加拿大英属哥伦比亚大学毕业证本科学位证书原版一模一样在线购买加拿大英属哥伦比亚大学毕业证本科学位证书原版一模一样
在线购买加拿大英属哥伦比亚大学毕业证本科学位证书原版一模一样
 
Transform Your Communication with Cloud-Based IVR Solutions
Transform Your Communication with Cloud-Based IVR SolutionsTransform Your Communication with Cloud-Based IVR Solutions
Transform Your Communication with Cloud-Based IVR Solutions
 
Artificia Intellicence and XPath Extension Functions
Artificia Intellicence and XPath Extension FunctionsArtificia Intellicence and XPath Extension Functions
Artificia Intellicence and XPath Extension Functions
 
Utilocate provides Smarter, Better, Faster, Safer Locate Ticket Management
Utilocate provides Smarter, Better, Faster, Safer Locate Ticket ManagementUtilocate provides Smarter, Better, Faster, Safer Locate Ticket Management
Utilocate provides Smarter, Better, Faster, Safer Locate Ticket Management
 
openEuler Case Study - The Journey to Supply Chain Security
openEuler Case Study - The Journey to Supply Chain SecurityopenEuler Case Study - The Journey to Supply Chain Security
openEuler Case Study - The Journey to Supply Chain Security
 
Mobile App Development Company In Noida | Drona Infotech
Mobile App Development Company In Noida | Drona InfotechMobile App Development Company In Noida | Drona Infotech
Mobile App Development Company In Noida | Drona Infotech
 
May Marketo Masterclass, London MUG May 22 2024.pdf
May Marketo Masterclass, London MUG May 22 2024.pdfMay Marketo Masterclass, London MUG May 22 2024.pdf
May Marketo Masterclass, London MUG May 22 2024.pdf
 
2024 eCommerceDays Toulouse - Sylius 2.0.pdf
2024 eCommerceDays Toulouse - Sylius 2.0.pdf2024 eCommerceDays Toulouse - Sylius 2.0.pdf
2024 eCommerceDays Toulouse - Sylius 2.0.pdf
 
A Study of Variable-Role-based Feature Enrichment in Neural Models of Code
A Study of Variable-Role-based Feature Enrichment in Neural Models of CodeA Study of Variable-Role-based Feature Enrichment in Neural Models of Code
A Study of Variable-Role-based Feature Enrichment in Neural Models of Code
 
A Sighting of filterA in Typelevel Rite of Passage
A Sighting of filterA in Typelevel Rite of PassageA Sighting of filterA in Typelevel Rite of Passage
A Sighting of filterA in Typelevel Rite of Passage
 
Top Features to Include in Your Winzo Clone App for Business Growth (4).pptx
Top Features to Include in Your Winzo Clone App for Business Growth (4).pptxTop Features to Include in Your Winzo Clone App for Business Growth (4).pptx
Top Features to Include in Your Winzo Clone App for Business Growth (4).pptx
 

Understanding Digits in Identifier Names: An Exploratory Study

  • 1. The 1st International Workshop on Natural Language-based Software Engineering (NLBSE ‘22) Understanding Digits in Identifier Names An Exploratory Study Anthony Peruma and Christian D. Newman Source Code Analysis and Natural Language Lab
  • 2. SUMMARY We explore the presence and purpose of digits in identifier names through an empirical study of 800 open-source Java systems 01
  • 3. BACKGROUND Identifier names help developers understand the purpose of the identifier Names must be unambiguous and intent revealing in communicating the purpose and behavior of the code Developers can craft names using a variety of terms making name consistency challenging Prior studies focused on the words that make up identifiers, not digits, such as abbreviations, acronyms, and naming styles. 02
  • 4. OUR GOAL Understand the part played by digits in identifier names by examining the structure of names containing digits and the semantics expressed by the digits. 03
  • 5. IMPACT Findings from our study facilitate research and development of tools to aid in name recommendation and appraisal. 04
  • 6. RESEARCH QUESTIONS 02 How does identifier renaming operations in the source code impact the existence of digits in an identifier's name? • Volume and characteristics • Digit preservation How do developers utilize digits in an identifier's name to convey meaning? • Qualitative examination of names • Taxonomy for the presence of digits 01 05
  • 7. OUR CONTRIBUTIONS Taxonomy for the Presence of Digits Patterns & Trends of Digits in Names Discussion of Research Challenges 06
  • 8. STUDY DESIGN 07 Dataset of rename refactorings from 800 Java projects (28,079 rename operations) Extract Identifiers With Digits Identifier Name Splitting Rename Exclusion Quantitative Analysis Qualitative Analysis 15,424 rename operations
  • 10. RQ 1: The treatment of names with digits over time 09 Approach: ● Automated examination of the presence and/or absence of digits in an identifier’s name before and after a rename operation Findings: ● Digits are frequently preserved when renamed (e.g., node2 → node3) ○ 43.56% instances preserve digits ○ 33.29% instances remove digits ○ 23.15% instances add digits ● Digit preservation: ○ Most names contain only a single digit in the old & new name ■ 79.93% instances ○ Equal number of digits in the old & new name ■ 91.35% instances ○ The position of the digit is mostly preserved ■ 2nd position in name – 28.73% (e.g., shade2 → shade2Figure) 2 3
  • 11. RQ 2: The meaning conveyed by digits in a name 10 Approach: ● Manual examination of 375 rename instances by the authors (stratified statistically significant sample) ○ Includes reviewing the surrounding code ○ Snowballing to locate examples of additional instances in the original dataset Findings: ● Taxonomy of 6 categories showing how digits convey meaning in a name: ○ Auto-Generated ○ Distinguisher ○ Synonym ○ Version Number ○ Specification ○ Domain/Technology
  • 12. RQ 2: The meaning conveyed by digits in a name 10 Auto-Generated • Created by a code generation tool, or IDE; not easily comprehensible • Numbers may have a meaning based on the generation technique • E.g., LA18_6 Distinguisher • Usually the last token in the name • At least two identifiers having a lexically identical name • Avoid name collision at compilation • E.g., auditLog3 Synonym • At least one digit utilized in place of a word • The numbers 2 and 4 are very common example • E.g., convert2RList Domain/Technology • The digit that is part of the name of a domain term or technology • Digits themselves have no individual meaning • E.g., slf4jLogLevel Specification • Represents a specification • Acts as a way to uniquely identify concepts, behaviors, or characteristics. • E.g., arialRegular9Dark Version Number • Digit used to signify a version number • Indicates significant capabilities and limitations or the identifier • E.g., V1DozerTransformModel
  • 14. KEY CHALLENGES ● Auto-generated code can skew findings; ○ Most likely you will need to run multiple iterations of your data collection/extraction process to isolate auto-generated identifiers ● The volume of auto-generated names can hinder data sampling activities as they may comprise of the majority of identifiers in the code ○ This can vary depending on the type of auto-generated code the project utilizes ● Automatically detecting auto-generated identifiers is not straightforward ○ Heuristics can help, but only partly and is human dependent ● Numbers can have different interpretations; with limited research in this specific area, we don't know if numbers hinder or help comprehension
  • 15. KEY TAKEAWAYS 12 Digits Are Preserved Post-Rename 01 The Digits Found in Identifier Names Are Meaningful 02 Improve identifier name appraisals and recommendations when developers perform rename operations Utilize static analysis to determine if the digit is related to the code Build a catalog of technologies, standards, or domain terms in the project
  • 16. IDENTIFIER NAMING STRUCTURE CATALOGUE A resource about what is scientifically known about naming identifiers 13 Part-of-Speech Tagset Linguistic Terminology Linguistic Antipatterns Common Naming Structures Naming Styles Available at: h t t p s : / / w w w . s c a n l . o r g
  • 17. THANK YOU! For more of what we do, visit: https://www.scanl.org/