Venera Arnaoudova, Laleh Eshkevari, Rocco Oliveto, Yann-Gaël Guéhéneuc and Giuliano Antoniol
Technical Report (2010)
Open Access document in PolyPublie and at official publisher
Published version. Terms of use: All rights reserved.
Abstract
Poorly chosen identifiers have been reported in the literature to be misleading and to increase program comprehension effort. Identifiers are composed of terms, which can be dictionary words, acronyms, contractions, or simple strings. We conjecture that the use of identical terms in different contexts may increase the risk of faults. We investigate this conjecture using a measure that combines term entropy and term context coverage to study whether certain terms increase the odds ratio of methods being fault-prone. Entropy measures the physical dispersion of terms in a program: the higher the entropy, the more scattered the terms are across the program. Context coverage measures the conceptual dispersion of terms: the higher their context coverage, the more unrelated the methods that use them. We compute the entropy and context coverage of terms extracted from identifiers in Rhino 1.4R3 and ArgoUML 0.16. We show statistically that methods containing terms with high entropy and high context coverage are more fault-prone than others.
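The abstract describes the two measures only in words, so the following is a minimal sketch of how they could be computed, not the report's actual definitions. It assumes that entropy is the Shannon entropy of a term's occurrence counts across methods, and that context coverage is one minus the mean pairwise cosine similarity of the methods using the term. The function names, the `methods` dictionary, and the sample data are all hypothetical.

```python
import math
from itertools import combinations


def term_entropy(counts):
    """Shannon entropy (in bits) of one term's occurrence counts across methods.

    Assumption: physical dispersion is modelled as the entropy of the
    distribution of the term's occurrences over methods.
    """
    total = sum(counts)
    if total == 0:
        return 0.0
    probs = [c / total for c in counts if c > 0]
    return -sum(p * math.log2(p) for p in probs)


def cosine(u, v):
    """Cosine similarity between two sparse term-frequency vectors (dicts)."""
    dot = sum(u.get(term, 0) * weight for term, weight in v.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def context_coverage(term, methods):
    """One minus the mean pairwise similarity of the methods that use `term`.

    Assumption: conceptual dispersion is modelled as the average
    dissimilarity between the term vectors of those methods.
    """
    users = [vec for vec in methods.values() if vec.get(term, 0) > 0]
    pairs = list(combinations(users, 2))
    if not pairs:
        return 0.0  # a term used by fewer than two methods has no spread
    return 1.0 - sum(cosine(u, v) for u, v in pairs) / len(pairs)


# Hypothetical data: method name -> {term: frequency in that method}
methods = {
    "Parser.parse": {"token": 3, "node": 1},
    "Lexer.next": {"token": 2, "char": 2},
    "Gui.paint": {"node": 1, "color": 4},
}

token_counts = [vec.get("token", 0) for vec in methods.values()]
print(term_entropy(token_counts))
print(context_coverage("token", methods))
```

Under these assumptions, a term concentrated in a single method has entropy 0, and a term shared by methods with otherwise disjoint vocabularies has context coverage approaching 1.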
Uncontrolled Keywords
Source code identifiers, fault models, program comprehension
Subjects: | 2700 Information technology > 2700 Information technology; 2700 Information technology > 2705 Software and development; 2700 Information technology > 2706 Software engineering; 2700 Information technology > 2720 Computer systems software |
---|---|
Department: | Department of Computer Engineering and Software Engineering |
Funders: | CRSNG/NSERC, Fonds de recherche Nature et technologies Québec |
PolyPublie URL: | https://publications.polymtl.ca/2651/ |
Report number: | EPM-RT-2010-02 |
Date Deposited: | 06 Oct 2017 13:52 |
Last Modified: | 03 Oct 2024 19:03 |
Cite in APA 7: | Arnaoudova, V., Eshkevari, L., Oliveto, R., Guéhéneuc, Y.-G., & Antoniol, G. (2010). An empirical study on the relation between identifiers and fault proneness. (Technical Report n° EPM-RT-2010-02). https://publications.polymtl.ca/2651/ |