
An empirical study on the relation between identifiers and fault proneness

Venera Arnaoudova, Laleh Eshkevari, Rocco Oliveto, Yann-Gaël Guéhéneuc, Giuliano Antoniol

Technical Report (2010)

Open Access document in PolyPublie and at official publisher
Published Version
Terms of Use: All rights reserved


Poorly-chosen identifiers have been reported in the literature as misleading and increasing the program comprehension effort. Identifiers are composed of terms, which can be dictionary words, acronyms, contractions, or simple strings. We conjecture that the use of identical terms in different contexts may increase the risk of faults. We investigate our conjecture using a measure combining term entropy and term context-coverage to study whether certain terms increase the odds ratios of methods to be fault-prone. Entropy measures the physical dispersion of terms in a program: the higher the entropy, the more scattered across the program the terms. Context coverage measures the conceptual dispersion of terms: the higher their context coverage, the more unrelated the methods using them. We compute term entropy and context-coverage of terms extracted from identifiers in Rhino 1.4R3 and ArgoUML 0.16. We show statistically that methods containing terms with high entropy and context-coverage are more fault-prone than others.
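
As an illustration of the entropy idea described in the abstract, the sketch below computes a Shannon entropy over the methods in which a term occurs. It is only a minimal sketch under stated assumptions: the abstract does not give the formula, so the use of plain Shannon entropy in bits, the function name `term_entropy`, and the example method names and counts are all hypothetical and may differ from the definition used in the report. Context coverage is not sketched here.

```python
import math


def term_entropy(occurrences):
    """Shannon entropy (in bits) of a term's occurrences across methods.

    `occurrences` maps a method name to the number of times the term
    appears in that method's identifiers. This is an assumed formulation,
    not necessarily the one used in the report.
    """
    total = sum(occurrences.values())
    if total == 0:
        return 0.0
    entropy = 0.0
    for count in occurrences.values():
        if count > 0:
            p = count / total  # share of the term's occurrences in this method
            entropy -= p * math.log2(p)
    return entropy


# Hypothetical data: a term used in one method only ...
concentrated = {"parseExpr": 4}
# ... versus the same number of uses spread evenly over four methods.
scattered = {"parseExpr": 1, "drawNode": 1, "saveFile": 1, "initCache": 1}

print(term_entropy(concentrated))
print(term_entropy(scattered))
```

Under this assumed formulation, a term confined to a single method has the lowest possible entropy, and a term spread evenly over many methods has a higher one, which matches the abstract's reading of entropy as physical dispersion.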

Uncontrolled Keywords

Source code identifiers, fault models, program comprehension

Subjects: 2700 Information technology > 2700 Information technology
2700 Information technology > 2705 Software and development
2700 Information technology > 2706 Software engineering
2700 Information technology > 2720 Computer systems software
Department: Department of Computer Engineering and Software Engineering
Funders: CRSNG/NSERC, Fonds de recherche Nature et technologies Québec
PolyPublie URL: https://publications.polymtl.ca/2651/
Report number: EPM-RT-2010-02
Date Deposited: 06 Oct 2017 13:52
Last Modified: 11 Nov 2022 08:22
Cite in APA 7: Arnaoudova, V., Eshkevari, L., Oliveto, R., Guéhéneuc, Y.-G., & Antoniol, G. (2010). An empirical study on the relation between identifiers and fault proneness (Technical Report n° EPM-RT-2010-02). https://publications.polymtl.ca/2651/

