Recovering traceability links between code and documentation
- 10 December 2002
- journal article
- Published by Institute of Electrical and Electronics Engineers (IEEE) in IEEE Transactions on Software Engineering
- Vol. 28 (10), 970-983
- https://doi.org/10.1109/tse.2002.1041053
Abstract
Software system documentation is almost always expressed informally in natural language and free text. Examples include requirement specifications, design documents, manual pages, system development journals, error logs, and related maintenance reports. We propose a method based on information retrieval to recover traceability links between source code and free text documents. A premise of our work is that programmers use meaningful names for program items, such as functions, variables, types, classes, and methods. We believe that the application-domain knowledge that programmers process when writing the code is often captured by the mnemonics for identifiers; therefore, the analysis of these mnemonics can help to associate high-level concepts with program concepts and vice-versa. We apply both a probabilistic and a vector space information retrieval model in two case studies to trace C++ source code onto manual pages and Java code to functional requirements. We compare the results of applying the two models, discuss the benefits and limitations, and describe directions for improvements.
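
As a rough illustration of the vector space idea mentioned in the abstract, the Python sketch below splits identifiers into words, weights terms with a smoothed tf-idf scheme, and ranks free-text documents by cosine similarity to a code component. The function names, the sample documents, and the weighting formula are assumptions chosen for illustration; they are not taken from the paper. Stemming, stop-word removal, and the probabilistic model are not shown.

```python
import math
import re
from collections import Counter


def split_identifier(name):
    """Split an identifier such as 'getAccountBalance' or 'max_retry_count'
    into lowercase words (hypothetical helper, not from the paper)."""
    spaced = re.sub(r"([a-z0-9])([A-Z])", r"\1 \2", name)
    return [w.lower() for w in re.split(r"[^A-Za-z]+", spaced) if w]


def tokenize(text):
    """Turn free text into a flat list of lowercase words."""
    words = []
    for token in re.findall(r"[A-Za-z0-9_]+", text):
        words.extend(split_identifier(token))
    return words


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_documents(code_identifiers, documents):
    """Rank documents (name -> text) against the identifiers of one
    code component, most similar first."""
    doc_terms = {name: tokenize(text) for name, text in documents.items()}

    # Document frequency of each term across the document collection.
    n_docs = len(doc_terms)
    df = Counter()
    for terms in doc_terms.values():
        df.update(set(terms))
    # Smoothed inverse document frequency (an assumed weighting choice).
    idf = {t: math.log((1 + n_docs) / (1 + df[t])) + 1.0 for t in df}

    def vector(terms):
        # Terms that never occur in the documents carry no weight.
        tf = Counter(t for t in terms if t in idf)
        return {t: count * idf[t] for t, count in tf.items()}

    query_terms = [w for ident in code_identifiers for w in split_identifier(ident)]
    query = vector(query_terms)
    scores = {name: cosine(query, vector(terms)) for name, terms in doc_terms.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Made-up documents and identifiers, for illustration only.
documents = {
    "withdraw.txt": "Withdraw money from an account and update the account balance.",
    "login.txt": "Authenticate the user with a password before opening a session.",
}
identifiers = ["withdrawAmount", "accountBalance", "update_balance"]

ranking = rank_documents(identifiers, documents)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

In this toy input the identifiers share words only with the first document, so it ranks first; a candidate traceability link would then be proposed for the top-ranked documents, typically above some similarity threshold or rank cut-off.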