AutoODC: Automated Generation of Orthogonal Defect Classifications

L. Huang, V. Ng, I. Persing, R. Geng, X. Bai, J. Tian
Orthogonal Defect Classification (ODC), Text Classification, Natural Language Processing

Orthogonal Defect Classification (ODC), the most influential framework for software defect classification and analysis, provides valuable in-process feedback to system development and maintenance. Conducting ODC classification on existing organizational defect reports is labor-intensive and requires expert knowledge of both ODC and the system domain. This paper presents AutoODC, an approach and tool for automating ODC classification by casting it as a supervised text classification problem. Rather than merely applying the standard machine learning framework to this task, we seek to acquire a better ODC classification system by integrating experts' ODC experience and domain knowledge into the learning process through a novel Relevance Annotation Framework. We evaluated AutoODC on an industrial defect report from the social network domain. AutoODC is a promising approach: not only does it require minimal human effort beyond the human annotations typically required by standard machine learning approaches, but it also achieves an overall accuracy of 80.2% when using manual classifications as a basis of comparison.
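
To make the "supervised text classification" framing concrete, the sketch below trains a small bag-of-words classifier on labeled defect descriptions and scores it against manual labels. It is an illustration only, not AutoODC itself: the multinomial Naive Bayes learner is a stand-in chosen for brevity, the Relevance Annotation Framework is not modeled, and the reports, the labels, and every name in the code (`NaiveBayesTextClassifier`, `TRAIN`, `accuracy`) are hypothetical.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase a defect description and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


class NaiveBayesTextClassifier:
    """Multinomial Naive Bayes over bag-of-words counts, with add-one smoothing.

    A stand-in learner for illustration; not the learner used by AutoODC.
    """

    def fit(self, reports, labels):
        self.label_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for text, label in zip(reports, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocab = {w for counts in self.word_counts.values() for w in counts}
        return self

    def predict(self, text):
        # Words never seen in training carry no evidence, so drop them.
        tokens = [t for t in tokenize(text) if t in self.vocab]
        total_docs = sum(self.label_counts.values())
        best_label, best_score = None, -math.inf
        for label in sorted(self.label_counts):
            # Log prior plus the sum of smoothed log likelihoods.
            score = math.log(self.label_counts[label] / total_docs)
            denom = sum(self.word_counts[label].values()) + len(self.vocab)
            for t in tokens:
                score += math.log((self.word_counts[label][t] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


def accuracy(classifier, reports, manual_labels):
    """Fraction of reports whose predicted label matches the manual label."""
    hits = sum(classifier.predict(r) == y for r, y in zip(reports, manual_labels))
    return hits / len(reports)


# Hypothetical toy data: invented defect descriptions with invented labels.
TRAIN = [
    ("page takes too long to load the news feed", "Performance"),
    ("search query is slow when the friend list is large", "Performance"),
    ("upload button is hidden and hard to find", "Usability"),
    ("error message is confusing and the menu layout is unclear", "Usability"),
]

clf = NaiveBayesTextClassifier().fit(
    [text for text, _ in TRAIN], [label for _, label in TRAIN]
)
print(clf.predict("the profile page is slow to load"))
```

In a realistic setting, `accuracy` would be computed on held-out reports that experts have classified by hand, which mirrors the abstract's use of manual classifications as the basis of comparison.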

Publish Date: Thursday, November 1, 2012
26th IEEE/ACM International Conference on Automated Software Engineering (ASE 2011), Lawrence, Kansas