PREDICTING DEFECT CONTENT AND QUALITY ASSURANCE EFFECTIVENESS BY COMBINING EXPERT JUDGMENT AND DEFECT DATA - A CASE STUDY

Michael Kläs¹, Haruka Nakao², Frank Elberzhager¹, Jürgen Münch¹

¹ Fraunhofer Institute for Experimental Software Engineering
² Safety & Product Assurance Dep.

michael.klaes@iese.fraunhofer.de


Abstract

Planning quality assurance (QA) activities systematically and controlling their execution are challenging tasks for companies that develop software or software-intensive systems. One approach to systematic planning is to assess the remaining quality risk (i.e., undetected defects) based on the expected effectiveness of the planned QA activities and the expected defect content of the checked artifact. One approach to controlling the execution of QA activities is to monitor whether the number of detected defects stays within defined thresholds. Both planning and controlling therefore require the ability to estimate the effectiveness of the applied QA techniques and the defect content of the checked artifacts. Existing approaches for these purposes require extensive measurement data from historical projects. Because many companies do not collect enough data to apply these approaches (especially in the early phases of the project life cycle), they typically base their QA planning and controlling solely on expert opinion. This article presents a hybrid method that combines commonly available measurement data with context-specific expert knowledge. To evaluate the method’s applicability and usefulness, we conducted a case study in the context of independent verification and validation activities for critical software in the space domain. A hybrid defect content and effectiveness model was developed for the software requirements analysis phase and evaluated with available legacy data. One major result is that the hybrid model provides better estimation accuracy than applicable models based solely on data: the mean magnitude of relative error (MMRE) determined by cross-validation is 29.6%, compared to 76.5% for the most accurate data-based model.
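
For readers unfamiliar with the two quantities mentioned in the abstract, the following minimal Python sketch illustrates them under common textbook definitions: MMRE as the mean of |actual - estimate| / actual, and the remaining quality risk as the defect content multiplied by the share of defects that a QA activity is expected to miss. The function names and all numbers are hypothetical and chosen only for illustration; they are not taken from the case study, and the exact definitions used in the study may differ in detail.

```python
def mmre(actuals, estimates):
    """Mean magnitude of relative error: mean of |actual - estimate| / actual.

    Assumes every actual value is non-zero (hypothetical helper, not from the paper).
    """
    if len(actuals) != len(estimates) or not actuals:
        raise ValueError("need two non-empty sequences of equal length")
    return sum(abs(a - e) / a for a, e in zip(actuals, estimates)) / len(actuals)


def remaining_defects(defect_content, effectiveness):
    """Expected number of undetected defects after a QA activity.

    defect_content: estimated number of defects in the checked artifact.
    effectiveness:  estimated fraction of those defects the activity finds (0..1).
    """
    if not 0.0 <= effectiveness <= 1.0:
        raise ValueError("effectiveness must be between 0 and 1")
    return defect_content * (1.0 - effectiveness)


if __name__ == "__main__":
    # Illustrative numbers only: actual vs. estimated defect counts for two artifacts.
    print(mmre([10, 20], [8, 25]))
    # Illustrative numbers only: 40 expected defects, QA activity finds 75% of them.
    print(remaining_defects(40, 0.75))
```

In a cross-validation setting, each estimate passed to `mmre` would come from a model built without the corresponding data point, so that the error reflects predictions for unseen artifacts.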