MEASURING THE FIELD RELIABILITY OF SOFTWARE VERSIONS

Robert Mullen

Cisco

bomullen@cisco.com


Abstract

Improving reliability and quality in a systematic way begins with having a normalized metric. At ISSRE 2005 we described experience at Cisco Systems, Inc. with such a normalized metric for software, SWDPM (Software Defects per Million). That metric focused on determining the software problem rate in the field for specific hardware platforms for which the number of units shipped was known. It evolved into the SPR metric, subsequently adopted as a standard by the QuEST Forum to enable telcos to compare vendors’ products.

In 2005 we identified three problems with SWDPM. (1) We needed to extend it to software-only products by defining a different denominator. (2) The method of normalizing by hardware platforms did not allow us to generate SWDPM trends for different versions of software. (3) We needed to inject the metric into our ordinary process as it became better understood. The first two greatly impeded widespread use and therefore delayed progress against the third as well. Without visibility into the reliability of software products at the version level, we cannot tell whether we are making progress. Here we solve those problems.

We present a new, practical method of estimating the usage of software versions. The method works whether or not the software versions are tied to hardware platforms, solving the first two problems above. We compare the method’s convenience and accuracy to alternatives. We present examples of the insights that can be gained regarding the reliability of related versions of software as well as the reliability of the related software on different hardware platforms.

Finally, we describe how the data and insights are being absorbed into our release process, and how they are applied to ad hoc questions arising during release management.