HUAWEI Technologies
Software Fault Injection Test (SFIT) is a test method that injects possible faults into a software system to validate the system's fault-tolerance capability. SFIT is a system-level, module-based software fault test method. It can effectively discover and validate defects introduced during software system design and development, and thereby improve software system reliability. The scope of SFIT covers problems in software architecture, problems in function, and problems that arise during software operation. Depending on the phase in which it is performed, SFIT is divided into module-level test and product-level test; the basic principles of the two are the same.
This paper introduces the position of SFIT in the development process and its relationship with Software Failure Modes and Effects Analysis (SFMEA). In SFMEA, a large number of failure modes are considered and their effects are analyzed. As the closed-loop validation activity of SFMEA, SFIT simulates the software faults that may result from potential defects. It then validates whether the software system under test takes fault-tolerance and protection actions, whether those actions are effective, and whether the software reliability of the product meets the design requirements. SFIT needs to fulfill the following three missions:
1. Validate the unresolved problems and disputed points raised during reliability analysis.
2. Perform regression and validation testing of reliability problems after they have been fixed. All failure modes analyzed during the SFMEA activity must be validated by means of SFIT.
3. Taking the design doubts and weaknesses identified during SFMEA as the focus, perform tests using the failure modes from the “Failure Mode Repository” to discover the reliability problems that remain after the SFMEA activity.
The process of SFIT is as follows:
1. Extract the SFIT needs. After brainstorming and review, determine which test items will be executed, and which failure modes will be simulated.
2. Develop the SFIT scheme.
3. Develop the SFIT cases.
4. Execute the SFIT cases.
5. Modify the design according to the test results. Changes are made systematically, according to the effects and priorities of the problems found.
Examples of fault types that can be simulated and tested using general test methods include abnormal operation, abnormal configuration, and input errors. For fault types or abnormal scenarios that cannot be simulated using general test methods, specific SFIT tools can be used. Following the ideas of CBB and COTS, the commonly used fault injection means are modularized, and tools are developed to support automated SFIT. The effectiveness and operability of SFIT are thereby improved dramatically.
The main advantages of SFIT are as follows:
1. By performing SFIT at every level, test coverage can be increased remarkably, weaknesses in the software system design can be found earlier, the test effort in later phases can be reduced, the time to market of the project can be shortened, and problems that would otherwise be left to field operation can be avoided.
2. Problems are found earlier, so the design and implementation can be optimized earlier.
3. SFIT provides a basis for reliability growth and reliability evaluation.
This paper introduces in detail the techniques and methods of SFIT in the phases of the software development life cycle. An actual analyzed object is given as an example, with the corresponding charts and explanatory text. Finally, we report our results with the SFIT activity: the “problem weighted index” decreased by over 25% during the system test phase after SFIT was performed.
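As an illustration of the closed loop between SFMEA failure modes and SFIT, the following minimal Python sketch injects three simulated faults into a small module and checks whether the module's protection action keeps its result valid. It is not the tooling described in this paper. Every name in it (`tolerant_load`, `read_config_value`, `FAILURE_MODES`, `run_sfit`, the valid range 1 to 100, the default of 10) is a hypothetical assumption chosen for the example.

```python
from typing import Callable, Dict

Reader = Callable[[str], int]


class InjectedFault(Exception):
    """Simulated failure raised by an injected fault (hypothetical)."""


def read_config_value(raw: str) -> int:
    """Fault-free dependency: parse a configuration value."""
    return int(raw)


def tolerant_load(reader: Reader, raw: str, default: int = 10) -> int:
    """Module under test: fall back to a default on failure or bad data."""
    try:
        value = reader(raw)
    except (InjectedFault, ValueError):
        return default
    # Protection action: reject values outside the assumed valid range.
    return value if 1 <= value <= 100 else default


def dependency_exception(reader: Reader) -> Reader:
    """Failure mode: the dependency fails outright."""
    def faulty(raw: str) -> int:
        raise InjectedFault("simulated dependency failure")
    return faulty


def corrupted_output(reader: Reader) -> Reader:
    """Failure mode: the dependency returns a corrupted (negated) value."""
    def faulty(raw: str) -> int:
        return -reader(raw)
    return faulty


def input_error(reader: Reader) -> Reader:
    """Failure mode: the dependency receives malformed input."""
    def faulty(raw: str) -> int:
        return reader("not-a-number")
    return faulty


# Stand-in for a "Failure Mode Repository": name -> fault injector.
FAILURE_MODES: Dict[str, Callable[[Reader], Reader]] = {
    "dependency_exception": dependency_exception,
    "corrupted_output": corrupted_output,
    "input_error": input_error,
}


def run_sfit(raw: str) -> Dict[str, bool]:
    """Inject each failure mode and record whether it was tolerated."""
    results: Dict[str, bool] = {}
    for name, inject in FAILURE_MODES.items():
        try:
            value = tolerant_load(inject(read_config_value), raw)
            results[name] = 1 <= value <= 100
        except Exception:
            # An escaping exception means the fault was not tolerated.
            results[name] = False
    return results


if __name__ == "__main__":
    for name, ok in run_sfit("42").items():
        print(f"{name}: {'tolerated' if ok else 'NOT tolerated'}")
```

Keeping the failure modes in a table separate from the module under test mirrors, in a very small way, the idea of modularized and reusable fault injection means: a new failure mode is added by registering one more injector, without changing the test loop.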