In fault diagnosis, an important performance metric is the accuracy of the multiple-fault diagnosis D returned by a diagnostic algorithm (DA). In many cases diagnostic accuracy is insufficient because D comprises too many candidates. Diagnostic uncertainty can be reduced by supplying more modeling information or more observation information. Observation information can be added in a spatial sense (more probing points) or in a temporal sense (more tests); the latter is the focus of this paper. Ignoring the computational cost of the DA itself, the diagnostic process may be viewed as (1) expending costs to generate observation input to the DA (testing cost Ct), and (2) expending costs for the diagnostician to find the actual diagnosis among the candidates in D returned by the DA (diagnostic cost Cd). The latter is inversely related to diagnostic accuracy (utility), as false positives and false negatives in D result in effort wasted by the diagnostician in finding the components that are actually faulty.
In sequential diagnosis (SD) one computes the optimal sequence of tests (observations) that, on average, minimizes Ct and Cd. SD approaches are typically found in the (hardware) systems testing domain, where system knowledge is encoded in terms of predetermined test matrices, or in the model-based diagnosis domain, where system models are available. In the software domain, the domain considered in this paper, the problem of diagnosis has received considerable attention. Due to complexity problems, most model-based approaches are limited to small programs and/or must make a single-fault assumption. Typically, statistical approaches are used that abstract from the program code and compute the correlation between component involvement in tests (expressed in terms of a so-called spectrum) on the one hand, and the test outcomes (pass/fail) on the other, deriving a ranking of suspect components. While these approaches essentially involve multiple observations (e.g., an entire regression test suite), they do not qualify as SD, since the choice of tests is random (a passive diagnosis approach) rather than generating or selecting the next best test based on the diagnosis obtained thus far (an active diagnosis approach). Hence, the decrease of Cd per test is far from optimal.
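As an illustration of the statistical, spectrum-based ranking described above, the following minimal sketch correlates component involvement with test outcomes. It assumes the Ochiai similarity coefficient, which is one commonly used choice and is not named in this abstract; the function name `ochiai_ranking` and the toy spectrum are hypothetical.

```python
import math


def ochiai_ranking(coverage, outcomes):
    """Rank components by similarity between involvement and test failure.

    coverage[i][j] is 1 if test i involved component j, else 0 (the spectrum).
    outcomes[i] is True if test i failed.
    Assumption: the Ochiai coefficient is used as the similarity measure.
    """
    n_components = len(coverage[0])
    total_failed = sum(1 for failed in outcomes if failed)
    scores = []
    for j in range(n_components):
        # n11: failed tests involving j; n10: passed tests involving j.
        n11 = sum(1 for row, failed in zip(coverage, outcomes) if row[j] and failed)
        n10 = sum(1 for row, failed in zip(coverage, outcomes) if row[j] and not failed)
        denom = math.sqrt(total_failed * (n11 + n10))
        scores.append(n11 / denom if denom > 0 else 0.0)
    # Most suspect component first; ties keep their original order.
    ranking = sorted(range(n_components), key=lambda j: scores[j], reverse=True)
    return ranking, scores


# Toy spectrum: three tests over three components; the first two tests fail.
coverage = [[1, 1, 0],
            [0, 1, 1],
            [1, 0, 1]]
outcomes = [True, True, False]
ranking, scores = ochiai_ranking(coverage, outcomes)
# ranking == [1, 0, 2]; component 1 is involved in both failing tests only.
```

Note that the ranking uses whatever tests happen to be available; nothing in it chooses which test to run next, which is the passive behavior the paragraph above contrasts with SD.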
In this paper we present a spectrum-based SD approach and an associated algorithm dubbed SEQUOIA (SEQUencing fOr dIAgnosis), which greedily computes a sequence out of a large set of given tests that delivers near-optimal diagnostic performance in terms of the decay of Cd as a function of the number of tests. Unlike statistical techniques, which have insufficient diagnostic accuracy to select the next best test, we use an approximate Bayesian reasoning approach that provides the required accuracy. Similar to approaches based on a test coverage matrix, the tests are selected from a fixed test set. Unlike multiple-fault, matrix-based approaches, however, our approximate Bayesian approach can handle large system sizes and test set sizes. Furthermore, unlike model-based approaches, no system information is required other than component test coverage, which is typically known from test execution profiling.
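To make the idea of Bayesian reasoning over multiple-fault candidates concrete, here is a minimal sketch of a posterior update that needs only component test coverage. It is not the SEQUOIA algorithm: it enumerates all candidates up to a small fault cardinality (which does not scale), and it assumes a simple likelihood model in which a test passes with probability `h` raised to the number of faulty components it covers. The names `prior_over_candidates`, `bayes_update`, `p_fault`, and `h` are hypothetical.

```python
from itertools import combinations


def prior_over_candidates(n_components, max_faults, p_fault):
    """Enumerate candidates (sets of faulty components) with a normalized prior.

    Assumption: components fail independently with probability p_fault.
    """
    prior = {}
    for k in range(max_faults + 1):
        for combo in combinations(range(n_components), k):
            prior[frozenset(combo)] = p_fault ** k * (1 - p_fault) ** (n_components - k)
    total = sum(prior.values())
    return {d: p / total for d, p in prior.items()}


def bayes_update(posterior, test_row, failed, h=0.0):
    """Update candidate probabilities after observing one test outcome.

    test_row[j] is 1 if the test covers component j.
    Assumption: P(pass | candidate) = h ** (number of covered faulty components),
    so h = 0.0 means a covered fault always makes the test fail.
    """
    updated = {}
    for d, p in posterior.items():
        covered_faults = sum(1 for c in d if test_row[c])
        p_pass = h ** covered_faults
        updated[d] = p * ((1.0 - p_pass) if failed else p_pass)
    total = sum(updated.values())
    if total == 0:
        raise ValueError("observation is inconsistent with every candidate")
    return {d: p / total for d, p in updated.items()}


# Three components, at most two simultaneous faults.
posterior = prior_over_candidates(3, max_faults=2, p_fault=0.1)
# A test covering components 0 and 1 fails, then a test covering 0 and 2 passes.
posterior = bayes_update(posterior, [1, 1, 0], failed=True)
posterior = bayes_update(posterior, [1, 0, 1], failed=False)
# With h = 0.0 the only remaining candidate is {1}.
```

Each observation reweights the whole candidate set, so the posterior reflects every test seen so far rather than a per-component correlation score.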
To achieve polynomial complexity, SEQUOIA uses an approximate information gain computation. Results on synthetic data show that dynamically selecting the next best test based on the observations made so far allows SEQUOIA to achieve a much better decay of diagnostic uncertainty than random test sequencing. Experiments with real programs taken from the Siemens set also show that SEQUOIA performs better, except when the diagnosis involves ambiguity sets that are too large for the entropy estimation to handle. Future work therefore includes solving this problem with a better expansion technique that expands the minimal hitting set (MHS) in order of posterior probability (which is currently ignored in our prototype approach).
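The selection of the next best test by information gain can be sketched as follows. This is a simplified illustration, not SEQUOIA's approximation: it computes the exact expected entropy reduction over an explicitly enumerated candidate set, and it assumes the same hypothetical likelihood model in which a test passes with probability `h` raised to the number of faulty components it covers. All function names are hypothetical.

```python
import math


def entropy(posterior):
    """Shannon entropy (in bits) of a distribution over diagnosis candidates."""
    return -sum(p * math.log2(p) for p in posterior.values() if p > 0)


def pass_probability(candidate, test_row, h):
    """Assumed model: each covered faulty component lets the test pass w.p. h."""
    covered_faults = sum(1 for c in candidate if test_row[c])
    return h ** covered_faults


def expected_information_gain(posterior, test_row, h=0.0):
    """Expected entropy reduction from running one test (exact, not approximated)."""
    p_pass = sum(p * pass_probability(d, test_row, h) for d, p in posterior.items())
    gain = entropy(posterior)
    for outcome_prob, failed in ((p_pass, False), (1.0 - p_pass, True)):
        if outcome_prob <= 0:
            continue  # this outcome cannot occur, so it contributes nothing
        updated = {}
        for d, p in posterior.items():
            pp = pass_probability(d, test_row, h)
            updated[d] = p * ((1.0 - pp) if failed else pp) / outcome_prob
        gain -= outcome_prob * entropy(updated)
    return gain


def next_best_test(posterior, coverage, remaining, h=0.0):
    """Greedy step: pick the unexecuted test with the highest expected gain."""
    return max(remaining, key=lambda i: expected_information_gain(posterior, coverage[i], h))


# Four equally likely single-fault candidates and three available tests.
posterior = {frozenset({j}): 0.25 for j in range(4)}
coverage = [[1, 1, 1, 1],   # always fails: no information
            [1, 1, 0, 0],   # splits the candidates in half: 1 bit
            [1, 0, 0, 0]]   # isolates one candidate: less than 1 bit
best = next_best_test(posterior, coverage, remaining=[0, 1, 2])
# best == 1
```

A full sequencing loop would alternate this greedy step with a posterior update on the observed outcome; a random sequence, by contrast, may spend tests such as the first one above that cannot reduce uncertainty at all.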
Keywords: testing, multiple faults, sequential diagnosis