Selecting fault revealing mutants

T Titcheu Chekam, M Papadakis, TF Bissyandé… - Empirical Software …, 2020 - Springer
Empirical Software Engineering, 2020, Springer
Abstract
Mutant selection refers to the problem of choosing, among a large number of mutants, the (few) ones that should be used by the testers. In view of this, we investigate the problem of selecting the fault revealing mutants, i.e., the mutants that are killable and lead to test cases that uncover unknown program faults. We formulate two variants of this problem: fault revealing mutant selection and fault revealing mutant prioritization. We argue and show that these problems can be tackled through a set of ‘static’ program features and propose a machine learning approach, named FaRM, that learns to select and rank killable and fault revealing mutants. Experimental results involving 1,692 real faults show the practical benefits of our approach in both examined problems. Our results show that FaRM achieves a good trade-off between application cost and effectiveness (measured in terms of faults revealed). We also show that FaRM outperforms all the existing mutant selection methods, i.e., random mutant sampling, selective mutation, and defect prediction (mutating the code areas pointed to by defect prediction). In particular, our results show that, with respect to mutant selection, our approach reveals 23% to 34% more faults than any of the baseline methods, while, with respect to mutant prioritization, it achieves a higher average percentage of revealed faults, with a median difference of between 4% and 9% over random mutant orderings.
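The difference between the two problem variants named in the abstract can be illustrated with a minimal Python sketch. It is not FaRM itself: the per-mutant scores are hypothetical stand-ins for what a trained classifier might output, the function names are invented for illustration, and `average_fraction_revealed` is a simplified stand-in for the paper's "average percentage of revealed faults", whose exact definition is not given in the abstract.

```python
from typing import Dict, List, Set


def prioritize(scores: Dict[str, float]) -> List[str]:
    """Prioritization: order ALL mutants from highest to lowest score.

    Ties are broken by mutant name so the ordering is deterministic.
    """
    return sorted(scores, key=lambda m: (-scores[m], m))


def select(scores: Dict[str, float], k: int) -> List[str]:
    """Selection: keep only the k highest-scored mutants."""
    return prioritize(scores)[:k]


def faults_revealed(mutants: List[str], reveals: Dict[str, Set[str]]) -> Set[str]:
    """Union of the faults revealed by the given mutants."""
    found: Set[str] = set()
    for m in mutants:
        found |= reveals.get(m, set())
    return found


def average_fraction_revealed(
    ranking: List[str], reveals: Dict[str, Set[str]], all_faults: Set[str]
) -> float:
    """Mean, over every prefix of the ranking, of the fraction of faults revealed.

    Assumption: a simplified proxy, not the metric definition used in the paper.
    """
    if not ranking or not all_faults:
        return 0.0
    found: Set[str] = set()
    total = 0.0
    for m in ranking:
        found |= reveals.get(m, set()) & all_faults
        total += len(found) / len(all_faults)
    return total / len(ranking)


if __name__ == "__main__":
    # Hypothetical data: scores from some model, and which mutant reveals which fault.
    scores = {"m1": 0.9, "m2": 0.2, "m3": 0.7, "m4": 0.1}
    reveals = {"m1": {"f1"}, "m3": {"f2"}, "m4": {"f1"}}
    all_faults = {"f1", "f2"}

    chosen = select(scores, 2)
    print("selected:", chosen, "reveals:", sorted(faults_revealed(chosen, reveals)))
    ranking = prioritize(scores)
    print("ranking:", ranking)
    print("avg fraction revealed:", average_fraction_revealed(ranking, reveals, all_faults))
```

Selection is judged by the faults revealed within a fixed budget of k mutants, whereas prioritization is judged over the whole ordering, which is why the abstract reports the two results with different measures.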