Smarter Testing of Evolving Software Systems

The overall aim of the project is to improve verification and validation (V&V) using data-driven techniques that exploit current and historical data obtained from system development and execution.

The initial goal of the project is to use historical development data to improve the level of detail in regression testing. Regression testing is the testing activity performed after changes have been made to an existing software system. It gives confidence that the changes do not negatively affect the behavior of the unchanged parts of the software (the correctness of the newly introduced changes needs to be tested separately). Regression testing is generally performed by executing test suites on the software system, under the assumption that the tests should have similar outcomes before and after the changes were made.

These test suites tend to grow in size as the software evolves. Moreover, certain parts of a test suite may become outdated over time, or they might start to overlap with other parts. The overall result is that it becomes too costly to execute the entire test suite available for a system (a conservative approach known as the retest-all approach).

The project investigates smarter testing techniques that help to maximize the value of the available test suites, reducing the cost of testing while preserving high coverage and fault detection properties. The principal underlying idea in this project is to drive these techniques by analyzing historical data (and trends in that data) about the changes made, and the characteristics observed, in successive releases of the system. One of the main research challenges will be devising techniques to establish correct and detailed traceability links between the software artifacts (requirements, designs, code) on the one side and testing artifacts (procedures, suites, cases) on the other side, and techniques to consistently maintain these traceability links during software evolution.

Envisioned research directions for smarter testing techniques include:

  1. test case selection, which aims to identify the test cases that are relevant to a given set of changes;
  2. test suite minimization, which aims to remove redundant test cases in order to reduce the number of tests to run;
  3. test case prioritization, which aims to determine an order for running test cases such that faults are detected as early as possible in the process (i.e., testing the most vulnerable areas first).
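To make the three directions concrete, the following is a minimal Python sketch and not the project's actual technique. It assumes a hypothetical per-test coverage map (test name to the set of source files it exercises) and a hypothetical pass/fail history per test. All names (`select_tests`, `minimize_suite`, `prioritize`, and the example file and test names) are invented for illustration. Minimization is shown as a simple greedy set-cover heuristic, and prioritization as ordering by historical failure rate.

```python
from typing import Dict, List, Set


def select_tests(coverage: Dict[str, Set[str]], changed: Set[str]) -> List[str]:
    """Selection: keep only tests that exercise at least one changed file."""
    return sorted(t for t, files in coverage.items() if files & changed)


def minimize_suite(coverage: Dict[str, Set[str]]) -> List[str]:
    """Minimization: greedy set cover that preserves the suite's total coverage."""
    remaining = set().union(*coverage.values())
    kept: List[str] = []
    while remaining:
        # Pick the test covering the most still-uncovered files;
        # iterating in sorted order breaks ties alphabetically.
        best = max(sorted(coverage), key=lambda t: len(coverage[t] & remaining))
        kept.append(best)
        remaining -= coverage[best]
    return kept


def prioritize(tests: List[str], history: Dict[str, List[bool]]) -> List[str]:
    """Prioritization: run tests with the highest historical failure rate first."""
    def failure_rate(test: str) -> float:
        runs = history.get(test, [])
        if not runs:
            return 0.0  # assumption: tests without history go last
        return sum(1 for passed in runs if not passed) / len(runs)

    return sorted(tests, key=lambda t: (-failure_rate(t), t))


# Hypothetical example data (not taken from any real system).
coverage = {
    "test_login": {"auth.py", "session.py"},
    "test_logout": {"session.py"},
    "test_report": {"report.py", "db.py"},
    "test_db": {"db.py"},
}
history = {
    "test_login": [True, True, False, True],     # failed 1 of 4 runs
    "test_logout": [False, False, True, True],   # failed 2 of 4 runs
}

selected = select_tests(coverage, changed={"session.py"})
print(selected)                       # ['test_login', 'test_logout']
print(minimize_suite(coverage))       # ['test_login', 'test_report']
print(prioritize(selected, history))  # ['test_logout', 'test_login']
```

In this sketch the quality of all three steps depends entirely on how accurate the coverage map and history are, which is one reason the traceability links discussed above matter.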

Since software testing can only validate the system against anticipated faults, a second goal of the project is to investigate data-driven technology that supports handling of unanticipated faults or anomalies such as non-fatal performance degradation.

In 2017, we start investigating this line of work with a focus on anomaly detection and anomaly diagnosis. The initial idea is to investigate the use of machine learning algorithms to learn patterns or models of the system's normal behaviour from historical operational data, and to apply these models to run-time operational data to detect unanticipated anomalies. Next, we will investigate how we can augment this detection with mining of development data to support diagnosis and understanding of the root causes of an anomaly.
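The learn-then-detect idea can be illustrated with a deliberately simple stand-in. The Python sketch below swaps the machine learning model for a plain statistical baseline (mean and standard deviation of a single metric) and flags run-time observations that deviate strongly from it. The metric (response times in milliseconds), the function names, and the threshold of three standard deviations are all assumptions made for illustration, not choices stated by the project.

```python
from statistics import mean, stdev
from typing import List, Sequence, Tuple


def fit_baseline(history: Sequence[float]) -> Tuple[float, float]:
    """Learn a model of normal behaviour: mean and sample standard deviation."""
    return mean(history), stdev(history)


def detect_anomalies(
    baseline: Tuple[float, float],
    observations: Sequence[float],
    threshold: float = 3.0,  # assumed cut-off, in standard deviations
) -> List[int]:
    """Return the indices of observations that deviate from the baseline."""
    mu, sigma = baseline
    if sigma == 0:
        # Degenerate baseline: any value other than the mean is unusual.
        return [i for i, x in enumerate(observations) if x != mu]
    return [
        i for i, x in enumerate(observations)
        if abs(x - mu) / sigma > threshold
    ]


# Hypothetical response times (ms) from historical, normal operation.
historical = [100, 102, 98, 101, 99, 100, 103, 97]
baseline = fit_baseline(historical)   # mean 100, standard deviation 2.0

# Hypothetical run-time measurements; the third one is a slowdown.
runtime = [101, 99, 140, 100, 95]
print(detect_anomalies(baseline, runtime))  # [2]
```

A non-fatal performance degradation such as the 140 ms measurement would not fail a functional test, yet it stands out against the learned baseline. A realistic detector would need to handle multiple metrics, seasonality, and drift, which this sketch ignores.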


This project is aimed at enabling smarter V&V of evolving software systems by devising practical, scalable, cost-effective, and automated techniques based on the analysis of current and historical data (and the trends and patterns in that data) obtained from system development and execution. Sources include (1) changes made, and characteristics observed, in successive releases of the system, (2) operational data from system execution (such as traces and performance measures), and (3) operational data from system testing and continuous integration (such as test success and failure data).

Overall goals of the project are to:

  • devise a set of novel techniques for smarter V&V of evolving systems that will significantly improve the quality of industrial systems, initially aimed at enabling increased detail in (regression) testing and later extended to techniques that help deal with unanticipated anomalies;
  • empirically evaluate the costs, effectiveness, and scalability of the proposed techniques using well-established methods such as controlled experiments, case studies, and surveys;
  • demonstrate the applicability of the proposed techniques as proof-of-concept tools.