Re-running flaky tests in Gradle

Flaky

The world is not perfect, and neither are tests, so one day you will get… a flaky test 😱

Flaky… what?

As per Nebojša Stričević:

An essential property of an automated test and the entire test suite is its determinism. This means that a test should always have the same result when the tested code doesn’t change. A test that fails randomly is commonly called a flaky test. Flaky tests hinder your development experience, slow down progress, hide real bugs and, in the end, cost money.

tl;dr: Flaky tests are non-deterministic tests. Such tests make you lose time, and time is money.

And, if you think you can always avoid them, just save this article for later ;)

P.S. you can’t.

Automatically re-running failed tests in Gradle

ℹ️ There are many ways to deal with flaky tests using a test framework’s rules, extensions, or other features.
I will not focus on them, since here we’re talking about a generic solution that works for any Gradle build.

For instance, say we have the following test:

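package com.example;

import java.io.File;

// Assumption: JUnit 4. With JUnit 5, import org.junit.jupiter.api.Test instead.
import org.junit.Test;

public class RerunTest {

    @Test
    public void flakyTest() throws Exception {
        // The marker file survives between runs, which is what makes the test flaky
        File file = new File("build/test-rerun.txt");

        // createNewFile() returns true only if the file did not exist yet
        if (file.createNewFile()) {
            throw new IllegalStateException("Rerun me!");
        }

        file.delete();
    }
}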

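Obviously, it passes in only 50% of runs (every other run, to be precise), because it depends on state left over from the previous run: if the file is missing, the test creates it and fails; if the file is already there, the test deletes it and passes.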
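The dependence on leftover state can be reproduced outside of any test framework. Here is a minimal sketch: `FlakyStateDemo`, `runOnce` and `simulate` are made-up names for illustration, and the marker file lives in a temporary directory instead of `build/` so the demo cleans up after itself.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.util.ArrayList;
import java.util.List;

final class FlakyStateDemo {

    // Same logic as flakyTest(), but returns pass/fail instead of throwing.
    static boolean runOnce(File marker) throws IOException {
        if (marker.createNewFile()) {
            return false; // the file did not exist: the real test would throw here
        }
        marker.delete(); // the file existed: remove it, so the next run fails again
        return true;
    }

    // Hypothetical helper: repeats the check against one shared marker file.
    static List<Boolean> simulate(int runs) {
        try {
            File dir = Files.createTempDirectory("rerun-demo").toFile();
            File marker = new File(dir, "test-rerun.txt");
            List<Boolean> results = new ArrayList<>();
            for (int i = 0; i < runs; i++) {
                results.add(runOnce(marker));
            }
            marker.delete(); // no-op if the last run already removed it
            dir.delete();
            return results;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(simulate(4)); // prints [false, true, false, true]
    }
}
```

The first run always fails because the marker starts out missing, which is exactly why an immediate second attempt is enough to turn this particular test green.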

⚠️ WARNING!
Don’t write such tests :D This one is for demonstration purposes only!

If we run it, we will get a failed test:

$ ./gradlew test

> Task :test FAILED

Gradle Test Executor 5 > com.example.RerunTest > flakyTest STARTED

Gradle Test Executor 5 > com.example.RerunTest > flakyTest FAILED
    java.lang.IllegalStateException: Rerun me!
        at com.example.RerunTest.flakyTest(RerunTest.java:17)

1 test completed, 1 failed

Our CI system turns red. We have spent 30 minutes waiting for test results, our change works, but some unrelated test is failing, and we’re getting mad…

We run it again - all good:

$ ./gradlew test

> Task :test

Gradle Test Executor 6 > com.example.RerunTest > flakyTest STARTED

Gradle Test Executor 6 > com.example.RerunTest > flakyTest PASSED

BUILD SUCCESSFUL in 1s
2 actionable tasks: 1 executed, 1 up-to-date

But… that’s another 30 minutes or so, plus a manual action to restart the CI job.

What if, whenever some tests fail, we instead ran another test task right after the first one and tried to get a green status this time?

Sounds like a plan!

Gradle FTW!

First, we need to process every test task in our build:

tasks.withType(Test) {

Now, we will register another task that will strictly follow the original one:

    def rerunTask = tasks.register("${name}Rerun", Test) {
        // Enabled only when there are failures
        enabled = false
        failFast = true // ¯\_(ツ)_/¯
        outputs.upToDateWhen { false }
    }

    // Make it always run after the original task
    finalizedBy(rerunTask)

It is disabled by default because otherwise, with an empty inclusion list, it would rerun all tests. We will enable it once we add something to the list.
Note that we’re using failFast mode, because if a test fails a second time in a row, it definitely requires some attention ;)

But, since we’re registering a new task… Don’t we have a recursion? Sure we do! Let’s ignore such tasks:

tasks.withType(Test) {
    if (name.endsWith("Rerun")) {
        return
    }

Now we need to track every failed test and include it in the Rerun task’s inclusion list:

    afterTest { desc, result ->
        if (TestResult.ResultType.FAILURE == result.resultType) {
            rerunTask.configure {
                enabled = true
                filter.includeTestsMatching("${desc.className}.${desc.name}")
            }
        }
    }

Here, for each test result, we check whether the test failed and, if so, add it to the filters of the rerun task.
The same callback also enables the rerun task, so a single failure is enough to switch it on.
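To make the bookkeeping explicit, here is a plain-Java model of what that callback does. It does not touch the Gradle API: `RerunFilterModel` and its methods are hypothetical names, and the nested `ResultType` enum is only a stand-in for Gradle’s `TestResult.ResultType`.

```java
import java.util.LinkedHashSet;
import java.util.Set;

final class RerunFilterModel {

    // Stand-in for Gradle's TestResult.ResultType
    enum ResultType { SUCCESS, FAILURE, SKIPPED }

    private final Set<String> patterns = new LinkedHashSet<>();

    // Mirrors the afterTest callback: only failures end up in the inclusion list.
    void afterTest(String className, String methodName, ResultType type) {
        if (type == ResultType.FAILURE) {
            patterns.add(className + "." + methodName);
        }
    }

    // The rerun is worth enabling only once at least one failure was recorded.
    boolean rerunEnabled() {
        return !patterns.isEmpty();
    }

    Set<String> patterns() {
        return patterns;
    }

    public static void main(String[] args) {
        RerunFilterModel model = new RerunFilterModel();
        model.afterTest("com.example.RerunTest", "flakyTest", ResultType.FAILURE);
        System.out.println(model.rerunEnabled()); // prints true
        System.out.println(model.patterns());     // prints [com.example.RerunTest.flakyTest]
    }
}
```

If every test passes, the set stays empty and the rerun stays disabled, which is what keeps the second task from running the whole suite again.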

Aaaand… the last tiny bit that makes the whole thing shine: ignoring the failures of the original task!

    ignoreFailures = true

(Full snippet is available here, I may even make a plugin… later 😅)

If we run the build again, we get a successful result despite a failed test during the main run:

$ ./gradlew test
> Task :clean
> Task :compileJava NO-SOURCE
> Task :processResources NO-SOURCE
> Task :classes UP-TO-DATE
> Task :compileTestJava
> Task :processTestResources NO-SOURCE
> Task :testClasses

> Task :test

Gradle Test Executor 14 > com.example.RerunTest > flakyTest STARTED

Gradle Test Executor 14 > com.example.RerunTest > flakyTest FAILED
    java.lang.IllegalStateException: Rerun me!
        at com.example.RerunTest.flakyTest(RerunTest.java:17)

1 test completed, 1 failed

> Task :testRerun

Gradle Test Executor 17 > com.example.RerunTest > flakyTest STARTED

Gradle Test Executor 17 > com.example.RerunTest > flakyTest PASSED

BUILD SUCCESSFUL in 1s
4 actionable tasks: 4 executed

One “That’s a feature, not a bug!” outcome of using two test tasks is that when you aggregate the test reports, you will see some failures in the main run but not in the rerun: free flaky-test reports for everyone 😅
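For the curious, the comparison behind such a report could look roughly like this. It is only a sketch: `FlakyReport` is a made-up class, and it assumes the results of both runs have already been loaded into maps from test name to passed/failed (parsing the actual report files is left out).

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

final class FlakyReport {

    // A test counts as flaky if it failed in the main run and passed in the rerun.
    // In both maps, the value is true when the test passed.
    static Set<String> flaky(Map<String, Boolean> mainRun, Map<String, Boolean> rerun) {
        Set<String> result = new TreeSet<>();
        for (Map.Entry<String, Boolean> entry : mainRun.entrySet()) {
            boolean failedFirst = !entry.getValue();
            boolean passedOnRerun = Boolean.TRUE.equals(rerun.get(entry.getKey()));
            if (failedFirst && passedOnRerun) {
                result.add(entry.getKey());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, Boolean> mainRun = Map.of("RerunTest.flakyTest", false);
        Map<String, Boolean> rerun = Map.of("RerunTest.flakyTest", true);
        System.out.println(flaky(mainRun, rerun)); // prints [RerunTest.flakyTest]
    }
}
```

Tests that fail in both runs are deliberately left out of the result: those are real failures, not flaky ones.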
