Re-running flaky tests in Gradle
The world is not perfect, and neither are tests, so one day you will get… a flaky test 😱
Flaky… what?
An essential property of an automated test and the entire test suite is its determinism. This means that a test should always have the same result when the tested code doesn’t change. A test that fails randomly is commonly called a flaky test. Flaky tests hinder your development experience, slow down progress, hide real bugs and, in the end, cost money.
tl;dr: Flaky tests are non-deterministic tests. Such tests make you lose time, and time is money.
And, if you think you can always avoid them, just save this article for later ;)
P.S. you can’t.
Automatically re-running failed tests in Gradle
ℹ️ There are many ways to deal with flaky tests using your test framework’s rules, extensions, or features.
I will not focus on them, since here we’re talking about a generic solution that works for any Gradle build.
For instance, suppose we have the following test:
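package com.example;

import java.io.File;
import org.junit.Test; // assumption: JUnit 4; with JUnit 5, import org.junit.jupiter.api.Test instead

public class RerunTest {
    @Test
    public void flakyTest() throws Exception {
        // Marker file that survives between test runs
        File file = new File("build/test-rerun.txt");
        if (file.createNewFile()) {
            // Marker was absent: leave it behind and fail this run
            throw new IllegalStateException("Rerun me!");
        }
        // Marker was present: remove it and pass
        file.delete();
    }
}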
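Obviously, it passes only half of the time, because it depends on state left over from the previous run: it fails whenever the marker file is absent and passes whenever it is present, so the outcome flips on every run.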
⚠️ WARNING!
Don’t write such tests :D This one is for demonstration purposes only!
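To see why the outcome flips between runs, here is a minimal standalone sketch of the same marker-file logic. The class name FlakyStateDemo and the use of a temporary directory instead of build/ are assumptions made for illustration; the sketch returns a boolean where the real test throws.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;

public class FlakyStateDemo {
    // Same logic as flakyTest(), but returns the outcome instead of throwing.
    static boolean runOnce(File marker) throws IOException {
        if (marker.createNewFile()) {
            return false; // marker was absent: the real test would throw here
        }
        marker.delete();
        return true; // marker was present: the real test would pass
    }

    public static void main(String[] args) throws IOException {
        // A temp directory stands in for build/ so the demo leaves nothing behind.
        File dir = Files.createTempDirectory("rerun-demo").toFile();
        File marker = new File(dir, "test-rerun.txt");
        for (int run = 1; run <= 4; run++) {
            System.out.println("run " + run + ": " + (runOnce(marker) ? "PASSED" : "FAILED"));
        }
        dir.delete();
    }
}
```

Starting from a clean directory, odd-numbered runs fail and even-numbered runs pass, which is exactly the behavior a single automatic rerun can paper over.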
If we run it, we will get a failed test:
$ ./gradlew test
> Task :test FAILED
Gradle Test Executor 5 > com.example.RerunTest > flakyTest STARTED
Gradle Test Executor 5 > com.example.RerunTest > flakyTest FAILED
java.lang.IllegalStateException: Rerun me!
at com.example.RerunTest.flakyTest(RerunTest.java:17)
1 test completed, 1 failed
Our CI system turns red. We spent 30 minutes waiting for test results, our change works, but some unrelated test is failing, and we’re getting mad…
We run it again - all good:
$ ./gradlew test
> Task :test
Gradle Test Executor 6 > com.example.RerunTest > flakyTest STARTED
Gradle Test Executor 6 > com.example.RerunTest > flakyTest PASSED
BUILD SUCCESSFUL in 1s
2 actionable tasks: 1 executed, 1 up-to-date
But… that’s another 30 minutes or so, plus a manual action to restart the CI job.
What if, instead, whenever some tests fail, we ran another test task right after the first one, limited to the tests that failed, and tried to get a green status this time?
Sounds like a plan!
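Before touching the build script, the plan can be modelled in plain Java. This is only a sketch of the idea, not Gradle code: RerunPlanSketch, run, and buildIsGreen are hypothetical names, and each test is reduced to a BooleanSupplier that returns true when it passes.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BooleanSupplier;

public class RerunPlanSketch {
    // Runs every test once and returns the names of those that failed.
    static List<String> run(Map<String, BooleanSupplier> tests) {
        List<String> failed = new ArrayList<>();
        tests.forEach((name, test) -> {
            if (!test.getAsBoolean()) {
                failed.add(name);
            }
        });
        return failed;
    }

    // Green if the first pass is clean, or if every first-pass failure passes on a second attempt.
    static boolean buildIsGreen(Map<String, BooleanSupplier> tests) {
        List<String> failed = run(tests);
        if (failed.isEmpty()) {
            return true;
        }
        Map<String, BooleanSupplier> rerun = new LinkedHashMap<>();
        for (String name : failed) {
            rerun.put(name, tests.get(name)); // only the failed tests are retried
        }
        return run(rerun).isEmpty();
    }

    public static void main(String[] args) {
        int[] calls = {0};
        Map<String, BooleanSupplier> tests = new LinkedHashMap<>();
        tests.put("stableTest", () -> true);
        tests.put("flakyTest", () -> calls[0]++ > 0); // fails on the first call only
        System.out.println(buildIsGreen(tests)); // prints "true"
    }
}
```

A test that fails both times still turns the result red, so genuinely broken tests are not hidden by the retry.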
Gradle FTW!
First, we need to process every test task in our build:
tasks.withType(Test) {
Now, we will register another task that will strictly follow the original one:
    def rerunTask = tasks.register("${name}Rerun", Test) {
        // Enabled only when there are failures
        enabled = false
        failFast = true // ¯\_(ツ)_/¯
        outputs.upToDateWhen { false }
    }
    // Make it always run after the original task
    finalizedBy(rerunTask)
The rerun task is disabled by default because a test task with an empty inclusion list would rerun all tests.
We will enable it once we add something to the list.
Note that we’re using failFast mode, because if a test fails a second time in a row, it definitely requires some attention ;)
But, since we’re registering a new task… Don’t we have a recursion? Sure we do! Let’s ignore such tasks:
tasks.withType(Test) {
    if (name.endsWith("Rerun")) {
        return
    }
Now we need to track every failed test and include it in the Rerun task’s inclusion list:
    afterTest { desc, result ->
        if (TestResult.ResultType.FAILURE == result.resultType) {
            rerunTask.configure {
                enabled = true
                filter.includeTestsMatching("${desc.className}.${desc.name}")
            }
        }
    }
Here, for each test result, we check whether the test failed and, if so, add it to the filter of the rerun task.
We also enable the rerun task as soon as the first failure is recorded.
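The bookkeeping done by that listener can be modelled outside Gradle. The sketch below is plain Java and does not use any Gradle API: RerunFilterSketch and its methods are hypothetical names, and com.example.OtherTest is a made-up passing test. It only shows how the "class name, dot, method name" patterns accumulate and when the rerun becomes worth enabling.

```java
import java.util.LinkedHashSet;
import java.util.Set;

public class RerunFilterSketch {
    // Patterns in the "<class name>.<method name>" form built by the listener.
    private final Set<String> patterns = new LinkedHashSet<>();

    // Called once per finished test, like the afterTest callback.
    void afterTest(String className, String methodName, boolean failed) {
        if (failed) {
            patterns.add(className + "." + methodName);
        }
    }

    // The rerun only makes sense once at least one failure was recorded.
    boolean rerunEnabled() {
        return !patterns.isEmpty();
    }

    Set<String> patterns() {
        return patterns;
    }

    public static void main(String[] args) {
        RerunFilterSketch sketch = new RerunFilterSketch();
        sketch.afterTest("com.example.OtherTest", "stableTest", false); // hypothetical passing test
        sketch.afterTest("com.example.RerunTest", "flakyTest", true);
        System.out.println(sketch.rerunEnabled() + " " + sketch.patterns());
    }
}
```

With no failures the set stays empty and the rerun stays off, which mirrors why the real rerun task starts out disabled.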
Aaaand… the last tiny bit that makes the whole thing shine: ignoring the failures of the original task!
ignoreFailures = true
(Full snippet is available here, I may even make a plugin… later 😅)
If we run the build again, we get a successful result despite a failed test during the main run:
$ ./gradlew test
> Task :clean
> Task :compileJava NO-SOURCE
> Task :processResources NO-SOURCE
> Task :classes UP-TO-DATE
> Task :compileTestJava
> Task :processTestResources NO-SOURCE
> Task :testClasses
> Task :test
Gradle Test Executor 14 > com.example.RerunTest > flakyTest STARTED
Gradle Test Executor 14 > com.example.RerunTest > flakyTest FAILED
java.lang.IllegalStateException: Rerun me!
at com.example.RerunTest.flakyTest(RerunTest.java:17)
1 test completed, 1 failed
> Task :testRerun
Gradle Test Executor 17 > com.example.RerunTest > flakyTest STARTED
Gradle Test Executor 17 > com.example.RerunTest > flakyTest PASSED
BUILD SUCCESSFUL in 1s
4 actionable tasks: 4 executed
One “That’s a feature, not a bug!” outcome of using two test tasks is that when you aggregate the test reports, you will see some failures in the main run but not in the rerun: free flaky test reports for everyone 😅
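As a rough illustration of that idea, a flaky test is one that failed in the main run and passed in the rerun. The sketch below assumes the two sets of test names have already been extracted from the reports; how to parse the reports is out of scope here, and FlakyReportSketch and com.example.BrokenTest are hypothetical names.

```java
import java.util.Set;
import java.util.TreeSet;

public class FlakyReportSketch {
    // Flaky = failed in the main run AND passed in the rerun.
    static Set<String> flaky(Set<String> failedInMainRun, Set<String> passedInRerun) {
        Set<String> result = new TreeSet<>(failedInMainRun);
        result.retainAll(passedInRerun);
        return result;
    }

    public static void main(String[] args) {
        Set<String> failedInMainRun = Set.of(
                "com.example.RerunTest.flakyTest",
                "com.example.BrokenTest.alwaysFails"); // hypothetical test that is really broken
        Set<String> passedInRerun = Set.of("com.example.RerunTest.flakyTest");
        System.out.println(flaky(failedInMainRun, passedInRerun));
    }
}
```

Intersecting with the tests that passed, rather than subtracting the ones that failed, avoids mislabeling tests that the failFast rerun never got to execute.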