Handling flaky pytest tests
There’s often a deadline sitting between pragmatism and perfection in a code base, and while exploring Python’s pytest extensions and plugins I found a couple of fine examples of straddling that line. The two modules, Flaky and the more subtly named pytest-rerunfailures, each help blur the line a little by allowing you to rerun failing tests, often taking the “two out of three” approach to handling troublesome tests.
Marking your tests as non-deterministic is simple in both implementations:
# Flaky
@flaky # By default, retry a failing test once
## Rerun the test up to 3 times and pass if at least `min_passes` runs succeed
@flaky(max_runs=3, min_passes=2)

# pytest-rerunfailures
## Rerun the test up to 5 times if it fails
@pytest.mark.flaky(reruns=5)
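As a rough, runnable sketch of how those markers sit on a real test (the test functions and their simulated random failures are invented here purely to give the decorators something to retry, and in practice you would install and use only one of the two plugins):

import random

import pytest
from flaky import flaky


@flaky(max_runs=3, min_passes=2)
def test_sometimes_passes_flaky():
    # Flaky reruns this up to three times and reports a pass
    # if at least two of those runs succeed.
    assert random.random() < 0.5


@pytest.mark.flaky(reruns=5)
def test_sometimes_passes_rerunfailures():
    # pytest-rerunfailures retries up to five times and passes
    # on the first successful run.
    assert random.random() < 0.5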
Despite the similar problem domain, there are a number of differences between the two plugins that will help you decide which one is better suited to your needs; reading both READMEs before picking is heartily recommended. Flaky, for example, lets you “best X out of Y” your tests and has a class-wide marker for the more problematic areas (see the class-level sketch below). pytest-rerunfailures exposes a large part of its functionality via command line options to pytest, which can help you explore a suite and get the tests passing at least once, so you know where to focus your time and effort rather than just assuming it’s all broken. Given the right kind of code base I could imagine myself running an overnight:
$ pytest --reruns 10 --reruns-delay 300
to check whether the tests ever actually pass, or whether there is some non-deterministic behaviour at play.
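And for completeness, Flaky’s class-wide marker mentioned above looks roughly like this (a sketch with a made-up test class; every test method inherits the retry settings):

from flaky import flaky


@flaky(max_runs=3, min_passes=1)
class TestUnreliableService:
    # Each test in the class is retried up to three times
    # before being reported as a failure.
    def test_connect(self):
        ...

    def test_fetch(self):
        ...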
I’m not sure I’d ever be willing to use these in earnest in a module, but I can see the potential in certain network-related domains, especially when you don’t have the time to pivot to something more stable like a stub or mock. I can also envision using them when adopting a new (to me) code base with few tests, or just flaky ones. And then never finding the time to return and clean them back up.