A problem needs solving

This is inspired by a problem i was given at work. At some point in the last year, a suite of tests was accidentally disabled. Over 100 tests were dormant on CI for over 1000 commits, and surprise surprise, when it was noticed, the tests no longer pass! Before i started to tackle the actual failures, I wanted to identify the nature of the failure. Was it one commit that broke 96 of the tests, or 96 commits, each doing damage to just one test? In order to find out I had to search efficiently. 1000 commits * 96 tests would equate to 96000 tests, which could take months just to run and deliver their results!

A solution to the problem

Git has a function known as bisecting, which allows you to efficiently search a large body of commits to find a regression. At a high level, it works by performing a binary search over a given commit range, allowing you to search n commits in just ceil(log2(n)) tests. if applied to the above problem, it yields just 960 tests!

In order to explain this better, I constructed a toy example at my Github.

Upon checking out the repo, if you run pytest, you’ll note that the test suite test_helper.py fails, and by inspecting helper.py, you’ll note that the logic for finding even numbers is inverted. If you check back a few commits, you’ll see this function has been broken for some time. In fact, I deliberately broke this function sometime in the previous 10,000 commits. Ordinarily, CI should catch this type of regression, but mistakes happen, and it’s feasible that this test was never checked for one reason or another. Our challenge is to figure out where this regression was introduced into the codebase.

If you inspect the commit 9b1d44648a8e14504cdd6b4256f17348582566db, the first commit in the repo, you’ll see that I added a simple python module helper.py, and a simple test suite test_helper.py. if you run pytest with that commit, you’ll see that the tests all pass.