Laser-Coder
Circular Reasoning in Unit Tests
By Jason Lenthe | March 20, 2025

Imagine you're reviewing someone's code that implements a function in Python that calculates a half birthday:


from datetime import timedelta

def half_birthday(birthday):
  # Add half of a 365-day year to the birthday
  return birthday + timedelta(days=365 / 2)

Then you see a unit test for this function:


from datetime import datetime, timedelta
from unittest import TestCase

class TestHalfBirthday(TestCase):
  def test_half_birthday(self):
    dt = datetime(2018, 3, 10)
    self.assertEqual(half_birthday(dt), dt + timedelta(days=365 / 2))

There is something odd about this unit test. The developer seems to have copied the core expression of the half_birthday() function and put it into the unit test itself. Is this valid? Should you do this?

I will, in this article, argue that you should not.

Going Around in Circles

Each unit test is a small science experiment. It has a hypothesis, an experimental procedure, and an analysis of the results. The hypothesis is that the code under test exhibits a particular behavior. The experimental procedure is the sequence of steps that set up and execute the code under test. The analysis of the results is the set of assertions on the output, which determine whether the hypothesis is confirmed and therefore whether the test passes or fails.

Now imagine someone trying to make the following scientific argument:

Newton's law of gravitation is correct because I used the theory to make some predicted results and then I ran some experiments to produce actual results in the same scenarios. The predicted and actual results match.

This is sound reasoning. Now consider the following scientific argument:

Newton’s law of gravitation is correct because, while at the library, I used the theory to make some predicted results and then, while at home, I used the theory to make predicted results in the same scenarios. The two sets of results match.

This argument is quite fallacious. The building that you're in when you make the predicted results has no bearing on the results. It is essentially comparing the theory to itself. The argument is circular.

The scientific argument behind our questionable unit test is similar. It says our code expression that determines a half-birthday produces the same result in a unit test code file as it does in the original source file. The file that the expression lives in has no bearing on the results. We're just comparing the code under test to itself. It is a circular argument.

If you believe, as I do, that science is our best tool for understanding how things actually work, then circular reasoning isn't just a violation of theoretical purity—it can mask real problems and create a false sense of security that the code is correct when it isn't.
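To make the false sense of security concrete, here is a minimal sketch. The function broken_half_birthday is hypothetical and deliberately wrong (it adds two years, not half a year), yet a test written in the circular style still passes, because the expected value simply repeats whatever the implementation does.

```python
from datetime import datetime, timedelta
from unittest import TestCase


def broken_half_birthday(birthday):
    # Hypothetical and deliberately wrong: adds two years, not half a year.
    return birthday + timedelta(days=365 * 2)


class TestBrokenHalfBirthday(TestCase):
    def test_circular(self):
        dt = datetime(2018, 3, 10)
        # The expected value is the implementation's own expression,
        # so this assertion holds no matter what that expression computes.
        self.assertEqual(broken_half_birthday(dt), dt + timedelta(days=365 * 2))
```

The green result here says nothing about whether the date is right; it only says the expression equals itself.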

A Proper Unit Test

The real way to do this is to determine the half-birthday value by hand. It may seem a little slow and tedious, but this is how development often is—you've got to roll up your sleeves and get into the calendar. Unless you have a trusted external reference to verify against, manually determining the correct result is the only way to ensure that our code produces the right value.

Besides, it's really not that bad. You pick a date and count how many days remain in that month. Then you add the number of days in each succeeding month for as long as the running total stays at or below 182. Finally, you count the remaining days into the next month until you hit 182, and that's the half-birthday.
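As a worked example of that hand count, take March 10, 2025 as the birthday. The sketch below only records the tally; the month lengths are typed in from a calendar rather than computed by any date library, and the variable names are just for illustration.

```python
# Hand count for a birthday of March 10, 2025.
# Month lengths are read off a calendar, not derived from the code under test.
days_left_in_march = 31 - 10           # March 11 through March 31
full_months = [30, 31, 30, 31, 31]     # April, May, June, July, August

running_total = days_left_in_march + sum(full_months)   # reaches August 31
days_into_september = 182 - running_total               # days still to count

print(running_total, days_into_september)   # 174 8
```

Counting 8 days into September gives September 8, 2025 as the half-birthday.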

Now we can write a proper unit test:


class TestHalfBirthday(TestCase):
  def test_half_birthday(self):
    self.assertEqual(
      half_birthday(datetime(2025, 3, 10)),
      datetime(2025, 9, 8)
    )

We can also see now that our original code fails the test! It turns out that dividing 365 by 2 leads to a subtle bug. Since 365 is an odd number, 365 / 2 evaluates to 182.5, so an extra half day gets added. We started with a datetime object whose hour, minute, and second default to 0, representing the start of the day, so the result lands at noon on the half-birthday instead of at midnight. Depending on the application, this small difference could be significant. If we're writing code to enforce a state law that allows driver's licenses at age 16 and a half, then we will likely have angry teenagers complaining that they've been turned down because they came for their driver's test before noon.
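The extra half day is easy to see by printing the result. The sketch below also shows one possible fix, integer division, in a hypothetical function named half_birthday_whole_days; it is an assumption about how the bug might be repaired, not the only reasonable choice.

```python
from datetime import datetime, timedelta

birthday = datetime(2025, 3, 10)

# In Python 3, 365 / 2 is the float 182.5, so the result lands at noon.
with_half_day = birthday + timedelta(days=365 / 2)
print(with_half_day)    # 2025-09-08 12:00:00


def half_birthday_whole_days(birthday):
    # Hypothetical fix: 365 // 2 is the integer 182, so no half day is added.
    return birthday + timedelta(days=365 // 2)


print(half_birthday_whole_days(birthday))    # 2025-09-08 00:00:00
```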

So far, we've focused on getting a single happy path test case correct. A more robust test suite would also explore edge cases—such as handling months with different numbers of days, accounting for leap years, and ensuring the function works correctly across the full range of years, centuries, or even millennia it's expected to operate over.
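A few such edge cases might look like the sketch below. It assumes the half-birthday is defined as exactly 182 days after the birthday and uses an assumed whole-day implementation so the example runs on its own; a different definition (for instance, 183 days in leap years) would need different expected dates. Each expected value was counted month by month by hand, as noted in the comments.

```python
from datetime import datetime, timedelta
from unittest import TestCase


def half_birthday(birthday):
    # Assumed implementation for this sketch: exactly 182 whole days.
    return birthday + timedelta(days=365 // 2)


class TestHalfBirthdayEdgeCases(TestCase):
    def test_crosses_year_boundary(self):
        # Oct 30 + Nov 30 + Dec 31 + Jan 31 + Feb 28 + Mar 31 = 181, plus 1 day
        self.assertEqual(half_birthday(datetime(2025, 10, 1)), datetime(2026, 4, 1))

    def test_spans_leap_day(self):
        # Jan 16 + Feb 29 + Mar 31 + Apr 30 + May 31 + Jun 30 = 167, plus 15 days
        self.assertEqual(half_birthday(datetime(2024, 1, 15)), datetime(2024, 7, 15))

    def test_same_birthday_in_common_year(self):
        # Jan 16 + Feb 28 + Mar 31 + Apr 30 + May 31 + Jun 30 = 166, plus 16 days
        self.assertEqual(half_birthday(datetime(2025, 1, 15)), datetime(2025, 7, 16))
```

Note how the same January 15 birthday yields July 15 in a leap year and July 16 in a common year, which is exactly the kind of difference a hand count exposes.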

Why the Trap of Circularity Exists

Developers fall into the trap of circular reasoning because they're attempting to minimize the amount of hard-coded data. This reflects the old and still relevant maxim that says if you need to hard-code something, hard-code it in only one place. Developers are also told to reach a certain level of code coverage and to complete their tasks by a certain date, without any direction to ensure each unit test is a sound scientific experiment. Together, these pressures lead to some unit tests being useless for their purpose of checking that results and behavior are correct.

Conclusions

Effective unit testing goes beyond a green bar at a prescribed level of code coverage. It requires structuring tests as sound scientific arguments for correctness. A unit test must be devoid of circular reasoning lest it be a useless self-licking ice cream cone. We need to compare the results of the code under test to independently validated results, backed by a solid argument for their validity. Only then can we have an effective test.

Acknowledgements