My team builds software for managing 401(k) retirement plans. Since 2006, U.S. law has allowed Roth contributions in 401(k) plans. Plan participants can elect to pay taxes on their retirement deferrals up front, instead of making the usual tax-deferred 401(k) contributions. Then, when they retire and withdraw their savings, they don’t have to pay taxes on their withdrawals. This is called a “qualified” distribution.
There is a catch – if you make a Roth contribution, you have to leave it in your account for at least 5 years. If you withdraw it early, you have to pay taxes on whatever gains you made from interest and dividends on your investments. We tested the difference between qualified and unqualified distributions in two ways: with in-memory FitNesse tests that verified the algorithms in the code, and with manual tests that faked the dates as needed. But until now, a qualified distribution could not happen in real life.
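As a rough illustration (not the actual production code), the rule could be sketched in Python like this. The names `is_qualified` and `taxable_gain` are made up for the example. The sketch assumes the five-year clock starts on January 1 of the year of the first Roth contribution, that the participant withdraws the whole balance, and that no other conditions (such as the participant's age) apply.

```python
from datetime import date
from decimal import Decimal

HOLDING_YEARS = 5  # the five-year holding period described above


def is_qualified(first_roth_year: int, withdrawal_date: date) -> bool:
    """Assumption: the clock starts on January 1 of the first contribution year."""
    return withdrawal_date >= date(first_roth_year + HOLDING_YEARS, 1, 1)


def taxable_gain(contributions: Decimal, balance: Decimal,
                 first_roth_year: int, withdrawal_date: date) -> Decimal:
    """Gains are taxable only when the (full) withdrawal is not qualified."""
    if is_qualified(first_roth_year, withdrawal_date):
        return Decimal("0")
    # Contributions were already taxed, so only the growth counts.
    return max(balance - contributions, Decimal("0"))
```

With a first contribution in 2006, a withdrawal in 2010 leaves the gains taxable, while the same withdrawal on or after January 1, 2011 does not.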
We have many tests for Roth withdrawals, from the unit level on up to the GUI level. One FitNesse test verifies the results when an employee withdraws money that includes funds from their Roth account. Since 2006 was the first year anyone could make Roth contributions, no distribution could be “qualified” until this year. The test sets up a participant whose first Roth contribution was in 2006, and it verified that the system withheld taxes for the gain portion when the participant withdrew the money.
At the start of 2011, this test started to fail in our continuous integration. Surprise – it’s 2011, the participant has now been contributing Roth funds for 5 years, so the distribution is now qualified and should no longer have any tax withheld! The actual results of the test showed this change in behavior.
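The reason a test like this can flip without any code change is that it depends on the real clock. Here is a minimal Python sketch of that effect, with hypothetical names (`tax_withheld_on_gains`, `run_roth_withdrawal_check`) that stand in for the real FitNesse fixture, and with the same simplifying assumption that the five-year clock starts on January 1 of the first contribution year.

```python
from datetime import date
from typing import Optional

FIRST_ROTH_YEAR = 2006  # the test participant's first Roth contribution


def tax_withheld_on_gains(first_roth_year: int, today: date) -> bool:
    """Simplified rule: withhold until five years after Jan 1 of the first year."""
    return today < date(first_roth_year + 5, 1, 1)


def run_roth_withdrawal_check(today: Optional[date] = None) -> bool:
    """Uses the real clock unless a date is passed in for illustration."""
    today = today or date.today()
    return tax_withheld_on_gains(FIRST_ROTH_YEAR, today)


# The original expectation ("tax is withheld") holds on any run through 2010...
assert run_roth_withdrawal_check(date(2010, 12, 31)) is True
# ...and stops holding on the first run of 2011, with no code change at all.
assert run_roth_withdrawal_check(date(2011, 1, 3)) is False
```

Passing the date in explicitly is what "faking out the dates" amounts to; leaving it to the real clock is what let the end-to-end test notice the new year on its own.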
Some people would say we shouldn’t have an end-to-end test like this, that it’s expensive to maintain. But I think it was cool to see this functionality actually work in production. It allowed me to give a heads-up to the CSRs that they would now see Roth distribution checks going out that didn’t have tax withheld. I changed the expected results, which gives us an end-to-end test for a case we couldn’t cover before. It also shows how our tests are living documentation – as soon as we have a test failure, we have to figure out the reason, and either change the code (if it’s a bug) or change the test (if it’s correct behavior).
What’s your opinion? Do you like this sort of test, and this sort of surprise?