Oh, Look, That Does Work!

My team creates software to manage 401(k) retirement plans. Back in 2006, the U.S. Congress passed legislation allowing Roth contributions in 401(k) plans. Plan participants can elect to pay taxes on their retirement deferrals up front, instead of making the usual tax-deferred 401(k) contributions. Then, when they retire and withdraw their savings, they don’t have to pay taxes on their withdrawals. Such a tax-free withdrawal is called a “qualified” distribution.

There is a catch – if you make a Roth contribution, you have to leave it in your account for at least 5 years. If you withdraw it early, you have to pay taxes on whatever gains you made from interest and dividends on your investments. We tested the difference between qualified and unqualified distributions both with in-memory FitNesse tests that verified the algorithms in the code and with manual tests in which we faked the dates as needed. But until now, there was no chance to have a qualified distribution in real life.

We have many tests for Roth withdrawals, from the unit level on up to the GUI level. One FitNesse test verifies the results when an employee withdraws money that includes funds from their Roth account. Since 2006 was the first year anyone could make Roth contributions, it was not possible to make a “qualified” distribution until this year. The test set up a participant whose first Roth contribution was in 2006, and verified that the system withheld taxes on the gain portion when the participant withdrew the money.

At the start of 2011, this test started to fail in our continuous integration. Surprise – it’s 2011, the participant has now been contributing Roth funds for 5 years, so the distribution is now qualified and should no longer have any tax withheld! The actual results of the test showed this difference in behavior.
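
A minimal Python sketch of this date dependence follows. It is an illustration only, not the production code or the FitNesse test: the function names, the flat 20% withholding rate, and the simplified rule (qualified once the calendar year reaches the first contribution year plus 5, ignoring any other conditions) are all assumptions.

```python
from datetime import date
from decimal import Decimal

# Hypothetical flat withholding rate, chosen only for illustration.
WITHHOLDING_RATE = Decimal("0.20")


def is_qualified(first_roth_year: int, as_of: date) -> bool:
    """Simplified five-year rule (an assumption): the distribution is
    qualified once the calendar year is at least first_roth_year + 5."""
    return as_of.year >= first_roth_year + 5


def tax_withheld(gain: Decimal, first_roth_year: int, as_of: date) -> Decimal:
    """Withhold on the gain portion only when the distribution is not qualified."""
    if is_qualified(first_roth_year, as_of):
        return Decimal("0.00")
    return (gain * WITHHOLDING_RATE).quantize(Decimal("0.01"))


gain = Decimal("150.00")
# Same participant, same withdrawal, different "today".
print(tax_withheld(gain, 2006, date(2010, 12, 31)))  # 30.00
print(tax_withheld(gain, 2006, date(2011, 1, 3)))    # 0.00
```

A test that passes the real current date as `as_of` and expects a nonzero withholding would pass through 2010 and begin failing in 2011, which is the kind of flip described above.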

Some people would say we shouldn’t have an end-to-end test like this, that it’s expensive to maintain. But I think it was cool to see this functionality actually work in production. It allowed me to give a heads-up to the CSRs that they would now see Roth distribution checks going out that didn’t have tax withheld. I changed the expected results, which gives us an end-to-end test for the qualified case; before this, we weren’t able to have one. It also shows how our tests are living documentation – as soon as we have a test failure, we have to figure out the reason, and either change the code (if it’s a bug) or change the test (if it’s correct behavior).

What’s your opinion? Do you like this sort of test, and this sort of surprise?

4 comments on “Oh, Look, That Does Work!”

  1. Congrats that the feature works! Hooray 🙂 This is what I call “an inverse bug” 😉

    I’ve worked with pension systems a few years back and I’ve spent a good deal of time considering the kind of scenarios you describe.

    We thought a lot about running accelerated time on the whole system, but it was impossible for various technical reasons. So we set out a plan with a few predefined test customers in the system which we would keep “alive” and verify along the way. We wanted to give development a chance to react before “real” customers crossed the various time limits.

    I think that verifying a complex business scenario like this in test (automatic or not) is very important and that maintenance costs are well justified. This matters a lot to taxpayers and lawmakers! If a problem here had slipped through, it would have caused quite a stir and possibly even given you some bad press.

    Thanks for sharing your actual project experiences, by the way! 🙂


  2. Hi Lisa

    I don’t see anything wrong with an end-to-end test. The problem that I found with Agile is that people are good at building all the little blocks and checking they are OK, but then not putting them together to see if they fit. It should be like having a Lego set and building one piece on top of another and connecting them all together. Only once you put all components together do you deliver the required function or process.

    I would ask the question ‘if it doesn’t fit here, then where should it come?’

    The danger, of course, with not doing an e2e test is that if you fail to find the issue in testing, it’s then found later on by the users – this is more expensive and results in reduced user confidence in the process. I believe that a team is only as good as its most recent deployment, so even if all year long what you deliver is great, if the latest release is not, that’s what the users will be looking at. It’s unfair, but that’s real life.

  3. I don’t think anyone argues with the need to do end-to-end testing, but some experts think they are too expensive to automate, and that robust automated regression tests at lower levels would suffice. Personally, though, I find properly designed automated end-to-end smoke tests give a good ROI.

    You bring up a good point. In agile, we’re always working on small increments and it’s easy to lose sight of the big picture. So I agree, it’s worth automating the end-to-end tests, if the automation is well-designed and maintainable.

  4. Hi Anders, sorry I didn’t approve your comment earlier; something is messed up in my WordPress blog. Thanks for the affirmation. I do think we are getting a reasonable return on the cost of this testing.
