How to Sanely Test Complex Systems

And examples in the wild of how to use them effectively

Robert Roskam

Jun 16, 2023

Photo by Michal Matlon on Unsplash

"We don't have tests for that, because it's too complicated."

If you've found yourself saying that, then these two concepts are for you. They are simply: test seeds and test clocks.

These are first-class features of your system, not something you can buy or install or tack on afterwards.

Seeds

Test seeds (aka static test objects) are data objects that are unchanging in your system. They’re data objects that you do not ship to production. Any interactions with them give the same answer every time without fail. They’re designed on purpose to not be changeable, so that you have safe starting points to run other tests from.

An example may be helpful.

Let’s say you’re making some kind of Shopify competitor. It would be great if you had two different users that are unchanging and always have the same amount of money, regardless of how much you charge them. Let’s make one with a balance and another with zero.

Here they are:

GET /users/42
{ user_id: 42, balance: 10, ... }

GET /users/66
{ user_id: 66, balance: 0, ... }

Now let’s test out charging users. Without seeds, if you wanted to exercise different logic for "Insufficient Funds" vs "Normal" spend, you'd need to go out and max out a credit card.

GET /users/42
{ user_id: 42, balance: 10, ... }


POST /charge/
{amount: 1, user_id: 42}

Response 201
{id: 111}

GET /users/42
{ user_id: 42, balance: 10, ... }

Note that the balance has not decreased. You can repeat this operation indefinitely. Its only job is to demonstrate that the purchase flow works.

Now let’s use the other one.

GET /users/66
{ user_id: 66, balance: 0, ... }

POST /charge/
{amount: 1, user_id: 66}

Response 400
{"error": "insufficient funds"}

GET /users/66
{ user_id: 66, balance: 0, ... }

In a normal system under test, as in production, you’d have to take multiple steps to assert this behavior. However, you get it for free by having stable test seeds.
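To make the idea concrete, here is a minimal Python sketch of how a charge handler might treat seeds. It is an illustration under assumptions, not how any particular system does it: the names `User`, `USERS`, `charge`, and `InsufficientFunds` are hypothetical, and user 7 is invented to stand in for an ordinary account. The only things taken from the example above are users 42 and 66 and the rule that their balances never change.

```python
from dataclasses import dataclass


class InsufficientFunds(Exception):
    """Raised when a user cannot cover a charge."""


@dataclass
class User:
    user_id: int
    balance: int
    is_seed: bool = False  # hypothetical flag marking a test seed


# Hypothetical in-memory store. Users 42 and 66 are the seeds from the
# example; user 7 is an ordinary account added for contrast.
USERS = {
    42: User(42, 10, is_seed=True),
    66: User(66, 0, is_seed=True),
    7: User(7, 10),
}


def charge(user_id: int, amount: int) -> int:
    """Run the normal purchase checks, but never mutate a seed's balance.

    Returns the user's balance after the charge.
    """
    user = USERS[user_id]
    if user.balance < amount:
        raise InsufficientFunds(f"user {user_id} cannot cover {amount}")
    if not user.is_seed:
        user.balance -= amount
    return user.balance
```

Calling `charge(42, 1)` any number of times leaves user 42 at a balance of 10, while `charge(66, 1)` always raises `InsufficientFunds`. The seed goes through the same checks as everyone else; only the final mutation is skipped.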

If you didn’t have this feature in your test environments, exercising these flows (creating the user, adding a balance or zeroing out a balance) is potentially expensive enough that you’d almost never do it. And even if you tested the pieces separately, you’d never be 100% certain that the system is working correctly, because those pesky connecting bits could still be wrong.

Fortunately we have reasonable credit card gateway vendors like Stripe that provide test cards with specific dispositions, such as "Insufficient funds decline".

When you make test seeds, they need some specific attributes to be useful:

single purpose
memorable
stable

Test seeds with more than one purpose are prone to progressive feature creep, which can degrade the other two properties. For example, you can likely still recall from my example above that I used two users. You may not remember everything about them, but you remember that the variation was on the balance. It wasn’t on 10 different traits. It was only balance. That’s having a single purpose.

They need to be memorable with simple identifiers, even if they are long. Again going back to my example, there are just two, but you likely remember them: 42 and 66.

Finally, they need to be very stable so people can trust them. They need to be able to live in docs and other places for long periods of time and have organizationally well-understood behavior, so that people don't feel like dropping back to manual tests or smoke tests.
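One way to encode these three attributes in code is a small read-only registry. The sketch below is only an assumption about how that could look in Python; `Seed`, `SEEDS`, and the `purpose` field are hypothetical names. A frozen dataclass and a read-only mapping make accidental edits fail loudly, which supports the "stable" property.

```python
from dataclasses import dataclass
from types import MappingProxyType


@dataclass(frozen=True)  # frozen: assigning to a field raises an error
class Seed:
    user_id: int  # memorable: a short, simple identifier
    balance: int
    purpose: str  # single purpose: exactly one reason this seed exists


_SEEDS = {
    42: Seed(42, 10, "charge succeeds"),
    66: Seed(66, 0, "charge fails with insufficient funds"),
}

# Read-only view: adding, replacing, or deleting entries raises TypeError.
SEEDS = MappingProxyType(_SEEDS)
```

Because each seed carries a single `purpose` string, a second purpose has nowhere to hide. If someone needs a new variation, the natural move is to add a new seed rather than overload an existing one.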

Some other common examples of test seeds:

Twilio has a magic number
Plaid has several different test_credentials to simulate various things

Clocks

This concept is relatively straightforward: the ability to freeze time or advance time on command, whether for the whole system, a user session, or a single API request.

Going back to the Shopify competitor, let’s say it sells gift cards. Let’s talk about a basic behavior that we want to test: what happens when we go past a special date?

In the case below, after a certain date, a gift card will expire. So let’s make a clock that we can attach to a gift card creation endpoint and then fast forward time.

POST /test-clock/
{ name: "1st of the Month", frozenTime: "2023-08-01" }

Response 201
{id: 111}


POST /gift-card/
{ id: 77, ..., expiresAt: "2024-07-31", testClockID: 111}


POST /test-clock/111/advance/
{ advanceTo: "2024-08-01"}


GET /gift-card/77
{ id: 77, ..., status: 'expired'}

You’ll note that we have to explicitly attach the clock at creation. This is how I chose to implement it in this particular example.
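The flow above could be modeled roughly like this in Python. This is a sketch under assumptions, not the implementation behind those endpoints: `FrozenClock` and `GiftCard` are hypothetical names, and the rule that a clock can only move forward is my own assumption.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class FrozenClock:
    """A test clock: time stands still until someone advances it."""

    frozen_time: date

    def advance_to(self, new_time: date) -> None:
        # Assumption: a test clock only moves forward, never back.
        if new_time < self.frozen_time:
            raise ValueError("a test clock can only move forward")
        self.frozen_time = new_time


@dataclass
class GiftCard:
    id: int
    expires_at: date
    test_clock: Optional[FrozenClock] = None  # attached at creation

    def now(self) -> date:
        """Use the attached clock if there is one, otherwise real time."""
        if self.test_clock is not None:
            return self.test_clock.frozen_time
        return date.today()

    @property
    def status(self) -> str:
        return "expired" if self.now() > self.expires_at else "active"
```

With `clock = FrozenClock(date(2023, 8, 1))` and `card = GiftCard(77, date(2024, 7, 31), clock)`, the card starts out with status `"active"`. After `clock.advance_to(date(2024, 8, 1))`, the same card reports `"expired"`, with no waiting involved.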

You don’t have to attach it to a specific data object. You can have it apply system-wide if you like, or attach it to user or account contexts. It simply depends on what you need.

As another example, one way I’ve done this in the past for UIs is to have a special area where individual users can set the system time for themselves.
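If you support more than one scope, something has to decide which clock wins. Here is a hedged Python sketch of one possible rule, where a per-user clock beats a system-wide clock, which beats real time. `ClockResolver` and its fields are hypothetical names, and the precedence order is an assumption rather than a recommendation from any vendor.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Dict, Optional


@dataclass
class ClockResolver:
    """Decides what "now" means for a given user."""

    system_clock: Optional[datetime] = None  # system-wide frozen time
    user_clocks: Dict[int, datetime] = field(default_factory=dict)

    def now(self, user_id: Optional[int] = None) -> datetime:
        # The most specific override wins: user, then system, then real time.
        if user_id is not None and user_id in self.user_clocks:
            return self.user_clocks[user_id]
        if self.system_clock is not None:
            return self.system_clock
        return datetime.now(timezone.utc)
```

The important design choice is that all application code asks the resolver for the time instead of calling `datetime.now()` directly. That single seam is what makes freezing or advancing time possible later.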

This solution is especially good for handling problems like "we can only test this on the 1st of the month" or "we need to wait 24 hours to see what happens in the system". If you can trigger those on command, you can iterate faster on your solution.

For APIs, Stripe has possibly the most comprehensive solution I've seen for letting developers play with time in their tests. Plaid is another vendor with several options.

Conclusion

Test seeds and test clocks do take work, and you must treat them like first-class features. They are features that make your own lives easier in engineering, QA, and Product.

You usually can’t get to the end of a build and then add these things. You can’t get a vendor to install them for you. You need to have thought of them from the beginning and accommodated them in your approach.

So plan for it, and you too can have sanity even in the most complex systems.
