Checklists have seen a recent resurgence in popularity.[1] If you want another argument for using them, this isn’t that article. I’m just going to give a few practical examples of when you might want them as a software engineer, tech lead, or manager.
Useful checklists
Checklists may be useful if you have semi-routine tasks that (1) are not easily automated and (2) can cause significant problems when a step is missed.
Here’s a non-exhaustive list of such tasks:
bug triaging
submitting for code review
reviewing code
project breakdowns
requirements gathering
analyzing poor performance
troubleshooting known issues
various infrastructure tasks that can’t be handled with infrastructure as code (IaC)
Anatomy of a checklist
Now let’s talk about what you put on them. Good checklists contain:
Items that will kill you if you miss them[2]
Items that you can say a definite yes or no to
Items with very terse wording, maybe 10 words at most
Small sets of items, no more than 5 to 7 per set, that go together in some way
A date last revised
That’s it. Those are the qualifications.
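Some of these qualifications are mechanical enough to check with a few lines of code. Here is a minimal sketch in Python, assuming a made-up `Checklist` structure and a hypothetical `lint` function (neither comes from any particular tool). It only checks the word limit, the set size, and the presence of a revision date. Whether an item is truly a "will kill you" item, or a clean yes/no question, is still a judgment call.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional

# Limits taken from the list above; the names are invented for this sketch.
MAX_WORDS_PER_ITEM = 10
MAX_ITEMS_PER_SET = 7


@dataclass
class ItemSet:
    title: str
    items: List[str]


@dataclass
class Checklist:
    name: str
    sets: List[ItemSet]
    last_revised: Optional[date] = None


def lint(checklist: Checklist) -> List[str]:
    """Return human-readable problems; an empty list means the checks pass."""
    problems = []
    if checklist.last_revised is None:
        problems.append("missing date last revised")
    for item_set in checklist.sets:
        if len(item_set.items) > MAX_ITEMS_PER_SET:
            problems.append(
                f"set '{item_set.title}' has {len(item_set.items)} items "
                f"(max {MAX_ITEMS_PER_SET})"
            )
        for item in item_set.items:
            words = len(item.split())
            if words > MAX_WORDS_PER_ITEM:
                problems.append(
                    f"item '{item}' has {words} words (max {MAX_WORDS_PER_ITEM})"
                )
    return problems


# Hypothetical example checklist; the items and date are illustrative only.
review = Checklist(
    name="Submitting for code review",
    sets=[ItemSet("Before pushing", ["Tests pass locally?", "Only intended changes in diff?"])],
    last_revised=date(2024, 1, 15),
)
print(lint(review))  # prints []
```

A check like this could run wherever the checklists live, so a bloated list gets flagged when someone edits it rather than when someone skips it.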
Failure modes for checklists
The Checklist Manifesto doesn’t go into this specifically, but I’m going to because I think it’s important.
When they get out of date. I’d guess the date last revised is one of the most important pieces that gets overlooked. People don’t feel like revisiting the list because the process changes too much or because they don’t want another thing to do on a regular basis. Both feelings are understandable, but honestly, those are excuses, not reasons.
When they’re for everything in utter detail. Checklists shouldn’t be how-to guides or tutorials. They’re meant for people who already know what they’re doing. These lists need to cover the stuff that kills you and no more than that.[3]
When they begin to replace a well-written Bash script. Remember, we’re doing this because there are operations in a specific order that, for whatever reason, we can’t automate. If you can automate, just do that instead.
Final thoughts
Checklists are great! Do more of them, and when you can, use that checklist as a precursor to writing those scripts to do the tasks for you.
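One way to treat a checklist as a precursor to a script is to encode each item as a step that starts out manual (the script only asks you to confirm it) and gets swapped for a function once you work out how to automate it. Here is a minimal sketch in Python. The `Step` type, the `run` helper, and the release checklist are all hypothetical, and `changelog_updated` is a placeholder for a real check.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Step:
    question: str                                   # terse yes/no wording
    automated: Optional[Callable[[], bool]] = None  # None means still manual


def run(steps: List[Step], ask: Callable[[str], str] = input) -> bool:
    """Walk the steps in order; stop at the first one that is not a yes."""
    for step in steps:
        if step.automated is not None:
            passed = step.automated()
        else:
            passed = ask(f"{step.question} [y/n] ").strip().lower() == "y"
        if not passed:
            print(f"STOP: {step.question}")
            return False
    return True


def changelog_updated() -> bool:
    # Placeholder for an automated check; a real one might inspect the repo.
    return True


# Hypothetical release checklist: one step automated, two still manual.
RELEASE = [
    Step("Changelog updated?", automated=changelog_updated),
    Step("Rollback plan written?"),
    Step("On-call engineer notified?"),
]
```

Calling `run(RELEASE)` prompts only for the two manual steps. As more steps gain an `automated` function, the prompts disappear one by one, and when none are left the checklist has effectively become the script.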
[1] Likely due to The Checklist Manifesto. I have to say I love the message of the book and a lot of the info in it, but it focuses far too much on trying to convince you that checklists are a good idea and too little on actually giving practical advice.
[2] I say “kill” because people bloat lists with stuff they think is important, but most people don’t write software that could literally kill someone. So let’s just say it costs someone a lot of money or time.
[3] If there are a lot of things that can kill you, then maybe you need a different checklist for each kind of threat and an obvious index to look them up. This is effectively the difference in goals between a runbook and a playbook, which PagerDuty has an excellent article on.