Estimating in Software Development
How I limit the uncertainty inherent in software estimation
I remember the first time I was asked to give an estimate to someone. A recruiter wanted me to do some work related to his database.
“I’m not sure how long it’ll take,” I said.
His brow furrowed. “Why not?”
“I haven’t dealt with this particular database vendor before, and I’m not sure what the table structure looks like for your particular application.” I tried to be amiable in my response, but I’m sure I looked like I was sweating a little.
I was just out of college. Up until this point, most of my programming experience had been academic or for fun. But there was money on the line here.
The recruiter took a moment to ponder these things. “Wait…so you’re going to charge me to learn these things? That’s not acceptable! XYZ told me it was going to be $#,###. Period.”
Now I was really taken aback. How could someone give such surety about the cost of something?
Needless to say, I didn’t get the project. And I’m happy I didn’t.
A lot of time has gone by since then. Now I give estimates every day of the week without blinking, and I say it with the same amount of surety.
How can I do this? Am I prescient? No, not at all. All I’ve learned to do predicate an estimate on a set of assumptions.
Where I work, we have worked hard to standardize our stack. So I can assume that all the back-end developers know Python, Django, and Postgres. I can further assume that all the front-end developers know Django template syntax, Django CMS, Bootstrap3 and LESS.
Our stack is further standardized with specific services we strongly prefer, but I won’t get into all of that.
Where I work, we have been around for a long time. We have lots of data about historical projects. And everyone here clocks their time, even if the project is internal. So what I do is very simple.
Time Total. As projects close, I go in and check to see how much time was clocked against a project. I don’t care how much was clocked against a specific task. I care how much time was clocked against the entire project.
Component Count. Then I go to source control, and I pick something that’s easily countable. Since Django kind of assumes that urls are unique, I count unique url endpoints.
Rules of Thumb
Now that I have the data, I can do this simple calculation.
Project Time Total / Component Count = Avg for Component for that Project
This is a rule of thumb. You could potentially have several of them.
Once you have a rule of thumb, you can use it for estimating future projects based on a complete architecture.
The final step is when asked to make an estimate. You must gather the details and start coming up with all the user interfaces. In some cases, you may have few to none. But the UIs generally drive expectations on what the underlying data structure should look like.
You have you data. Now estimate based on real data against a agreed on architecture. This should go relatively quickly at this point.
And you’re done. Get some coffee.
But That’s Not Realistic!!!
The most common objection is that this technique is only useful for repeatable tasks, not for brand new, green field stuff.
You’re correct! But you’re also not correct!
If you’re dealing with lots of unknowns, then yes, I agree with you. Usually, there are some tasks that you can have some level of expectation of how long those take. The more rote a task is, the better estimate I can give.
In fact, usually, there’s a big huge chunk of relatively known stuff. Give those things the numbers and continue to attack the problem until you’re down to the truly unknown.
Common Mistakes of Using Estimates
Estimates are not commitments. They are likelihoods. They get you in the ballpark, but often are wrong—very wrong. And not in the direction you’d like for them to be. In particular, I strongly push back on estimates to be lined up one after another as if each one is a certainty and project that we’ll hit some target one year out or further.
Estimates are very rough stand-ins for price. The more time over time a project takes, the less likely the specific sum value will be correct—because there is all this unknown still attached to it. Again, they get you in the ballpark, but often are wrong—very wrong.
So if they’re so wrong, why bother? Generally it’s because they won’t be orders of magnitude wrong. If you aggregate estimate is 90 hours, then it’s unlikely (not impossible) that the cost won’t be 1000 hours. This is useful to people in many a high-level decision on where to place effort at for a large body of work.