Wednesday, September 14, 2011

Test Everything. Every Time.

My client recently asked me an interesting question.  We're currently in the process of doing various bug fixes and taking care of a lot of "low hanging fruit" while the business prepares for the larger projects coming our way.  So we're about to push out a maintenance release composed almost entirely (about 95%) of support-request tickets.  And the client asked me, "Which parts of the system should we test?"

They wrapped it up in a lot more jargon than that, tossing around terms like Risk-based Testing and the like.  And they spent some time explaining to me why this is important.  I get it; I see what they're saying.  But that doesn't change my answer to the actual question.
Which parts of the system should we test?
All of them.
No, it's not a cop-out answer.  It's my advice, and you can take it or leave it.  The decision is yours.  Now, the client explained to me that testing everything is impossible.  It's a dream that can never be realized.  To even attempt it would be cost prohibitive.
Why would it be cost prohibitive?
Because testing the software is a slow manual process.  We only have one tester and there are only so many hours in the day.  And that tester isn't familiar with every nook and cranny in the system, so he isn't going to be able to test everything.  It's unreasonable.

Now we've touched upon the source of the problem.  Two key phrases jump out at me, and they identify the root cause of what's really wrong with the system, regardless of what changes or bug fixes we make:
  • "slow manual process"
  • "isn't familiar with every nook and cranny in the system"
Let's start with the slow manual process.  Why is it a slow manual process?  Why aren't there automated tests?  Unit tests, integration tests, etc.  Well, after working in this code for a few months, let me assure you that these things are indeed not possible here.  The system was designed to be tested in its entirety by a live person, nothing more.  So it occurs to me that someone at some point in the history of the company, someone who was in a decision-making position, actively decided that all testing should be a slow manual process.
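For contrast, here's what an automated test actually looks like.  This is a minimal sketch in Python with an invented business rule (order_discount is hypothetical, not anything from this client's system); the point is that logic written as a plain function, taking plain values instead of reaching into the database or the screen, can be verified by a machine in milliseconds, on every build, forever:

    # A hypothetical business rule, written so that a test can reach it.
    def order_discount(order_total, is_preferred_customer):
        if is_preferred_customer and order_total >= 1000:
            return 0.10
        if order_total >= 1000:
            return 0.05
        return 0.0

    # The automated check.  No human involved.
    import unittest

    class OrderDiscountTests(unittest.TestCase):
        def test_preferred_customer_gets_ten_percent(self):
            self.assertEqual(order_discount(1000, True), 0.10)

        def test_small_order_gets_no_discount(self):
            self.assertEqual(order_discount(100, False), 0.0)

    if __name__ == "__main__":
        unittest.main()

Nothing in this client's system is structured to permit even that much.  And that's the decision I'm talking about.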

Nobody here made this decision, and those who were involved in the past probably didn't even know they were making this decision.  But it was made nonetheless.  Testing is a slow manual process because they, as a company (not any one particular individual, I hope), decided that it should be a slow and manual process.  (There's an old saying, "Never attribute to malice what can be explained by incompetence."  I'm honestly not sure if either of those root causes apply here.  I wasn't around when the system was developed, so I don't know what the story really was.  But in this particular case, the root cause is irrelevant.  The net result is the same.)

This decision that the company made doesn't change my recommendation.  It doesn't change what my career has taught me to be a best practice.  It doesn't change the fact that one should fully test everything one does in one's software.  The only thing it changes is the cost of that testing to the business.  And cost isn't my department.  I'm just telling them what they should do; the fact that they chose (actively or passively) to do it in a prohibitively expensive way is another matter entirely.

In the past, had they sought my advice (or that of any consultant from my employer), the answer would have been the same.  But in the past we may have been able to steer the design of the software to allow for more cost-effective testing.  We'd have been happy to provide it.  But in the past a decision was made by the business not to seek the advice of industry professionals.  I can't change the past.  But I won't let this one company's past change my mind about recommendations and best practices.  I'm there as a consultant to bring these practices to the business.  Not to change my practices to fit decisions the business made about software in my absence.  My advice still stands.

Then there was that second troubling statement, whereby the tester isn't familiar with the system.  That one frightens me even more, honestly.  I can get the fact that one doesn't have automated tests.  I can get the fact that QA and QC aren't in the budget.  It's not what I recommend, but it's something I can at least understand.  But not even knowing what one's software does?  How can one even begin to justify that?

Isn't it all documented somewhere?  Aren't there training materials for the users?  Requirements for the software?  Business designs?  Technical designs?  Even just an intuitive interface that purports to do what the business actually does?

This goes back to something I've been recommending since the day I got there.  You need to model your domain.  We can argue all day about what that means in the code and how to design the applications to make use of this information.  But for the business this concept is of paramount importance.  If you want your software to do what it needs to do, you need to define what it needs to do.  Anything which deviates from that definition is a defect in the software.  That definition is the specification for the software.  It's the training manual.  It's a description of what the business does.  You should know what your business does.
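By "model your domain" I don't necessarily mean anything exotic.  Even a plain object that captures the business's own vocabulary and rules is a start.  Here's a minimal sketch, again in Python with invented names (the Ticket class and its workflow are hypothetical, for illustration only):

    # A hypothetical domain model for a support-ticket workflow.  The point
    # isn't the language or the pattern; it's that the business rules live
    # in one named, inspectable place instead of in someone's head.
    class Ticket:
        OPEN, RESOLVED, CLOSED = "open", "resolved", "closed"

        def __init__(self, title):
            self.title = title
            self.status = Ticket.OPEN

        def resolve(self):
            # Business rule: only an open ticket can be resolved.
            if self.status != Ticket.OPEN:
                raise ValueError("only an open ticket can be resolved")
            self.status = Ticket.RESOLVED

        def close(self):
            # Business rule: a ticket must be resolved before it is closed.
            if self.status != Ticket.RESOLVED:
                raise ValueError("a ticket must be resolved before it is closed")
            self.status = Ticket.CLOSED

Whether those are the right rules is exactly the conversation the business needs to have.  But once they're written down, "correct" finally means something.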

If the tester doesn't know what the software is supposed to be doing, who does?  Is there even agreement across the enterprise of what the software is supposed to do?  For any given piece of functionality, how does one know if it's doing what it should be doing if what it should be doing is undefined?  One employee thinks it should work one way, another employee thinks it should work another way.  Who's correct?

Don't ask the developer, because I'm just going to tell you what I was told to implement and how I implemented it.  To me, it's all correct (save for the occasional actual bug).  Anything that physically works, works as designed.  In a system where the behavior isn't defined, there are by definition no defects.  After all, a defect is where the system isn't doing what it's supposed to be doing.  But if nobody knows what it's supposed to be doing, then that condition can never be met.

This leads us to another decision that was made by the business at some point in the past.  Someone who was in a decision-making position decided that the behavior of the system should be defined by the developer(s).  The behavior of the software, and the validation thereof, was entirely defined by and known only to someone who isn't there anymore.  Intentionally or not, this was by design.  Again, this doesn't change my recommendations today.  It just makes it more difficult for them to follow my recommendations.

This is all well and good and has made for a nice little rant, but where does this leave us?  How can we make this constructive?  Simple.  Learn from the past.  The business is growing, operational costs are growing, and everything will get more expensive as time goes on because the impact to the business carries higher and higher dollar values.  None of us can change how this software came to be.  None of us can change the decisions that were made in the past.  But we can make decisions right now.

Model the domain, build tests against that model.  Then writing and maintaining the actual software becomes almost trivial.
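To continue the hypothetical Ticket sketch from earlier: the tests become the executable specification.  Each one is a sentence the business can read and either agree or disagree with:

    import unittest

    class TicketWorkflowTests(unittest.TestCase):
        # Assumes the hypothetical Ticket class sketched above is in scope.

        def test_a_new_ticket_starts_open(self):
            self.assertEqual(Ticket("printer on fire").status, Ticket.OPEN)

        def test_an_open_ticket_can_be_resolved(self):
            ticket = Ticket("printer on fire")
            ticket.resolve()
            self.assertEqual(ticket.status, Ticket.RESOLVED)

        def test_an_unresolved_ticket_cannot_be_closed(self):
            with self.assertRaises(ValueError):
                Ticket("printer on fire").close()

    if __name__ == "__main__":
        unittest.main()

When one employee thinks the system should work one way and another thinks it should work another way, this is where that argument gets settled.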

2 comments:

  1. Yeah I agree. I feel like without proper automated testing there is no way you can have long-term continuous productivity. I think that without solid continuous integration and continuous testing (automated testing for most of it) in place you'll be slowly going downhill and chasing your tail. Worse, you may not realize it if you are not properly tracking metrics, which is very hard.

    Of course, writing and maintaining tests is a skill and a discipline that has to be developed as well. I'm hoping to develop mine further, but in my experience very few people have spent any time on it. All too easy for someone to jump straight into code, click through some things, and pass it along. Or worse, someone looks down on the practice because it takes extra coding time to write and maintain.

    Listen to Uncle Bob: you don't have the time to not test.

  2. I'll readily admit that I have not developed those skills. But as a consultant it's just as important that I find a resource who has and add them to the team. After all, writing the tests can happen outside of writing the code, so it's a task that can be offloaded to a separate resource. (Often at less cost to the client, since for us that resource will likely be at a discounted rate from our dev farm, which isn't as sinister as it sounds.)

    I need to read more from Uncle Bob. I'll probably finally buy Clean Code today, or whatever it's called. He's definitely right, and I quote him often on the subject. The client may complain that writing automated tests costs so much more than just doing it manually. But how much more? 10x? 20x? Even if it's that high, can they honestly say that they're only ever going to need to test their software 20 times in its entire life? There will only be 20 releases to production, no more? Doubtful.
