Friday, November 2, 2012

Single Responsibility, or It's OK To Duplicate Code

The Single Responsibility Principle doesn't mean that code should "do one thing." Nor does it mean that any given piece of functionality should exist in only one place. It means that code should have one, and only one, reason to change.

But what is a reason to change?

A reason to change can be any number of things... A new requirement, a clarification, an additional feature. It can come from any part of the business. To understand what a reason to change is, we need to understand what a responsibility is. Because the responsibility is what gives us the reason to change.

The responsibility addressed in the Single Responsibility Principle is the role in the ownership of the code to which that code is responsible. It's the owner of that piece of functionality within the business. It's the person who gives you the new requirement, issuing a change to the code. And, as stated by the principle, any given piece of code should have only one owner.
The responsibility is not what the code does, it's for whom the code does it.
Let's say, for example, you're building a handful of web applications to be used by a handful of different lines of business within a company. Each line of business has its own leadership, and within each one there are departments with fairly autonomous leadership, etc. Ultimately, there is a whole slew of stakeholders each claiming some measure of ownership throughout the code.

Now, there's a piece of functionality that's somewhat shared among several of these groups. The first instinct is to avoid duplication and increase code re-use by making this functionality once and having everybody use it, right? Well, sure, that makes sense. And it's a worthy goal. But at what cost do we sometimes pursue that goal?

Note that I said it's "somewhat" shared. There are differences. At first it seems the same for everyone, so you design it once and re-use it everywhere. You build the system around this re-use. You assume it in everything you do. But then something happens. One of the lines of business adds a requirement to this module.

Now what do you do? Let's look at some common options...

1. You put conditional checks in the code so that it behaves one way sometimes and another way other times. First of all, this sounds a lot like a violation of the Interface Segregation Principle as well. You're essentially attempting to create one catch-all interface to handle multiple use cases. Those conditionals will add up, and they will make support and debugging difficult. How many different things does that one module need to do? How many code paths does it need to have? How are you going to test all of those?

2. You tell the stakeholder that they can't make that requirement because "the system wasn't designed for it." I've seen a lot of developers convince stakeholders of this nonsense. They sure as hell can make that requirement, the developer is just too lazy or too committed to his own design to actually meet the business needs, which should be his priority in the first place. The stakeholder shouldn't buy into that excuse, not one bit.

3. You make the change. And, of course, the other stakeholders for the other roles to which this code is responsible are going to see that change and wonder why someone who doesn't own their functionality just changed it. And you're going to have some explaining to do, probably laced with more excuses.

The simple fact of the matter is that the Single Responsibility Principle was violated in this case. Even if you go with the first option and the other stakeholders are kept in the dark, you still changed their module. Their use cases still need to be re-tested. (Or are you just going to assure everybody that it works and there's no need to re-test it? I hope QA doesn't buy into that nonsense.)

Because of this violation, any change to that one piece of code is going to require re-testing and re-validating a lot of stuff because there are too many responsibilities in play. And every separate business role/entity who has a stake in that code has every right to demand to know why their module was changed to support someone else's requirement.

Now, this is coming off a little overly harsh, and I don't intend it as such. The developer's mistake might not have been in the code. Not until he started to make excuses for it, anyway.

Prior to the new requirement, everything was abstracted into a common module. Duplication was eliminated, all was well. It was the advent of the new requirement that changed this. If the developer had no reason to think that this functionality would ever change then there was no reason to duplicate it. But once that requirement came in, the developer had a choice. Refactor out the new responsibility into a separated concern, or throw a quick fix at it and make excuses.

How many times have we seen the latter? How many times have we done the latter?

I know what you're thinking... Won't this lead to duplication? Isn't duplication bad? Well, yes. Duplication in code is bad. But let's define duplication:
Duplication in code happens when two pieces of code do the same thing for the same reason.
That last part is the key thing to notice here. It's not entirely duplication if it's doing the same thing for a different reason. And that reason may be that it's a different responsibility. Again, we strive throughout our careers to eliminate duplicated code... but at what cost?

Often times duplication is orthogonal to coupling. Consider the above example. That module tightly coupled those lines of business. Those individual responsibilities, those separate concerns were irreconcilably joined together. Change one, the others change with it. Sure, we can hide that change behind conditionals and such, but that doesn't fix the problem. It only creates more problems.

How far are we willing to push the cause of removing all duplication? After all, looking through my code I sure do have a lot of loop constructs. I have a ton of these (with better variable names, of course):

foreach (var thing in things)

Is that duplication? Should I strive to eliminate all of that duplication to avoid having those same keywords and keystrokes over and over again throughout the code? Of course not, that's silly. It's taking the notion to an unnecessary extreme.

So, having established that there's an unnecessary extreme... And assuming you'll stipulate that the opposite end of the spectrum, not avoiding duplication at all, is also a bad thing... We are left with a spectrum. And as with any spectrum, there's a pretty wide range of possibilities. To lean to one side and say that all duplication must be avoided is to do so at the cost of tight coupling.
Duplication is orthogonal to coupling.
It's ultimately a matter of cost. What will it cost to support that code? To overly-trivialize the cost analysis...

If we duplicate the same responsibility across multiple modules (obviously undesirable but included here for completeness) then support costs for changes to that responsibility are O(n) where n is the number of times it's duplicated.

If we religiously collapse all duplication into single pieces of globally shared code, thereby increasing coupling, then support costs for changes to any one responsibility are O(n) where n is the number of responsibilities in that code.

If, when we encounter a second responsibility, we refactor out the code which is owned by that responsibility (duplicating some things, if we count duplication by keystrokes alone), then there is an up-front development cost that's O(1) for making the change. Then ongoing costs for any given change to that one responsibility are O(1) for making the change.

I'm no mathologist, but the one-time up-front cost sounds a lot more attractive than the unknown ongoing cost. Sure, as developers the second option to abstract everything is more appealing. It's a more interesting puzzle to solve. (Sidebar: I once heard a great quote on that... "With enough levels of abstraction you never need to solve the actual problem." -Java Architect) But solving interesting puzzles isn't why we're here. You can always solve that one on your own time. (And if, in doing so, you come up with something even better for the business then kudos.) We're here to write software that supports and enhances the business.

Ok, I know the math was an oversimplification. It's just there to illustrate the point. In the long run, separating responsibilities into their own modules reduces support costs. Even at the cost of duplication. After all, it's not really duplication if two pieces of code are doing the same thing but for different reasons.

The Single Responsibility Principle and the Interface Segregation Principle are more valuable to the codebase than one developer's desire to do something clever.