Wednesday, December 3, 2014

A Case Against Comments

Want to hear something opinionated? Comments are a code smell. There, I said it. And I don't apologize for it.

All comments? Well, maybe not all of them. I'll concede that there are exceptions to the rule. Header comments which drive IntelliSense can be useful, for example. Though, seriously, they don't need to be on everything. I'll also concede that once in a while the purpose behind a particular implementation requires a little more explanation than the code alone can provide. Comments like, "We use the magic value 3 here because Vendor X's product requires it. Don't let this magic value leak out of this dependency implementation."
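
To make that concrete, a comment of that flavor might look something like this (a made-up sketch; the class and constant names are invented for illustration):

internal sealed class VendorXShippingAdapter
{
    // Vendor X's API requires service level 3 for overnight shipping.
    // Keep this magic value confined to this adapter; don't let it leak
    // into the rest of the system.
    private const int VendorXOvernightServiceLevel = 3;

    // ...
}

That comment earns its place because it explains a constraint imposed from outside the code, something no amount of renaming or refactoring could express.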

But by and large nearly all of the comments I see in real-world code are a code smell. Why? Because they don't add value. They take up space (not as a measure of bytes in the file, but as a measure of time spent understanding and supporting the code), they clutter the code, and they offer no information that the code shouldn't already have.

I can already hear the responses from every manager, owner, alleged developer, etc... "But without comments how will we know what the code is doing?" As usual, someone else has said it better than I can:
That's the gist of it, really. If a developer on the team isn't able to convey the business logic in a structured and meaningful way, what good are that developer's comments going to be? Does that developer even understand what he or she is writing? If not, aren't those comments just going to add more confusion and misinformation?

Robert Martin has been known to say, "Every comment is an apology." That apology is basically the developer saying that he or she wasn't able to express the meaning in the code, so he or she gave up and just wrote the meaning in comments instead. It's just that... giving up.

My own take on it is, "Every comment is a lie waiting to happen." That is, either the comment is saying something that the code isn't saying (so it's lying) or the code is going to change at some point and invalidate what the comment is saying (so it will be lying).

(And please don't say that we can simply enact a policy whereby developers must update the comments when they update the code. And those updates must be peer reviewed and documented and all that nonsense. Policies like that demonstrate only one thing, that the people enacting those policies have no actual meaningful experience leading a team of software developers.)

Think of it this way...
When you're reading code with lots of comments, do both the code and the comments unambiguously tell the same story?
If not, which one tells the correct story? Which one should tell the correct story? And why should the incorrect story also be told?

If they do both unambiguously tell the same story, then why does that story need to be told twice? Doesn't that violate DRY? What happens when one of their stories changes? Will the other change in the exact same way?

The story need only be told once. Tell it in the code.
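
To make the point concrete, here's a small, made-up illustration (not from any particular codebase). First, the comment tells the story the code doesn't:

// Check whether the customer is allowed to place an order
// (must be active and under the credit limit).
if (customer.Status == CustomerStatus.Active && customer.Balance < customer.CreditLimit)
{
    // ...
}

Then, the code tells the story itself and the comment has nothing left to add:

if (customer.CanPlaceOrder())
{
    // ...
}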

Thursday, September 18, 2014

Being Right Isn't Enough

As engineers we are often faced with a very common problem. How do we convince "business people" that we're right? It seems we're always at odds with each other, doesn't it? Us vs. them? Any avid reader of Dilbert will tell you that we engineers know what we’re talking about and, in fact, that’s why they hired us. So why won't these "business people" listen to us?

I'm going to let you in on a little secret… They do listen to us. They simply disagree with the position as we've presented it. Sure, all our facts may be correct. Sure, our opinions may be valid. But the conclusion, the call to action, doesn’t resonate. It's not that "they" are incompetent or malicious, it's simply that "we" (which includes both "us" and "them") haven't reached a mutually agreeable course of action. (Caveat: Ok, I'll concede that there are some people out there in positions of authority who are incompetent or malicious or who simply don't want to listen to you. There's not much you can do about those ones. The only advice I can offer in that case is to take your leave of the situation. Create an exit strategy, enact that strategy, and go find something else that doesn't endlessly frustrate you.)

So, if they're listening to us, why aren't they agreeing with us? How can we impress upon them the need to agree with us? Well, that depends on that call to action and where "they" see that action taking the team. If you're not familiar with Stephen Covey's Time Management Grid, you should be. (Not just because I'm going to be referencing it a lot here, but also because... well... you should be.) It's a simple grid with two axes, one for "urgency" and one for "importance." And the combinations of those axes result in four well-defined quadrants:



Imagine for a moment that any given work item, anything you do as part of your job, falls into one of those four quadrants. For each item, where do you think it goes? Where do you want it to go? Which quadrant contains the things you want to work on? Which contains the things you want to avoid? The grid makes those distinctions pretty clear:
  • Quadrant 1, Manage: These are the things that need to happen, and they need to happen right now. Production systems have errors and downtime, and fixing them is an immediate need. There are emergencies to resolve, fires to put out, etc. This is where critical decisions need to be made and success needs to be reached. Of course, you don't want to always be here because it means you're always in a hurry, always pressured.
  • Quadrant 2, Focus: These are the things that need to happen, but not right away. Investments in the future of the company, improvements to prepare for long-term strategic goals, etc. This is where you want to spend your time. This is where you work on rewarding things without being yelled at and pressured to cut corners.
  • Quadrant 3, Avoid: These are things which need to be fixed right now, but shouldn't even be a problem in the first place. This is where you do not want to be. This is the quadrant of work where you're pressured for results on things nobody actually cares about.
  • Quadrant 4, Limit: This is a fun little quadrant of things you might like to do but which don't necessarily provide measurable value to the business. (Google made a fortune off of this quadrant by encouraging employees to create their own projects, just FYI.) You want to spend some time here, but not a whole lot. Too much time here means nobody's doing anything important to the business.
Now let's think back to that problem of trying to convince "them" that you are "right." How does this grid help us? Well, now that you know where your work items belong on this grid, try also to imagine where your proposed work items belong. That thing you're trying to convince management to invest in... Where does it fit on the grid? And where do "they" think it fits? Step outside of the debate with "them" for a moment and imagine where on this grid you think they are, and where on this grid they think you are. Clearly you both want to be in Quadrant 2 (Focus). But neither of you thinks the other person is there. So how do you convince them that you are in Quadrant 2?
Tailor your position, and your call to action, based on the axis of movement from where they think you are to where they want you to be.
Let's take a classic example. You're working at a company with an established production system. As an engineer, you see all the problems. The code is a mess, the dependencies are unmanaged, support is difficult and there's too much of it... Technical debt is piling up ever higher. But the thing is, the system works. The business is making money. So, as often happens in these situations, there's no incentive from "them" to invest in improvements to a system that already does what they need it to do. "If it ain't broke, don’t fix it."

But you're the lead software engineer. You're the guy they hired for the specific purpose of ensuring this system continues to work and can continue to meet changing business needs. So why aren't they listening to you when you explain the things that need to be fixed? Because they see you in the wrong quadrant...

You think they're in Quadrant 1, only seeing the immediate concerns and waiting for something to "become a problem" before they solve it. They think you're in Quadrant 4, trying to convince the company to invest in something that is neither urgent nor important. How do you fix that?

Consider how these debates often play out. You talk about time tables, about how things are going to fail if we don't act soon. You're appealing to urgency in this argument. But if they think you're in Quadrant 4, where does an appeal to urgency make it look like you're going?



In that case it's pretty clear why they're against the idea. They see you as moving into the very quadrant where nobody should ever be. The argument they might present in response is to stress how your goals don't necessarily align with the business. How you're not seeing the big picture. What does that look like?



Clearly that's equally distasteful to you. You see them as moving into the very quadrant where nobody should ever be. The debates aren't reaching a middle ground because both sides are moving along different axes, appealing to entirely different measures of the decision. Assuming that you can't change their position, you need to instead present your position in a way which aligns with theirs.

In the example we've been discussing, this means appealing to importance rather than urgency. You and I both know that these fixes to the system need to happen soon, perhaps even now. But that's not what's being discussed. So table the discussion on urgency until another time. If the decision-maker you're trying to convince is evaluating the work along the axis of importance, then focus on that axis. Appeal to importance. In this example that could be a matter of discussing the operational costs of support, or discussing the strategic goals of the system (scaling, additional features, etc.) and what it would take to achieve those goals. What might a goal cost in the current system? What might it cost with some fixes in place first? What might those fixes cost? And so on.



Since "they" think that "they" are already in Quadrant 2, if "you" can convince "them" that you are driving toward Quadrant 2 then that puts you both on common ground. That drives toward a mutually agreeable course of action. Tailor your position to the direction they want you to move, not the direction you think they're moving. This helps bridge that gap from "us vs. them" to "us and them" which is a much more productive team dynamic.

Friday, June 20, 2014

Fun with Expression Trees

A common pattern I find in my MVC/WebAPI code is that my controller actions usually open a unit of work (or just a read-only repository therein), perform some query or insert/update something, and then return. It's a simple thing to expect a controller action to do, after all. But sometimes I come across an entity model where it's a bit difficult to put much of the logic on the model, and instead it ends up in the controller actions.

For example, imagine a system where your models are a fairly complex graph of objects, versions of those objects, dynamic properties in the form of other child objects, versions therein as well, etc. It's a common pattern when one builds a framework in which to configure an application, as opposed to building just an application itself.

Now in this system, imagine that you want all of your queries against "Entity" models to only ever return the ones which have not been soft-deleted. (Viewing soft-deleted records would be a special case, and one we haven't built yet.) Well, if you have a lot of queries against those entities in their repository, those queries are all going to repeat the same ".Where()" clause. Perhaps something like this:

return someRepository.Entities
                     .Where(e => e.Versions
                                  .OrderByDescending(v => v.VersionDate)
                                  .FirstOrDefault()
                                  .Deleted != true)
                     .Where(// etc.

That is, you only ever want Entity records where the most recent Version of that Entity is not in a "Deleted" state. It's not a lot of code (at least, not in this simplified example), but it is repeated code all over the place. And for more complex examples, it's a lot of repeated code. And more importantly than the repetition, it's logic which conceptually belongs on the model. A model should be aware of whether or not it's in a "Deleted" state. The controller shouldn't necessarily care about this, save for just invoking some logic that exists on the model.

At first one might simply add a property to the Entity model:

public bool IsDeleted
{
    get { return Versions.OrderByDescending(v => v.VersionDate)
                         .FirstOrDefault()
                         .Deleted == true; }
}

Then we might use it as:

return someRepository.Entities
                     .Where(e => !e.IsDeleted)
                     .Where(// etc.

That's all well and good from an object oriented perspective, but if you're using Entity Framework (and I imagine any number of other ORMs) then there's a problem. Is the ORM smart enough to translate "IsDeleted" so that it runs on the database? Or is it going to have to materialize every record first and then perform this ".Where()" clause in the code? (Or just throw an error and not run the query at all?) Most likely one of the latter two (with Entity Framework in this case it definitely won't be translated), and that's no good.

We want as much query logic as possible to run on the database for a number of reasons:
  • It's less data moving across the wire.
  • It's a smaller memory footprint for the application.
  • SQL Server is probably a lot better at optimizing queries than any code you or I write in some random web application.
  • It's a lot easier and more standard to horizontally scale a SQL Server database than a custom application.
So we really don't want to materialize all of the records so that our object oriented models can perform their logic. But we do want the logic itself to be defined on those models because, well, object oriented. So what we need on the model isn't necessarily a property, what we need is an expression which can be used in a Linq query.

A first pass might look something like this:

public static Func<Entity, bool> IsNotDeleted = e => e.Versions
                                                      .OrderByDescending(v => v.VersionDate)
                                                      .FirstOrDefault()
                                                      .Deleted != true;

Which we can then use as:

return someRepository.Entities
                     .Where(Entity.IsNotDeleted)
                     .Where(// etc.

This is a good first step. However, if you profile the SQL database when this executes you'll find that the filtering logic still isn't being applied in the SQL query, but rather still in-memory in the application. This is because a "Func<>" doesn't get translated through Linq To Entities, and remains in Linq To Objects. In order to go all the way to the database, it needs to be an "Expression<>":

public static Expression<Func<Entity, bool>> IsNotDeleted = e => e.Versions
                                                                  .OrderByDescending(v => v.VersionDate)
                                                                  .FirstOrDefault()
                                                                  .Deleted != true;

Same code, just wrapped in a different type. Now when you profile the database you'll find much more complex SQL queries taking place. Which is good, because as I said earlier SQL Server is really good at efficiently handling queries. And the usage is still the same:

return someRepository.Entities
                     .Where(Entity.IsNotDeleted)
                     .Where(// etc.

Depending on how else you use it though, you'll find one key difference. The compiler wants to use it on an "IQueryable<>", not things like "IEnumerable<>" or "IList<>". So it's not a completely drop-in replacement for in-code logic. But with complex queries on large data sets it's an enormous improvement in query performance by offloading the querying part to the database engine.
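
If you do need the same rule against an in-memory collection, one option (just a sketch, assuming the entities are already materialized into an IEnumerable<Entity>) is to compile the expression into a plain delegate:

// In-memory usage: compile the expression tree into an ordinary Func<>.
// Compiling isn't free, so cache the compiled delegate if this ends up
// on a hot path.
Func<Entity, bool> isNotDeleted = Entity.IsNotDeleted.Compile();

var notDeleted = alreadyMaterializedEntities   // some IEnumerable<Entity>
                     .Where(isNotDeleted)
                     .ToList();

That way the logic is still defined exactly once, on the model, whether it runs in SQL or in memory.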

There was just one last catch while I was implementing this. In some operations I want records which are "Deleted", and in some operations I want records which are not "Deleted". And obviously this doesn't work:

return someRepository.Entities
                     .Where(!Entity.IsNotDeleted)
                     .Where(// etc.

How should one invert the condition then? I could create a second expression property called "IsDeleted", but that's tacky. Not to mention it's still mostly repeated logic that would need to be updated in both places should there ever be a change. And honestly, even this "IsNotDeleted" bothers me from a Clean Code perspective because positive conditionals are more intuitive than negative conditionals. I should have an "IsDeleted" which can be negated. But how?

Thanks to some help from good old Stack Overflow, there's a simple way. And it all comes down to, again, expression trees. Essentially what's needed is an extension which wraps an expression in a logical inverse. This wrapping of the expression would continue through the expression tree until it's translated at the source (SQL Server in this case). Turns out to be a fairly simple extension:

public static Expression<Func<T, bool>> Not<T>(this Expression<Func<T, bool>> f)
{
    // Wrap the original expression's body in a logical NOT, reusing its parameters.
    return Expression.Lambda<Func<T, bool>>(Expression.Not(f.Body), f.Parameters);
}

See, while there's no ".WhereNot()" or ".Not()" in our normal Linq extensions, there is one for Expressions. And now with this we can wrap our expression. First let's make it a positive condition:

public static Expression<Func<Entity, bool>> IsDeleted = e => e.Versions
                                                               .OrderByDescending(v => v.VersionDate)
                                                               .FirstOrDefault()
                                                               .Deleted == true;

Now let's get records which are deleted:

return someRepository.Entities
                     .Where(Entity.IsDeleted)
                     .Where(// etc.

And records which are not deleted:

return someRepository.Entities
                     .Where(Entity.IsDeleted.Not())
                     .Where(// etc.

Profile the database again and we see that all of the logic is still happening SQL-side. And for the inverted ones, the generated SQL query just wraps the whole condition and negates it exactly as we'd expect it to.

Now, we can still have our calculated properties on our models and we can still do a lot with those models in memory once they're materialized from the underlying data source. But in terms of just querying the data, where performance is a concern (which isn't always, admit it), having some expression trees on our models allows us to still encapsulate our logic a bit while making much more effective use of the ORM and database.

Wednesday, May 21, 2014

Continuous Integration with TFS and ClickOnce

My current project is pretty fast-paced, so we need some good infrastructure to keep mundane concerns out of our way. As an advocate of eliminating cruft in the development process, I naturally wanted to implement a fully-automated continuous integration setup with building, testing, and publishing of the applications involved. Everybody's done this plenty of times with web applications, but it turns out that it's not quite so common with databases and ClickOnce applications. Since all three are included in this project, this is as good a time as any to figure out how to unify them all.

First some infrastructure... We're using Visual Studio 2013, SQL Server (different versions, so we're standardizing on 2008 R2 as a target), and TFS. (I'm actually not certain about the TFS version. It's definitely the latest or close to it, but not really being a "TFS guy" I don't know the specifics. It just works for what we need, I know that much. Beyond that, the client owns the actual TFS server and various controllers and agents.)

The entire solution consists of:
  • A bunch of class libraries
  • A WebAPI/MVC web application
  • A WPF application
  • A bunch of test projects
  • A database project (schema, test data)
The goals for the build server are:
  • A continuous integration build which executes on every check-in. (We're actually using gated check-ins too. I don't like gated check-ins, but whatever.)
  • A test build which executes manually, basically any time we want to deliver something to QA.
And each build should:
  • Compile the code
  • Execute the tests
  • Deploy the database
  • Deploy the web application (with the correct config file)
  • Deploy the ClickOnce WPF application (with the correct config file)
Some of this is pretty much out-of-the-box, some of it very much is not. But with a little work, it's just about as simple as the out-of-the-box stuff. So let's take a look at each one...

Compile The Code

This one is as out-of-the-box as it gets. I won't go into the details of creating a build in TFS, there's no shortage of documentation and samples of that and it's pretty straightforward. One thing I did do for this process, however, was explicitly define build configurations in the solution and projects for this purpose. We're all familiar with the default Debug and Release configurations. I needed a little more granularity, and knew that the later steps would be a lot easier with distinct configurations, so I basically deleted the Release configuration from everything and added a CI and a Test configuration. For now all of their settings were directly copied from Debug.

I used the default build template, and set each respective build (CI and Test) to build the solution with its configuration (Any CPU|CI and Any CPU|Test). Simple.

Execute The Tests

Again, this one is built-in to the TFS builds. Just enable automated tests with the build configuration and let it find the test assemblies and execute them.

Here's where I hit my first snag. I saw this one coming, though. See, I'm a stickler for high test coverage. 100% ideally. (Jason, if you're reading this... let it go.) We're not at 100% for this project (yet), but we are pretty high. However, at this early stage in the project, a significant amount of code is in the Entity Framework code-first mapping files. How does one unit test those?

The simplest way I found was to give the test assembly an App.config with a valid connection string and, well, use the mappings. We're not testing persistence or anything, just mapping. So the simplest and most direct way to do that is just to open a unit of work (which is just a wrapper for the EF context), interact with some entities, and simply dispose of the unit of work without committing it. If valid entities are added to the sets and no exceptions are thrown, the mappings worked as expected. And code coverage analysis validates that the mappings were executed during the process.
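
In practice one of those mapping tests looks roughly like this (a sketch only; UnitOfWork, Entities, and the model types stand in for whatever the real project exposes, and the MSTest-style attributes are an assumption about the test framework):

[TestMethod]
public void EntityMappings_CanBeUsed_WithoutCommitting()
{
    // The unit of work wraps the EF context; the App.config connection
    // string points it at the CI database.
    using (var unitOfWork = new UnitOfWork())
    {
        var entity = new Entity { Name = "mapping smoke test" };
        entity.Versions.Add(new EntityVersion { VersionDate = DateTime.UtcNow });

        unitOfWork.Entities.Add(entity);

        // Querying forces EF to build and validate the model against the database.
        var count = unitOfWork.Entities.Count();
        Assert.IsTrue(count >= 0);

        // No commit: disposing the unit of work throws the work away.
    }
}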

However, this is technically an integration test in the sense that EF requires the database to exist. It does some inspection of that database for the initial mappings. That's kind of what we're testing, so we kind of need a database in place. Perhaps we could write some custom mock that pretends to be a SQL database, but that sounds overly-complicated. For the simplest approach, let's just see if we can deploy the database as part of the build. The upside is that this will validate the database schema as part of the build anyway, which is just more automated testing. And automated testing is a good thing.

So...

Deploy The Database

A lot has changed in SQL Server projects between Visual Studio 2010 and Visual Studio 2012/2013. The project itself doesn't respect build configurations the way it used to. But there's a new mechanism which effectively replaces that. When you right-click on the database project to publish it, you can set all of your options. Then you can save those options in an XML file. So it seemed sensible to me to save them in the project, one for each build configuration. (From then on, publishing from within Visual Studio just involves double-clicking on the XML file for that publish configuration.)

These XML files contain connection strings, target database names, any SQL command variables you want to define, and various options for deployment such as overwriting data without warning or doing incremental deploys vs. drop-and-create deploys. Basically anything environment-specific about your database deployments.

Now we need the TFS builds to perform a database publish. Since TFS builds are basically a TFS workflow surrounding a call to MSBuild, adding MSBuild arguments in the build configuration seemed like a simple way to perform this with minimal effort. First I included the Publish target for the build:
/t:Build /t:Publish
Then I also needed to specify to deploy the database and which publish file to use:
/t:Build /t:Publish /p:DeployOnBuild=true /p:SqlPublishProfilePath=CI.publish.xml
I'm actually not entirely sure at this point if all of those are necessary, but it works well enough. We're targeting both Build and Publish for the build actions and setting a couple of project properties:
  • DeployOnBuild - This is the one about which I'm not entirely certain. This setting doesn't exist in my project files in source control, but it seems to be the setting needed for the SQL Server project to get it to publish. (Or maybe it's used by one of the other projects by coincidence? This was all a bit jumbled together while figuring it out so that's certainly possible.)
  • SqlPublishProfilePath - This is a setting in the SQL Server project file to tell it which of those XML files to use for its publish settings.
This executes as part of the MSBuild step and successfully deploys to the target CI database (or fails if the code is bad, which is just as useful a result), which means that the updated database is in place and ready by the time the TFS workflow reaches the test projects. So when the unit (and integration) tests execute, the CI database is ready for inspection by Entity Framework. All I needed to do was add an App.config to that particular unit test project with the connection string for the CI database.

But wait... What happens when we want to run those tests locally? If we change the connection string in the App.config then we run the risk of checking in that change, which would break the CI build for no particular reason. (Side note: I loathe when developers say "I'll just keep that change locally and not check it in." They always check it in at some point and break everybody else. Keep your team's environments consistent, damn it.) And App.configs don't have transforms based on build configuration like Web.configs do. (Side note: What the crap, Microsoft? Seriously, why is this not a thing? Web.config transforms have been around for, like, a billion years.)

We're going to need to include the correct configs for the WPF application deployments in a later step, so let's add a step...

Use A Different App.config

There are various solutions to perform transforms on an App.config. I tried a few of them and didn't much care for any of them. The most promising one was a tool called SlowCheetah, which came highly recommended by somebody with lots of experience in this sort of thing. But for reasons entirely unknown to me, I just couldn't get the damn thing to work. I'm sure I was missing a step, but it wasn't obvious, so I continued to look for other solutions.

We can use post-build xcopy commands, but we really don't want to do that. And based on my experience with projects which use that option I guarantee it will cause problems later. But one component of this solution does make sense... Keeping the App.configs in separate files for each build configuration. It'll likely be easier than trying to shoehorn some transform into the process.

After much research and tinkering, I found a really simple and sensible solution. By manually editing the items in the project file I can conditionally include certain files in certain build configurations. So first I deleted the Item entries for the App.config and its alternates from the csproj file, then I added this:

<ItemGroup Condition=" '$(Configuration)|$(Platform)' == 'Debug|AnyCPU' ">
  <None Include="App.config" />
</ItemGroup>
<ItemGroup Condition=" '$(Configuration)|$(Platform)' == 'CI|AnyCPU' ">
  <None Include="Configs\CI\App.config" />
</ItemGroup>
<ItemGroup Condition=" '$(Configuration)|$(Platform)' == 'Test|AnyCPU' ">
  <None Include="Configs\Test\App.config" />
</ItemGroup>

Notice the alternate App.config files in their own directories. The nice thing about this approach is that Visual Studio doesn't respect the Condition attributes and shows all of the files. Which is great, because as a developer I want to be able to easily edit these files within the project. But when MSBuild comes along, it does respect the Condition attributes and only includes the file for the particular build configuration being built.

So now we have App.configs being included properly for each build configuration. When running tests locally, developers' Visual Studios will use the default App.config in the test project. When building/deploying on the server, MSBuild will include the specific App.config for that build configuration, so tests pass in all environments without manual intervention. This will also come in handy later when publishing the ClickOnce WPF application.

Next we need to...

Deploy The Web Application

Web application deployments are pretty straightforward, and I've done them about a million times with MSBuild in the past. There are various ways to do it, and I think my favorite involves setting up MSDeploy on the target server. The server is client-owned though and I don't want to involve them with a lot of setup, nor do I want to install things there myself without telling them. So for now let's just stick with file system deployment and we can get more sophisticated later if we need to.

So to perform a file system deploy, I just create a network share for the IIS site that's already set up and add some more MSBuild arguments:
/p:PublishProfile=CI /p:PublishUrl=\\servername\sharename\
The PublishUrl is, obviously, the target path on the file system. The PublishProfile is a new one to me, but it works roughly the same as the XML files for the database publishing. When publishing the web application from within Visual Studio the publish wizard saves profiles in the Properties folder in the project. These are simple XML files just as before, and all we need to do here is tell MSBuild which one to use. It includes the environment-specific settings you'd expect, such as the type of deploy (File System in this case) or whether to delete existing files first, etc. (Now that I'm looking at it again, it also includes PublishUrl, so I can probably update that in the XML files and omit it from the MSBuild arguments. This is a work in progress after all.)

At this point all we need to do is...

Deploy The ClickOnce WPF Application

This one was the least straightforward of them all, mainly for one particular reason. A ClickOnce application is mostly useful in that it detects new versions on the server and automatically upgrades the client when it executes. This detection is based on the version number of what's deployed, but how can we auto-increment that version from within the TFS build?

It auto-increments, well, automatically when you publish from within Visual Studio. And most people online seem content with that. But the whole point here is not to manually publish, but rather to have a continuously deployed bleeding edge version of the application which can be actively tested, as well as have as simple and automated a QA deployment as possible (queue a build and walk away, basically). So we need TFS and/or MSBuild to auto-increment the build number. And they weren't keen on doing that.

So first thing's first, let's get it publishing at all before worrying about the build number. Much like with the publish profiles for the web application, this involved walking through the wizard once in Visual Studio just to get the project settings in place. Once in place, we can examine what they are in the csproj file and set them accordingly in the MSBuild arguments:
/p:PublishDir=\\servername\anothersharename\ /p:InstallUrl=\\servername\anothersharename\ /p:ProductName=HelloWorldCI
The PublishDir and InstallUrl are for ClickOnce to know where to put the manifest and application files from which clients will install the application. The ProductName is just any unique name by which the application is known to ClickOnce, which would end up being its name in the Start Menu and the installed programs on the computer. (At this time I'm actually not sure how to get multiple different versions to run side-by-side on a client workstation. I'm sure it involves setting some other unique environment-specific value in the project, I'm just not sure what.)

So now clients can install the ClickOnce application from the network share, it has the correct App.config (see earlier), and everything works. However, at this time it doesn't detect new versions unless we manually update the version number. And we don't like "manually" around here. Researching and tinkering to solve this led me down some deep, dark rabbit holes. Colleagues advised and assisted, much Googling and Stack Overflowing was done, etc. I was very close to defining custom Windows Workflow actions and installing them on the server as part of the build workflow to over-write the csproj file after running it through some regular expressions. This was getting drastically over-complicated for my tastes.

While pursuing this approach, the big question was where I would persist the incrementing number. It needed to exist somewhere because each build would need to know what the value is before it can increment it. And I really didn't like the idea of putting it in a database somewhere just to support this one thing. Nor did I like the idea of storing it in the csproj file or any other file under source control because that would result in inconsistencies as gated check-in builds are queued. Then it hit me...

We have an auto-incrementing number on TFS. The ChangeSet number.

Now, I've never really edited the build workflow templates before. (Well, technically that's not entirely true. I did make some edits to one while specifically following steps from somebody else's blog post in order to build a SharePoint project. But it was more SharePoint than TFS, and I had no idea what I was actually doing. So I didn't retain much.) And as such I didn't really know what capabilities were in place or how to reference/assign values. But with a little tinkering and researching, I put together something really simple.

First, I added a step to the build workflow template just before the MSBuild step. It was a simple Assign workflow item, and the value it was changing was MSBuildArguments (which is globally available throughout the workflow). Logically it basically amounts to:

MSBuildArguments.Replace("$ChangeSet$", BuildDetail.SourceGetVersion.Replace("C", String.Empty))

That is, it looks for a custom placeholder in the arguments list called $ChangeSet$ and replaces it with the ChangeSet number, which is also globally-available in the workflow as SourceGetVersion on the BuildDetail object. This value itself needs to have its "C" replaced with nothing, since ChangeSet numbers are prepended with "C". Now that I have the "persisted auto-incrementing" number, I just need to apply it to the project settings. And we already know how to set values in the csproj files:
/p:ApplicationRevision=$ChangeSet$ /p:MinimumRequiredVersion=1.0.0.$ChangeSet$
And that's it. Now when the ClickOnce application is published, we update the current version number as well as the minimum required version to force clients to update. Somewhere down the road we'll likely need to update the first three digits in that MinimumRequiredVersion value, but I don't suspect that would be terribly difficult. For now, during early development, this works splendidly.
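
To see the substitution end to end, it amounts to something like this in plain C# (purely illustrative; in the real template MSBuildArguments and BuildDetail.SourceGetVersion are workflow variables, and the changeset number here is made up):

// What the Assign step effectively does, with example values.
string msBuildArguments =
    "/p:ApplicationRevision=$ChangeSet$ /p:MinimumRequiredVersion=1.0.0.$ChangeSet$";
string sourceGetVersion = "C12345";   // as TFS reports it

msBuildArguments = msBuildArguments.Replace(
    "$ChangeSet$",
    sourceGetVersion.Replace("C", String.Empty));

// Result: /p:ApplicationRevision=12345 /p:MinimumRequiredVersion=1.0.0.12345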

So at this point what we have is:

  • Explicit build configurations in the solution and projects
  • XML publish profiles for the database project and the web application project
  • Web.config transforms and App.config alternate files
  • Conditional items in the csproj for the App.config alternate files, based on build configuration
  • A workflow step to replace MSBuild argument placeholders with the ChangeSet number
  • A list of MSBuild arguments:
    • /t:Build /t:Publish /p:DeployOnBuild=true /p:PublishProfile=CI /p:SqlPublishProfilePath=CI.publish.xml /p:PublishUrl=\\servername\sharename\ /p:PublishDir=\\servername\someothersharename\ /p:InstallUrl=\\servername\someothersharename\ /p:ApplicationRevision=$ChangeSet$ /p:MinimumRequiredVersion=1.0.0.$ChangeSet$ /p:ProductName=HelloWorldCI
Replace "CI" with "Test" and we have the Test build. If we want to create more builds (UAT? Production?) all we need to do is:

  • Create the build configurations
  • Create the config files/transforms
  • Create the publish profiles
  • Set up the infrastructure (IIS site, network shares)
  • Create a new TFS build with all the same near-default settings and just replace "CI" in the MSBuild arguments with the new configuration
And that's it. The result is a fully-automated continuous integration and continuous deployment setup. Honestly, I've worked in so many environments where a build/deploy consisted of a long, involved, highly manual, and highly error-prone process. Developers and IT support were tied up for hours, sometimes days, trying to get it right. What I have here, with a few days of research and what boils down to an hour or two of repeatable effort, is a build/deploy process which involves:

  • Right-click on the build definition in TFS
  • Select "Queue New Build"
  • Go grab a sandwich and take a break
I love building setups like this. My team can now accomplish in two mouse clicks what other teams accomplish in dozens of man-hours.

Monday, April 28, 2014

Composition... And Coupling?

Last week I had an interesting exchange with a colleague. We were discussing how some views and view models are going to interact in a WPF application we’re building, and I was proposing an approach which involves composition of models within parent models. Apparently my colleague is vehemently opposed to this idea, though I’m really not certain why.

It’s no secret that the majority of my experience is as a web developer, and in ASP.NET MVC I use composite models all the time. That is, I may have a view which is a host of several other views and I bind that view to a model which is itself a composition of several other models. It doesn’t necessarily need to be a 1:1 ratio between the views and the models, but in most clean designs that ends up happening if for no other reason than both the views and the models represent some atomic or otherwise discrete and whole business concept.

The tooling has no problem with this. You pass the composite model to the view, then in the view where you include your “partial” views (which, again, are normal views from their own perspective) you supply to that partial view the model property which corresponds to that partial view’s expected model type. This works quite well and I think distributes functionality into easily re-usable components within the application.

My colleague, however, asserted that this is “tight coupling.” Perhaps there’s some aspect of the MVVM pattern with which I’m unaware? Some fundamental truth not spoken in the pattern itself but known to those who often use it? If there is, I sure hope somebody enlightens me on the subject. Or perhaps it has less to do with the pattern and more to do with the tooling used in constructing a WPF application? Again, please enlighten me if this is the case.

I just don’t see the tight coupling. Essentially we have a handful of models, let’s call them Widget, Component, and Thing. And each of these has a corresponding view for the purpose of editing the model. Now let’s say our UI involves a single large “page” for editing each model. Think of it like stepping through a wizard. In my mind, this would call for a parent view acting as a host for the three editor views. That parent view would take care of the “wizard” bits of the UX, moving from one panel to another in which the editor views reside. Naturally, then, this parent view would be bound to a parent view model which itself would consist of some properties for the wizard flow as well as properties for each type being edited. A Widget, a Component, and a Thing.
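
To put some code to that, the shape I'm describing is roughly this (names invented for the sake of illustration; the real view models would obviously carry change notification, commands, and so on):

// Child view models: each one usable on its own with its own editor view.
public class WidgetViewModel { /* Widget editing state */ }
public class ComponentViewModel { /* Component editing state */ }
public class ThingViewModel { /* Thing editing state */ }

// Parent view model: the wizard flow plus one property per hosted editor.
public class EditWizardViewModel
{
    public int CurrentStep { get; set; }

    public WidgetViewModel Widget { get; set; }
    public ComponentViewModel Component { get; set; }
    public ThingViewModel Thing { get; set; }
}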

What is being tightly coupled to what in this case?

Is the parent view model coupled to the child view models? I wouldn’t think so. Sure, it has properties of the type of those view models. In that sense I suppose you could say it has a dependency on them. But if we were to avoid such a dependency then we wouldn’t be able to build objects in an object-oriented system at all, save for ones which only recursively have properties of their own type. (Which would be of very limited use.) If a Wizard shouldn’t have a property of type Widget then why would it be acceptable for it to have a property of type string? Or int? Those are more primitive types, but types nonetheless. Would we be tightly coupling the model to the string type by including such a property?

Certainly not, primarily because the object isn’t terribly concerned with the value of that string. Granted, it may require specific string values in order to exhibit specific behaviors or perform specific actions. But I would contend that if the object is provided with a string value which doesn’t meet those criteria it should still be able to handle the situation in some meaningful, observable, and of course testable way. Throw an ArgumentException for incorrect values, silently be unusable for certain actions, anything of that nature as the logic of the system demands. You can provide mock strings for testing, just like you can provide mock Widgets for testing. (Though, of course, you probably wouldn’t need a mocking framework for a string value.)

Conversely, are the child view models tightly coupled to the parent view model? Again, certainly not. The child view models in this case have no knowledge whatsoever of the parent view model. Each can be used independently with its corresponding view regardless of some wizard-style host. It’s by coincidence alone that the only place they’re used in this particular application (or, rather, this particular user flow or user experience) is in this wizard flow. But the components themselves are discrete and separate and can be tested as such. Given the simpler example of an object with a string property, I think we can agree that the string type itself doesn’t become coupled to that object.

So… Am I missing something? I very much contend that composition is not coupling. Indeed, composition is a fundamental aspect of object-oriented design in general. We wouldn’t be able to build rich object systems without it.

Monday, April 21, 2014

Say Fewer, Better Things

Last week, while beginning a new project with a new client, the client made an interesting observation about me. As is usual with a new project, the week was filled with meetings and discussions. And more than once the project sponsor explicitly said to me, "Feel free to jump in here as well." Not in a snarky way, mind you; he just wanted to make sure I'm not waiting to speak and that my insights are brought to the group. At one point he said, "I take it you're the strong silent type, eh?"

Well, I like to think so.

In general it got me thinking, though. It's no secret that I'm very much an introvert, and that's okay. So for the most part I have a natural tendency to prefer not speaking over speaking. But the more I think about it, the more I realize that there's more to it than that.

As it turns out, in a social gathering I'm surprisingly, well, social. I'm happy to crack a joke or tell a story, as long as I don't become too much the center of attention. If I notice that happening, I start to lose my train of thought. In small groups, though, it's not a problem. In a work setting, however, I tend not to jump in so much. It's not that I'm waiting for my turn to speak, it's that I'm waiting for my turn to add something of value.

This is intentional. And I think it's a skill worth developing.

I've had my fair share of meetings with participants who just like to be the center of the meeting. For lack of a better description, they like to hear themselves talk. The presence of this phenomenon varies wildly depending on the client/project. (Luckily my current project is staffed entirely by professionals who are sharp and to the point, for which I humbly thank the powers that be.) But I explicitly make it a point to try not to be this person.

Understand that this isn't because I don't want to speak. This is because I do want to listen. I don't need (or even really want) to be the center of attention. I don't need to "take over" the meeting. My goal is to simply contribute value. And I find that I can more meaningfully contribute value through listening than through speaking.

I'll talk at some point. Oh, I will definitely talk. And believe me, I'm full of opinions. But in the scope of a productive group discussion, are all of those opinions relevant? Not really. So I can "take that offline" in most cases. A lot of that, while potentially insightful and valuable, doesn't necessarily add value to the discussion at hand. So rather than take the value I already know and try to adjust the meeting/discussion/etc. to fit my value, I'd rather absorb the meeting/discussion/etc. and create new value, value I don't already have, that targets the topic at hand.

That is, rather than steer the meeting toward myself, I'd rather steer myself toward the meeting. And doing so involves more listening than talking. Sometimes a lot more.

In doing so, I avoid saying too much. Other people in the meeting can point out the obvious things, or can brainstorm and openly steer their trains of thought. What I'll do is follow along and observe, and when I have a point to make I'll make it. I find this maximizes the insightfulness and value of my points, even if they're few and far between. And that's a good thing. I'd rather be the guy who made one point which nobody else had thought of than the guy who made a lot of points which everybody else already knew. The latter may have been more the center of attention, but the former added more value.

Listen. Observe. Meticulously construct a mental model of what's being discussed. Examine that model. And when the room is stuck on a discussion, pull from that model a resolution to that discussion. After all, concluding a discussion with a meaningful resolution is a lot more valuable than having participated in that discussion with everybody else.

Thursday, April 10, 2014

Agile and Branching

I've recently interacted with an architect who made a rather puzzling claim in defense of his curious and extraordinarily inefficient source control branching strategy. He said:
"One of the core principles of agile is to have as many branches as possible."
I didn't have a reply to this statement right away. It took a while for the absurdity of it to really sink in. He may as well have claimed that a core principle of agile is that the oceans are made of chocolate. My lack of response would have been similar, and for the same reason.

First of all, before we discuss branching in general, let's dispense with the provable falsehoods of his statement. The "core principles" of agile are, after all, highly visible for all to see:
We are uncovering better ways of developing software by doing it and helping others do it. Through this work we have come to value:
  • Individuals and interactions over processes and tools
  • Working software over comprehensive documentation 
  • Customer collaboration over contract negotiation 
  • Responding to change over following a plan 
That is, while there is value in the items on the right, we value the items on the left more.
Pretty succinct. So let's look at them one-by-one in this case:



Individuals and interactions over processes and tools

The intent here is pretty clear. This "core principle" is to favor the team members and how they interact, not to favor some particular tool or process. If a process which works well with one team doesn't work well with another team, that other team shouldn't adopt or adhere to that process. Put the team first, not the process or the tool.

Source control is a tool. Branching is a process. To favor such things despite a clear detriment to the team and to the business value is to explicitly work against the very first principle of agile. Indeed, not only does agile as a software development philosophy make no claim about tools and processes, it explicitly says not to do so.

Working software over comprehensive documentation

In this same environment I've often heard people ask that these processes and strategies at least be documented so that others can understand them. While that might put some balm on the wound in this company, it's not really a solution. If you document a fundamentally broken process, you haven't fixed anything.

The "core principle" in this case is to focus on delivering a working product. Part of doing that is to eliminate any barriers to that goal. If the developers can't make sense of the source control, that's a barrier.

Customer collaboration over contract negotiation

In this case the "customer" is the rest of the business. The "contract" is the requirements given to the development team for software that the business needs. This "negotiation" takes the form of any and all meetings in which the development team and the business team plan out the release strategy so that it fits all of the branching and merging that's going to take place.

That negotiation is a lie, told by the development team, and believed by the rest of the business. There is no need for all of this branching and merging other than to simply follow somebody's process or technical plan. It provides no value to the business.

Responding to change over following a plan

And those plans, so carefully negotiated above, become set in stone. Deviating from them causes significant effort to be put forth so that the tools and processes (source control and branching) can accommodate the changes to the plan.



So that's all well and good for the "core principles" of agile, but what about source control branching? Why is it such a bad thing?


The problem with branching isn't the branching per se, it's the merging. What happens when you have to merge?

  • New bugs appear for no reason
  • Code from the same files changed by multiple people has conflicts to be manually resolved
  • You often need to re-write something you already wrote and that was already working
  • If the branch was separated for a long time, you and team members need to re-address code that was written a long time ago, duplicating effort that was already done
  • The list of problems goes on and on...
Merging is painful. But, you might say, if the developers are careful then it's a lot less painful. Well, sure. That may be coincidentally true. But how much can we rely on that? Taken to an extreme to demonstrate the folly of it, if the developers were "careful" then the software would never have bugs or faults in the first place, right?

Being careful isn't a solution. Being collaborative is a solution. Branching means working in isolated silos, not interacting with each other. If code is off in a branch for months at a time, it will then need to be re-integrated with other code. It already works, but now it needs to be made to work again. If we simply practice continuous integration, we can make it work once.

This is getting a bit too philosophical, so I'll step back for a moment. The point, after all, isn't any kind of debate over what have become industry buzz-words ("agile", "continuous integration", etc.) but rather the actual delivery of value to the business. That's why we're here in the first place. That's what we're doing. We don't necessarily write software for a living. We deliver business value for a living. Software is simply a tool we use to accomplish that.

So let's ask a fundamental question...
What business value is delivered by merging branched code?
The answer is simple. None. No business value at all. Unless the actual business model of the company is to take two pieces of code, merge them, and make money off of that, there is no value in the act of merging branches. You can have those meetings and tell those lies all you like, but all you're doing is trying to justify a failing in your own design. (By the way, those meetings further detract from business value.)

Value comes in the form of features added to the system. Or bugs fixed in the system. Or performance improved in the system. And time spent merging code is time not spent delivering this value. It's overhead. Cruft. And the "core principles" of agile demand that cruft be eliminated.

The business as a whole isn't thinking about how to best write software, or how to follow any given software process. The business as a whole is thinking about how to improve their products and services and maximize profit. Tools and processes related to software development are entirely unimportant to the business. And those things should never be valued above the delivery of value to the business.

Thursday, April 3, 2014

RoboCop-Driven Development

How many cooks are in your kitchen? Or, to use a less politically-correct analogy, what's the chief-to-indian ratio in your village? If you're trying to count them, there are too many.

There are as many ways to address this problem as there are stars in the sky. (Well, I live in a populated area, so we don't see a whole lot of stars. But there are some, I'm sure of it.) There can be a product owner at whom the buck stops for requirements, there can be quick feedback cycles to identify and correct problems as early as possible, there can be drawn-out analysis and charting of requirements to validate them, prototyping to validate them, etc.

But somehow, we often find ourselves back in the position that there are too many cooks in the kitchen. Requirements are coming from too many people (or, more appropriately, too many roles). This doesn't just end up killing the Single Responsibility Principle, it dances on the poor guy's grave.

Let me tell you a story. Several years ago I was a developer primarily working on a business-to-business web application for a private company. The company, for all its failings, was a stellar financial success. Imagine a dot-com-era start-up where the money never ran dry. I think it was this "can't lose" mentality which led to the corporate culture by which pretty much anybody could issue a requirement to the software at any time for any reason.

One day a requirement made its way to my inbox. On a specific page in the application there was a table of records, and records which meet a specific condition need to be highlighted to draw the attention of the user. The requirement was to make the text of those table rows red. A simple change to say the least. And very much a non-breaking change. So there wasn't really a need to involve the already over-burdened QA process. The change was made, developer-tested, and checked in. Another developer could even have code-reviewed it, had that process been considered necessary.

Around the same time, a requirement from somebody else in the business made its way to another developer's inbox. On a specific page in the application there was a table of records, and records which meet a specific condition need to be highlighted to draw the attention of the user. The requirement was to make the background of those table rows red. A simple change to say the least. And very much a non-breaking change. So there wasn't really a need to involve the already over-burdened QA process. The change was made, developer-tested, and checked in. Another developer could even have code-reviewed it, had that process been considered necessary.

Both requirements were successfully met. And, yes, the changes were successfully promoted to production. (Notice that I'm using a more business-friendly definition of the word "successfully" here.) The passive-aggressive side of me insists that there's no problem here. Edicts were issued and were completed as defined. The professional side of me desperately wants to fix this problem every time I see it.
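
To make the conflict concrete, here's a hypothetical sketch of the two changes side by side. The jQuery calls and the "flagged" class are invented for illustration; the actual implementation details are long gone, and only the combination matters:

<script type="text/javascript">
// Requirement A, implemented and checked in by one developer:
$('tr.flagged').css('color', 'red');

// Requirement B, implemented and checked in independently by another developer:
$('tr.flagged').css('background-color', 'red');

// Each change "works" on its own. Together they render red text on a red
// background, hiding exactly the rows the users were supposed to notice.
</script>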

Sometimes requirements, though individually well-written and well-intentioned, don't make sense when you put them together. Sometimes they can even be mutually exclusive. The psychology of this can be difficult to address. We're talking about two intelligent and reasonable experts in their field. Each individually put some measure of thought and care into his or her own requirement. That requirement makes sense, and it's necessary for the growth of the system to meet business needs. Imagine trying to explain to this person that their requirement doesn't make sense. Imagine trying to say, "You are right. And you over there are also right. But together you are wrong."

To a developer, that conversation always sounds like this:

[Embedded video: "The Expert" — the comedy sketch in which a client asks for seven red lines, some drawn with green ink, all strictly perpendicular]

Someone wanted the lines to be red. Somebody else wanted them to be drawn with specific inks. Somebody else wanted there to be seven of them. Somebody else wanted them to be perpendicular to each other. And so on and so forth. In the video all of those requirements came filtered through what appeared to be a product owner, but the concept is the same. And to a developer there isn't always a meaningful difference. "You" made this requirement can be the singular "you" or the plural "you"; it's just semantics.

Remember that "business-friendly" definition of "successful"? That is, we can all work diligently to achieve every goal we set forth as a business, and still not get anywhere. The goals were "successfully" met, but did the product successfully improve? We did what we set out to do, and we patted ourselves on the back for it. All the while, nothing got done. Each individual goal was reached, but if you stand back and look at the bigger picture you clearly see that it was all a great big bloody waste of time.

(If your hands-on developers can see the big picture and your executives can't, something's wrong.)

Creators of fiction have presented us with this group-think phenomenon many times. My personal favorite, for reasons unknown to me, is RoboCop 2. If you'll recall, RoboCop had a small set of prime directives. These were his standing orders, which he was physically incapable of violating. They were hard-coded into his system, and all subsequent orders had to at least maintain these:
  1. Serve the public trust
  2. Protect the innocent
  3. Uphold the law
  4. (There was a fourth, "classified" directive placed there by the suits who funded his construction, preventing him from ever acting against them personally. That was later removed.)
There's lots of precedent for "directives" such as these. Much of the inspiration clearly comes from Isaac Asimov's three laws of robotics. Hell, even in real life we have similar examples, such as the general orders presented to soldiers in the US Army during basic training. It's an exercise in simplicity. And, of course, life is never that simple. Asimov's stories demonstrated various ways in which the Three Laws didn't work as planned (or worked completely differently than planned, such as when a robot invented faster-than-light travel). And I'm sure any soldier will tell you some stories of real life after training.

But then fast-forward to RoboCop 2, when the special interest groups got involved. It became political. And his 3 directives quickly ballooned into over 300 directives. Each of which, I'm sure, was well-intentioned. But when you put all of them together, he became entirely unable to function. He had too many conflicting requirements. Too many individual "rights" added up to one great big "wrong."

(Parents of the world would be perplexed. Two wrongs don't make a right, but two rights can very easily make a wrong.)

Is your business environment a dizzying maelstrom of special interest groups? Do they each have an opportunity to individually contribute to the product? Is this helping the product? Which is more important to the business... Appeasing every group individually or building a compelling product?

Tuesday, April 1, 2014

Scientists Have Been Doing TDD for Centuries

I've recently been watching Cosmos: A Spacetime Odyssey (and I don't have the words to describe how excellent it is), and the subject of what science really is came up with my children. Fundamentally, aside from all the various beliefs and opinions that people have, how can one truly know what is "science" and what is not?

The answer turns out to be very simple. The science is in the test. Ideas, opinions, even reason and logic themselves are all ancillary concepts to the test. If it isn't tested (or for whatever reason can't be tested, such as supernatural beliefs), then it isn't science. That isn't to say that it's wrong or bad, simply that it isn't science.
Science is the act of validating something with a test.
You can believe whatever you like. You can even apply vast amounts of reason and logic and irrefutably derive a very sensible conclusion about something. But if you don't validate it with a test, it's not science. It's conjecture, speculation. And as I like to say at work, "Speculation is free, and worth every penny."

A Google search on "the scientific method" (which I think we can agree is the core foundation of all scientific study) lands me on a site called Science Buddies, which presents this handy graphic to visualize the process:

[Graphic: the Science Buddies flowchart of the Scientific Method]
This seems similar enough to what I remember from grammar school. Notice a few key points:
  • Attention is drawn to the bright green box. (Green means good, right?) That box is the center of the whole process. That box is the test being conducted.
  • The very first step is "Ask a Question." Not define what you think the answer is, not speculate on possible outcomes, but simply ask the question.
  • There's a cycle to the process, using the tests to validate each iteration of the cycle. The results of the tests are fed back into the process to make further progress.
This reminds me of another cycle of testing:

[Graphic: the red/green/blue TDD cycle]
Even the coloring is the same. The first step, in red, is to ask the question. The central step, in green, is to run a test against the implementation to produce results, validating both the test and the implementation. The loop back step, in blue, is to adjust the system based on the results of the test and return to the start of the cycle, using the test and its results to further refine and grow the next test(s).
TDD is the Scientific Method applied to software development.
The two processes share core fundamental values. Values such as:

  • Begin by asking a question. The purpose of the process is not to answer the question, but to arrive at a means of validating an answer. Answers can change over time; the key value is in being able to validate those answers.
  • Tests must be repeatable. Other team members, other teams, automated systems, etc. must be able to set up and execute the test and observe the results. If the results are inconsistent then the tests themselves are not valid and must be refined, removing variables and dependencies to focus entirely on the subject being tested.
  • The process is iterative and cyclic. Value is attained not immediately, but increasingly over time. Validating something once is only the first step in the process; if the same thing can't be validated repeatedly, then it wasn't valid in the first place and the initial result was a false positive. Only through repetition and analysis can one truly be confident in the result.
We've all heard the analogy that TDD is similar to double-entry bookkeeping. And that analogy still holds true. This one is just resonating with me a lot more at the moment. It's not just the notion that the code and the tests mutually confirm each other, but that the fundamental notion of being able to validate something with a repeatable test is critical to truly understanding that something. Anything else is just speculation.
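
As a minimal sketch of one trip around that cycle, here's what it might look like in plain JavaScript using Node's built-in assert module. The shippingCost function and its free-shipping rule are invented purely for illustration:

var assert = require('assert');

// Red: ask the question first, as an executable test.
// "What should shipping cost for an order under the free-shipping threshold?"
function testShippingIsChargedUnderTheThreshold() {
  assert.strictEqual(shippingCost(49.99), 5.00);
}

// Green: write just enough implementation to make that test pass.
function shippingCost(orderTotal) {
  return orderTotal >= 50 ? 0 : 5.00;
}

// Repeat: the next question becomes the next test, and the growing suite
// keeps every earlier answer repeatable.
function testShippingIsFreeAtTheThreshold() {
  assert.strictEqual(shippingCost(50.00), 0);
}

testShippingIsChargedUnderTheThreshold();
testShippingIsFreeAtTheThreshold();

The point isn't the trivial arithmetic; it's that each answer is validated by a repeatable test rather than by speculation.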

Monday, March 24, 2014

Don't Judge Too Quickly

Writing software is a team effort. And as with any team effort, we have to work together. This includes the ever-dreaded notion of having to read, understand, and perhaps even maintain code written by somebody else. Maybe that "somebody else" is a close and trusted colleague, maybe it is a client-appointed counterpart, maybe it is a team member who left the team long ago. Regardless of who originated the code we're looking at, that developer is in some way a part of the overall team.

And let's be honest here... We don't like working on other people's code. It's not something to be ashamed of. After all, software development is in many ways a creative endeavor, an ill-defined Venn intersection between creativity, engineering, salesmanship, sarcasm, and caffeine. And who wants to create on top of something that's already been created? A painter prefers a blank canvas. A sculptor prefers an unbroken stone. But these analogies extend only so far in our craft, and clinging too tightly to them detracts from the team.

So how do we approach existing ("legacy") code? Hopefully not with judgements and preconceptions.

Over the years I've heard lots of good advice from my manager. Recently he made a suggestion which particularly resonated with me, "Don't pass judgement on their code for at least a month." My initial internal reaction was to remind myself that "I know these code smells all too well" and that "I've had enough conversations with these guys to know that they have no idea what they're doing." The sheer arrogance of these thoughts quickly became apparent. Clearly I needed to do a little less deciding and a little more observing.

What can be observed is the history of the codebase: why certain things were done certain ways, and what other factors have been in place (including the factors which have long since vanished after leaving their discernible mark in the code). What becomes clear during this period is the subtle reminder that our clients and counterparts are not unintelligent, malicious, or insane, and that as reasonable people with reasonable goals they have time and again made reasonable decisions with the information they had. Maybe that information was incorrect or incomplete, or maybe that information has changed over time.

"This legacy code is a mess" is in every way a cop-out. It's how we throw in the towel, declaring to the world that we are not in a position to help. Even if it's correct, it's not productive. (This reminds me of another nugget of management wisdom I've heard, "Being right isn't good enough.") The productive approach is to understand the decisions of yesterday so that we can better advise on the decisions of today.

A significant reason why this advice resonates with me is that, during this "month of observation," a curious turning of the tables has occurred. While at this client, I've written some of the best code of my life. It's a small project, and one in which I was explicitly given freedom to demonstrate some technologies, patterns, and practices which could be applied elsewhere in the company's software. It was both a solution to a simple problem and a proof of concept for solution patterns in general. And almost immediately after I completed development of this codebase, the client hired a new Solution Architect who took one look at it and didn't like it.

It wasn't what he expected. Or, rather, it wasn't what he would have written. And because it didn't fit his mold, it was to him in every way a "legacy mess." How could this be? Patterns are documented, concerns are separated, test coverage is high, each individual fully-encapsulated component is elegant in its simplicity. (At least, I think so.) And it was everything he didn't want in an architecture.

Who is he to tell me that my baby is ugly? Well, who am I to tell any client that their baby is ugly?

What he hasn't been able to tell me is any meaningful reason why my code is "bad." If he could, I'd be happy to listen to his points. Any opportunity to learn and improve is welcome. But "I don't like it" isn't such an opportunity. So, following that same reasoning, what meaningful points can I provide the client about their code? Why don't I like it?

Those points can be productive and helpful, as long as they're backed by meaningful information. And that information starts with understanding the existing codebase and why each component is what it is. Despite years of experience, that understanding doesn't come from a cursory look at the code but rather from time spent "in the trenches" working, studying, observing. Doing so "for at least a month" sounds like as good a rule of thumb as any.

Monday, March 17, 2014

Dependency Injection Screencast

Here's a screencast of a presentation I've given a few times at local events, called "What Is Dependency Injection?" It's basically an intro-to-DI for an audience of developers (both new and experienced) who haven't been exposed to DI implementations but have been hearing about it, or perhaps developers who haven't used it but suddenly need to and want to know what it's all about.

Enjoy!

Friday, March 14, 2014

Refactoring Screencasts VI

Recently I've once again had the occasional opportunity to sit in a (mostly) quiet room and record some more screencasts, so here's the Dealing With Generalization series. Enjoy!

Pull Up Field

Pull Up Method

Pull Up Constructor Body

Push Down Method

Push Down Field

Extract Subclass

Extract Superclass

Extract Interface

Collapse Hierarchy

Form Template Method

Replace Inheritance With Delegation

Replace Delegation With Inheritance

Tuesday, March 11, 2014

On .on(), or Event Delegation Explained

Just as I'm quite sure there are straightforward concepts unfamiliar to me in technologies and architectures with which I have little or no experience, so too are there concepts which I take for granted each day but which befuddle otherwise capable developers who simply haven't had exposure to them. (Of course, this one also befuddles developers who really should understand it by now, but I digress.) So perhaps I can "do the universe a solid" and attempt to share something I've long taken for granted, having been mostly oblivious to the fact that there are others who don't have it.

In this case, I'm speaking of jQuery's .on() function. We've all seen the recommendations to use it, though the more I see such recommendations on, say, Stack Overflow answers, the more I discover that there's no shortage of developers who treat .on() as a magic spell. An incantation which, when invoked, solves their problem. Or at least it should, and when it doesn't, they are confused again.

Allow me to start by stating a few unshakable truths regarding this sorcery:
  1. .on() is not magic.
  2. Event handlers are attached to elements, not to selectors.
  3. When an event handler is attached to an element, the selector is evaluated once at that time and never again.
Some of that sounded pretty matter-of-fact as far as blanket statements go, and indeed perhaps I could wordsmith it better. But I wanted it to sound that way, because I want to get your attention. You may point out that .on() uses event delegation, which clearly means the selector is evaluated again at another time. However, if you know what you're talking about then you know that's kind of a misleading statement, because you're referring to a different selector. And if you don't know what I meant by that statement, then this article is for you.

First, let's consider the following code:

<button class="editButton">Edit</button>
<script type="text/javascript">
$('.editButton').click(function () {
  // handle the button click event
});
</script>

A very simple and straightforward jQuery event handler. When the button is clicked, the handler code runs. In the highly dynamic world of AJAX and dynamic DOM structures, however, this straightforward approach doesn't always work. A developer then asks, "Why doesn't my click event handler execute for edit buttons that are added to the page via AJAX?" A very reasonable novice question, often met with an equally novice answer... "Just use .on() instead."

This might lead the developer to try this approach:

<button class="editButton">Edit</button>
<script type="text/javascript">
$('.editButton').on('click', function () {
  // handle the button click event
});
</script>

Now the developer is using .on(), but he didn't solve the problem. The magic spell didn't work. So the question is re-asked and further clarification is provided, again in the form of an incantation, "Use 'document' and add a second selector." This exchange leads the developer here:

<button class="editButton">Edit</button>
<script type="text/javascript">
$(document).on('click', '.editButton', function () {
  // handle the button click event
});
</script>

This "works" in the strictest definition of the word. (That is, when comparing "works" with "doesn't work" as an equally vague description of events.) It solves the problem of giving a man a fish so that he may eat today. And in a world where quarterly earnings are valued above long-term sustainability that may be enough. But as developers we don't necessarily live in that world. We have to stomach it from time to time, but our world is about long-term sustainability. The code we write needs to be understandable by whoever has to support it, including ourselves. That is, the developer needs to understand why this incantation "works."

Let's switch gears for a moment and discuss the DOM and events therein. We're all familiar with the fact that when you click something, it raises a click event. (There are plenty of other events as well, but for the purpose of this discussion we'll just focus on the most common of them... click.) Some of us may also be familiar with the fact that events "bubble up" in the DOM, which is at the heart of how .on() performs its "magic." Consider an HTML structure:

<body>
  <div>
    <button class="editButton">Edit</button>
  </div>
</body>

When you click on the button, the button invokes its click event and any handlers for that event attached to that element are thus invoked. But then the event "bubbles up" the structure. After the button invokes its event, the div invokes its click event, invoking any handlers attached to the div. Then the body invokes its click event for handlers attached to it. The top-level html tag then invokes its click event, and finally the document object at the very top of the DOM invokes its click event.
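
As a quick sketch, attaching a handler at each level of that structure makes the bubbling order visible (console.log is just here for illustration):

<script type="text/javascript">
$('.editButton').on('click', function () { console.log('button'); });
$('div').on('click', function () { console.log('div'); });
$('body').on('click', function () { console.log('body'); });
$(document).on('click', function () { console.log('document'); });
// A single click on the button logs: button, div, body, document
</script>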

That's a lot of click events for a single click. So how does this relate to the "working" incantation above? In that code, look at the element to which we're actually binding the click event:

<button class="editButton">Edit</button>
<script type="text/javascript">
$(document).on('click', '.editButton', function () {
  // handle the button click event
});
</script>

We're binding it to the document. Let's assume for a moment that the HTML being dynamically changed via AJAX calls is the div and the button. This means we could also have bound to the body or the html (though I don't think I've ever seen that latter one in the wild) and it would still "work." This is because any element in the DOM, when raising this event, propagates the event upwards through the structure of the DOM. (Unless, of course, propagation is explicitly stopped.)

Indeed, every click anywhere in the DOM is going to invoke the click event handler(s) on the document object. This is where that other selector in .on() comes in. It's a filter for the event propagation, indicating to jQuery that this particular function should only be invoked when the originating element matches that selector. That selector is evaluated when the event is invoked, whereas the selector for assigning the event was evaluated only once when the page was loaded.
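
Here's a sketch of that timing difference in action; the appended button is hypothetical, standing in for markup that would normally arrive in an AJAX response:

<script type="text/javascript">
// Bound once, at page load. The 'document' selector is evaluated now;
// the '.editButton' filter is not evaluated yet.
$(document).on('click', '.editButton', function () {
  // handle the button click event
});

// Later, new markup shows up (say, in an AJAX success callback):
$('body').append('<button class="editButton">Edit</button>');

// Clicking the new button still runs the handler, because the click bubbles
// up to document and the '.editButton' filter is checked against the
// originating element at that moment.
</script>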

Had we done this, our event handler would execute for any click anywhere on the page:

<button class="editButton">Edit</button>
<script type="text/javascript">
$(document).on('click', function () {
  // handle the button click event
});
</script>

This is also why the developer's first attempt to use .on() "worked" in the sense that his initially loaded button(s) still invoked the handler, because it still attached that handler to the selected elements. But as new elements are added to the DOM, those elements have no handlers attached to them and so it "didn't work" for them.

So basically, the event originates at the clicked element (in this case a button) and travels upward through the HTML elements all the way to the top. That event can be "handled" anywhere along the way. The sorcery of .on() then is simply that it's handled further up the chain instead of at the element itself. This allows us to dynamically alter elements within the document without losing the handler, because even though newly added elements don't have their own handlers assigned they do still "bubble up" their events to parent elements.

It's common to attach these handlers to the document object, though any common parent element will "work." Let's say you have a containing div and within that div you add/remove elements. As long as the div element isn't removed then you can keep your event handler there. (For a large and complex page it's probably best not to put all handlers on the document object, if for no other reason than saner organization of code.) Any common parent over the dynamic elements will "work."
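
Here's a sketch of that narrower approach, using a hypothetical container id:

<div id="editButtonContainer">
  <!-- edit buttons are added and removed in here via AJAX -->
</div>
<script type="text/javascript">
// The container itself is never replaced, so delegating from it "works"
// just like delegating from the document object, but with a smaller scope:
$('#editButtonContainer').on('click', '.editButton', function () {
  // handle the button click event
});
</script>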

It's not magic, and it's important to understand which parts of the code are evaluated at what time. The selector which targets the element(s) on which we're assigning a handler is evaluated once and never again. The selector which filters the child element(s) which originated the event is evaluated each time the event occurs.