Friday, November 18, 2011

Math, Boys and Girls

Our CTO posted an interesting question on our Yammer feed yesterday:
In a mythical land, parents keep having children until a boy is born, and then they stop.
Without doing the math, does this result in more girls than boys on average, or more boys than girls?
Doing the math?  Man, I haven't "done the math" since school.  And there's a lot that I didn't retain, so I can't say for certain if I ever did this.  But one thing's for sure... I never approached math from this angle back in school.  You remember how it was, right?  There's a textbook, there are problems, you solve them, you're done.  Nothing really thought-provoking like this, at least for us non-math-majors.

So I couldn't really think of an equation or anything to approximate this.  Looking back at some of the responses now, it seems kind of obvious.  It's just not how I've approached problems in the past, so it's not how I think.  No, how I think is to write a program to simulate it:

using System;
using System.Collections.Generic;
using System.Threading.Tasks;

namespace RockysQuestion
{
    class Program
    {
        static void Main(string[] args)
        {
            while (true)
            {
                new Family();
                Console.WriteLine(
                    string.Format(
                        "Total Boys: {0} | Total Girls: {1}",
                        Globals.TotalBoys,
                        Globals.TotalGirls));
            }
        }
    }

    static class Globals
    {
        public static int TotalBoys { get; set; }
        public static int TotalGirls { get; set; }
    }

    class Family
    {
        private static Random _rand = new Random();
        private Task _reproducing;
        
        public int Boys { get; set; }
        public int Girls { get; set; }

        public Family()
        {
            Boys = 0;
            Girls = 0;

            _reproducing = Task.Factory.StartNew(Reproduce);
        }

        private void Reproduce()
        {
            while (Boys == 0)
                HaveBaby();
            Globals.TotalBoys += Boys;
            Globals.TotalGirls += Girls;
        }

        private void HaveBaby()
        {
            if (_rand.Next(2) == 0)
                Boys++;
            else
                Girls++;
        }
    }
}

Watching the output, it would appear to be a 50/50 split with statistically acceptable error.  Note that I'm not 100% sure how thread-safe these lines are:

Globals.TotalBoys += Boys;
Globals.TotalGirls += Girls;

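A += on a shared int is a separate read, add, and write rather than a single atomic operation, so until I do some more detailed analysis of the code it's possible that some of the child counts get lost when two tasks update the totals at the same time.  (The shared static Random isn't documented as thread-safe either.)  But the numbers quickly become large enough that this effect also likely becomes statistically insignificant.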
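For comparison, here is a minimal sketch of one way the tallies could be made safe, assuming .NET 4's Parallel.For and Interlocked are acceptable.  The names Tally, Simulation, Run and GirlsBeforeFirstBoy are made up for this sketch and are not part of the program above.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

static class Tally
{
    // Fields rather than auto-properties: Interlocked needs a ref to the storage.
    public static long TotalBoys;
    public static long TotalGirls;
}

static class Simulation
{
    // Hypothetical helper: one family has children until the first boy arrives.
    static int GirlsBeforeFirstBoy(Random rand)
    {
        var girls = 0;
        while (rand.Next(2) != 0)
            girls++;
        return girls;
    }

    public static void Run(int families)
    {
        Parallel.For(
            0, families,
            // Random is not thread-safe, so each worker gets its own instance.
            () => new Random(Guid.NewGuid().GetHashCode()),
            (i, loopState, rand) =>
            {
                // Atomic updates, so no counts are lost between threads.
                Interlocked.Add(ref Tally.TotalGirls, GirlsBeforeFirstBoy(rand));
                Interlocked.Increment(ref Tally.TotalBoys);
                return rand;
            },
            rand => { });
    }

    static void Main()
    {
        Run(1000000);
        Console.WriteLine(
            "Total Boys: {0} | Total Girls: {1}",
            Tally.TotalBoys, Tally.TotalGirls);
    }
}
```

Because every family stops at its first boy, the boy total should always equal the number of families here, which makes lost updates easy to spot.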

As I watch the numbers tick by, they seem to be neck-and-neck.  Sometimes the boys are ahead, sometimes the girls are ahead.  I'm sure I can make a better visualization of it, but I wanted to keep it as a simple console output for this post.  It looks like one gender can overtake the other by a significant margin for a significant amount of time, but if I re-start then it's just as likely that the other gender will do the same.  So testing seems to indicate that it's "close enough."

Of course, this is an over-simplification of the real-world implications.  Would a family with 4 girls continue to reproduce?  10 girls?  What if there are twins?  Does this account for the overall population of the species (mortality rates, etc.) or just the birth rate?  The question is just about the math, really.  It's not about the real-world scenario.  It is, after all, a mythical land.  I'm sure all of these families also live in perfect spheres resting on frictionless planes.

As the discussion went on, people naturally started to do the math.  After all, we're geeks.

Kevin presented a point indicating that it should be 50/50:
In 50% of the cases there will be one boy and no girls
In 25% of the cases there will be one boy and one girl
In 12.5% of the cases there will be one boy and two girls
In 6.25% of the cases there will be one boy and three girls
In 3.125% of the cases there will be one boy and four girls
Extending into infinity

Each iteration with more girls is less and less likely and offsets the 50% chance of having one boy and no girls. The case that they have ten girls before they have a boy has a probability of about 0.05%
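That list can be summed directly.  As a sketch, assume every birth is independently a boy or a girl with probability 1/2 each (the same assumption the simulation makes), and write G for the number of girls in one family (a symbol introduced here just for illustration):

```latex
\[
P(G = k) = \left(\tfrac{1}{2}\right)^{k+1}, \qquad k = 0, 1, 2, \dots
\]
\[
E[G] = \sum_{k=0}^{\infty} k \left(\tfrac{1}{2}\right)^{k+1}
     = \tfrac{1}{2} \sum_{k=0}^{\infty} k \left(\tfrac{1}{2}\right)^{k}
     = \tfrac{1}{2} \cdot \frac{1/2}{\left(1 - 1/2\right)^{2}}
     = 1
\]
```

Every family has exactly one boy, so under these assumptions the expected number of girls per family matches the number of boys.  The ten-girls case is k = 10, which gives (1/2)^11, or about 0.05%.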
He later continued:
Assuming that infinite couples are able to have kids forever until they have a boy, the ratio will be 1:1. Anything less than an infinite number of couples being able to keep having kids forever will necessarily result in slightly more boys than girls and there will never be more girls than boys (statistically, outliers aside). You have to get back to the idea of a fair coin toss in statistics to understand why this is so.

So let’s run through a few examples. Let's say we have 100 couples, what happens? (Using normal rounding rules)
50 couples will have one boy, no girls
25 couples will have one boy, one girl
13 couples will have one boy, two girls
6 couples will have one boy, three girls
3 couples will have one boy, four girls
2 couples will have one boy, five girls
1 couple will have one boy, six girls.

This is a total of 100 boys (always equal to the number of couples) and 97 girls or a ratio of 1:.97.

So how about 500 couples?
250 couples will have one boy, no girls
125 couples will have one boy, one girl
63 couples will have one boy, two girls
31 couples will have one boy, three girls
16 couples will have one boy, four girls
8 couples will have one boy, five girls
4 couples will have one boy, six girls
2 couples will have one boy, seven girls
1 couple will have one boy, eight girls

This is a total of 500 boys and 494 girls or a ratio of 1:.988. Closer to 1:1 but not there yet.

The key to understanding what is happening is that for every extra girl that is born in the progression, the likelihood of it happening is halved. As you stretch out into more and more couples having the chance to have more and more children the ratio will keep getting closer and closer in a straight progression but never quite getting to 1:1 until you reach out into infinity.
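The rounded tables above can be reproduced with a few lines of code.  This is only a sketch of that arithmetic; RoundedFamilies and CountGirls are names made up here, and "normal rounding" is assumed to mean that halves round up.

```csharp
using System;

static class RoundedFamilies
{
    // Total girls when `couples` families are split using rounded halves.
    public static int CountGirls(int couples)
    {
        var totalGirls = 0;
        for (var girls = 0; ; girls++)
        {
            // Unrounded number of families with exactly `girls` girls.
            var exact = couples / Math.Pow(2, girls + 1);

            // Halves round up here; Math.Round's default rounds them to even.
            var families = (int)Math.Round(exact, MidpointRounding.AwayFromZero);
            if (families == 0)
                break;

            totalGirls += families * girls;
        }
        return totalGirls;
    }

    static void Main()
    {
        Console.WriteLine(CountGirls(100)); // 97
        Console.WriteLine(CountGirls(500)); // 494
    }
}
```

Note that the loop stops at the first row that rounds to zero families, so the rare very large families are dropped from the count along with the other rounding differences.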
Eric offered another way to look at the numbers:
It might help to look at the problem from the perspective of the kids. If we were to divide them up into groups by their birth order (1st children, 2nd children, and so on), we know that each group has a 50/50 split. These groups also form a complete partition of the set of kids: no child is in more than one group and every child is in a group. As long as we don’t have infinite kids, that is enough to show that there are equal numbers.
This all made a lot of sense.  Statistically, it's 50/50 within an acceptable margin of error.  But nobody likes margins of error :)  So Sergey offered up an interesting point on the math:
From a pure statistical standpoint there will be more boys than girls because there are exactly 50% boys, while the number of girls is a progression toward 50% (1/2 of 50% + 1/2 of 25% + 1/2 of 12.5%, etc…) never quite reaching it. That assumes exactly 50% probability with no variations.
All in all, pretty interesting stuff.  I look forward to more of these.  In fact, while I was typing this, Jason posted a new one today:
Let's say you're in a game show where you have the choice to pick from one of three rooms. Two of the rooms contain nothing, and the other room has $1,000,000. You're allowed to pick one room, but you're not told what is in it just yet. After you make your choice, you're now told of one of the two remaining rooms that contains nothing. You're now allowed to either keep your original choice, or switch to the room that you don't know what it contains.

Here's a couple of examples:

Example 1
R1: nothing
R2: money
R3: nothing

You pick R3. Then you're told R1 doesn't have money. You can now stick with R3, or go with R2

Example 2
R1: money
R2: nothing
R3: nothing

You pick R1. Then you're told R2 doesn't have money. You can now stick with R1, or go with R3

Do you keep your choice, or switch? The real question is, on average is it more beneficial to keep your original choice, or switch? Why or why not?
This is fun.
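In the same spirit as the simulation above, the game show question can be played out in code before doing any math.  This is a minimal sketch, assuming the host always reveals an empty room other than the one you picked; ThreeRooms and CountWins are names made up here.

```csharp
using System;

static class ThreeRooms
{
    // Plays the game `rounds` times and returns how many times the player wins.
    public static int CountWins(int rounds, bool switchRooms, Random rand)
    {
        var wins = 0;
        for (var i = 0; i < rounds; i++)
        {
            var moneyRoom = rand.Next(3);
            var firstPick = rand.Next(3);

            // The host reveals an empty room that is not the player's pick.
            int revealed;
            do
            {
                revealed = rand.Next(3);
            } while (revealed == firstPick || revealed == moneyRoom);

            // Rooms are numbered 0, 1, 2, so the remaining room is 3 minus the other two.
            var finalPick = switchRooms ? 3 - firstPick - revealed : firstPick;
            if (finalPick == moneyRoom)
                wins++;
        }
        return wins;
    }

    static void Main()
    {
        var rand = new Random();
        const int rounds = 1000000;
        Console.WriteLine("Keep:   {0} wins", CountWins(rounds, false, rand));
        Console.WriteLine("Switch: {0} wins", CountWins(rounds, true, rand));
    }
}
```

Running it a few times shows which strategy comes out ahead, without spoiling the "why" part of the question.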

Wednesday, November 9, 2011

The Infinitely Configurable System

It's a subject that comes up from time to time in many places where I've worked.  Business users want something they can control, and they don't want to have to rely on developers for the business.  (Are we really that scary?)  I suppose their intentions make sense on a theoretical level.  But what about in practice?  Well, what's the difference between theory and practice?  In theory, there is no difference.  In practice, there is.

So in theory the business wants something they can configure.  They don't want to have to work with the developers for every simple little change.  But how far do they want to take that?  I've seen it be as simple as having some look-up values which business users can edit.  That's fair enough.  But many times I've seen it be something so horribly complex that the theory and the practice are just too distant from one another to ever be realized.

I'm talking about The Infinitely Configurable System.  When business users want a rules engine that they can completely configure to suit their ever-changing needs.  They want to design their forms, define how those forms interact, define their fields, define the requirements for the inputs and the format for the outputs, and so on and so on.  They want a framework that allows them to design the business.  After all, they're the business people.  Defining the business is what they do.  Why would they want to involve a developer when a framework can do the job just fine?

It's a nice dream.  But do they really know what they're asking?  (We live in an age of frameworks now more than ever before, and the subject seems to be coming up for me a lot more these days.  I don't know if that correlation implies causation.)  Does the business user, who is an expert in that particular business, know what it takes to design a framework?

First of all, let's take a look at what this means to the up-front requirements of the software.  If you want to define a rules engine, you need to know up front all of the rules that the engine is going to support.  Consider an example.  Let's say this engine comes with a form designer.  Users can create input fields, label them, arrange them, and save them to the database.  That's exactly what they wanted, so that's what they got.  They use this engine to create a couple of simple forms.

Some time later, after the system is in production, they decide that they really need that form designer to support the ability to further define how fields interact with one another.  Field B should be defined as being conditionally required depending on the value of Field A.  Or Field B should be able to change its potential input values or input type based on the value of Field A.  The list goes on and on, and gets big fast.

If these were normal forms created by a developer, then the rules could simply be codified.  The models would have the new business logic added, the UI would be updated, and so on.  The forms as the business sees them can fairly quickly be adjusted to suit the new rules.  But we, the developers, aren't working on the forms.  We're working on the framework used to create the forms.  Suddenly the effort just got a lot bigger.  There are a lot more edge cases to consider, there is a lot more testing to be done, and this work is going to take a lot longer.
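To make the difference concrete, here is a rough sketch of what "codifying" a conditionally required field might look like in a hand-written form model.  Everything here (IntakeForm, HasInsurance, PolicyNumber, Validate) is hypothetical and only stands in for "Field A" and "Field B".

```csharp
using System;
using System.Collections.Generic;

// Hypothetical form model; the field names are made up for illustration.
class IntakeForm
{
    public bool HasInsurance { get; set; }      // "Field A"
    public string PolicyNumber { get; set; }    // "Field B"

    // The new business rule, written directly in code:
    // Field B is required only when Field A is set.
    public IList<string> Validate()
    {
        var errors = new List<string>();
        if (HasInsurance && string.IsNullOrWhiteSpace(PolicyNumber))
            errors.Add("PolicyNumber is required when HasInsurance is checked.");
        return errors;
    }
}

static class FormDemo
{
    static void Main()
    {
        var form = new IntakeForm { HasInsurance = true };
        foreach (var error in form.Validate())
            Console.WriteLine(error);
    }
}
```

In a custom application that is one condition and one test.  In a form designer, the same request means building a general way for users to express any such condition between any two fields.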

Had the up-front requirements been complete for what the business is going to need, this wouldn't have been such a big change.  But ever-changing needs defining an ever-changing framework are difficult to implement.  Just look at the frameworks out there today.  They have a defined scope.  They know what they do, they define what they do, and they make no pretense about their purpose and their limits.  But is the business willing to accept limits based on its original definition of what its software is going to do?  Up-front requirements are never complete.  The only constant is change.  But by going the route of a framework instead of a custom application, change becomes far more costly.

Now let's take a look at what this means to the subject matter experts.  As an example, let's say we're defining this software for a health services company.  They have subject matter experts who know all there is to know about health services.  Doctors, nurses, pharmacists, care managers, etc.  These subject matter experts are prepared to define what health services software needs to do.  They have a vision of how they're going to use it to help their patients, manage their business, and in general reduce their operational costs.

But what do health services experts know about software frameworks?  Remember, we're not designing an application for health services.  We're designing an application which a non-developer can use to design an application for health services.  This software isn't about helping patients, managing a business, or reducing operational costs.  This software is about providing a means for the business to define these things.  This is a framework.  It's a rules engine.  It's a logic system, not a health services system.

Are those subject matter experts prepared to define requirements for a rules engine or a logic system?  Are they experts in software design?  Because that's what the business wants.  They want something they can use to design their software.  This is going to cause a pretty significant disconnect between the business and the developers.

The developers, in this model, don't need to know anything about the business.  After all, the entire purpose behind this was that the developers should be removable from the equation, leaving the business users to maintain the application.  There's no domain specific language here.  The business users are prepared to talk about health services, but the developers are prepared to talk about rules engines.  If they're not on the same page and not talking about the same thing, don't expect to get very good requirements out of it.  Without any knowledge of rules engines, the subject matter experts are going to define very poor requirements.  Without any knowledge of health services, the developers are going to deliver a very poor product.

Then, even after all of the requirements have been defined and the design has been approved and all the work has been finished on the application... It's still not done.  Not even close.  Is the business expecting a health services application at this point?  I sure hope not.  They wanted a framework, so they're getting a framework.  Now it's their job to define their business and create their application.  That sounds like it's going to be a pretty difficult pill to swallow.  All of the money that was spent, all of the time that was taken, and when the product is finally delivered to the business it's not ready to be used.  Now it's time for the business to design their actual software.

Finally, what will be the recourse for the business when something goes wrong?  Do they submit a bug to the developers?  Not so fast, buddy.  That all depends on what is wrong with the system.  Is it a defect in the rules engine, or is it a defect in the logical rules that the business created within the engine?

If it's a defect in the framework then that's fine.  The developers will update their tests, update the software, and deliver it back to the business with the defect fixed.  Of course, now the business needs to start adjusting their use of the framework.  The defect is fixed, but their application might not be.

But what if it's a defect in the rules defined by the business?  That's not the developers' problem.  It wasn't part of the requirements and has nothing to do with the product that the developers have built.  It's the business' problem.  The subject matter experts need to analyze their logic in their rules engine and fix it.  The business users need to debug their software.  Again, that sounds like a really tough pill to swallow.

I've been in this situation enough times to know that they won't swallow that pill.  They'll push it back to the developers to figure out the root cause of the problem and fix it.  After all, the business users aren't software developers.  It's not their job to design and maintain this stuff.  Even though the original intent of the project was for the business users to be able to design and maintain this stuff.

This leaves us in a situation where tons of effort has been put into creating a user-definable framework which isn't being defined by the users.  More cost went into creating it, more cost goes into maintaining it (because it's a lot more complicated than it needs to be), and developer morale has taken a serious hit because they're caught in the middle of this nonsense.  All of this could have been avoided by just writing a custom application that does what the business needs it to do.

Don't chase that holy grail.  The infinitely configurable system is a dream.  Let the business users stick to the business processes that they know and love, and let the developers stick to writing simple and maintainable software.

Friday, November 4, 2011

Clean Code

I've been reading Clean Code lately and have been finding it a bit inspiring.  It's interesting in that it's a fairly quick read, but at the same time it's taking me a while to get through it.  This is primarily because each new subject prompts me to think about my own work and my own development practices.  Many times I'll learn something new (well, not new per se, but a simpler way of looking at something that I always knew but never quite articulated); other times I'll find validation of something I've been trying to instill in other developers.  With each passing section I'll stop, put the book down for a moment, think about what it means, and determine my own opinion of related material.  Many times I'll find that opinion stated in the following paragraph (though far more eloquently than I've been able to as of yet).

The book is even laid out in the manner of clean code.  Short chapters sticking to specific topics, broken up into very small sections each of which maintains a discrete piece of that topic.  The chapters and sections of the book are like classes and functions in code, each maintaining its specific responsibility and elegantly fitting into the larger picture to form a complex topic.  It's a complex thing made of many simple things.

In any event, I figured I'd run through a quick exercise to apply these techniques.  To see where I stand as a clean coder, as it were.  So I jumped on a short example of some code online.  Namely, the C# implementation of the Dining Philosophers Problem at Rosetta Code.  The working implementation, in its entirety, is here:

using System;
using System.Collections.Generic;
using System.Text;
using System.Threading;
 
namespace DiningPhilosophers
{
  enum DinerState { Eat, Get, Pon }
 
  class Fork                    //       Using an object for locking purposes
  {                             //                         instead of an int.
      public int holder = -1;   //    initialized without a holder (on table)
  }
 
  class Diner
  {
    private static int        instanceCount;
    public         Fork       left;
    public         Fork       right;
    public         int        id;
    public         DinerState state             = new DinerState();
    const          int        maxWaitMs         = 100;
    public         bool       end               = false;
 
    public Diner()
    {
             id     = instanceCount++;
             left   = Program.forks[id];
             right  = Program.forks[(id + 1) % Program.dinerCount];
             state  = DinerState.Get;
      Thread thread = new Thread(new ThreadStart(doStuff));
      thread.Start();             // start instance's thread to doStuff()
    }
 
    public void doStuff()
    {
      do
      {
        if (state == DinerState.Get)
        {
          bool lockedL = false;
          Monitor.TryEnter(left, ref lockedL);      //     try lock L
 
          if (lockedL)                              //  got left fork
          {
            left.holder = id;           //  left fork holder = this
            bool lockedR = false;
            Monitor.TryEnter(right, ref lockedR);     // try lock R
 
            if (lockedR)                //    lock R succeeded too.
            {                           //           got both forks
              right.holder = id;      //      list this as holder
              state = DinerState.Eat; //                     Eat.
              Thread.Sleep(Program.rand.Next(maxWaitMs));
              right.holder = -1;      // no right fork holder now
              Monitor.Exit(right);    //                 unlock R
              left.holder = -1;       //  no left fork holder now
              Monitor.Exit(left);     //                 unlock L
              state = DinerState.Pon; //                  Ponder.
              Thread.Sleep(Program.rand.Next(maxWaitMs));
            }
            else       // got left, but not right, so put left down
            {
              left.holder = -1;       //            no holder now
              System.Threading.Monitor.Exit(left);    // unlock L
              Thread.Sleep(Program.rand.Next(maxWaitMs));
            }
          }
          else           //                 could not get either fork
          {              //                                wait a bit
              Thread.Sleep(Program.rand.Next(maxWaitMs));
          }
        }
        else               //                   state == DinerState.Pon
        {                  //      done pondering, go back to get forks
          state = DinerState.Get;         //      trying to get forks
        }
      } while (!end);
    } 
  }
 
  class Program
  {
    public const  int         dinerCount = 5;
           const  int         runSeconds = 60;
    public static List<Diner> diners     = new List<Diner>();
    public static List<Fork>  forks      = new List<Fork>();
    public static Random      rand       = new Random();
 
    static void Main(string[] args)
    {
      for (int i = 0; i < dinerCount; i++) forks.Add(new Fork());
      for (int i = 0; i < dinerCount; i++) diners.Add(new Diner());
      DateTime endTime = DateTime.Now + new TimeSpan(0,0,runSeconds);
      Console.Write("|");                               //   write header
      foreach (Diner d in diners) Console.Write("D " + d.id + "|");
      Console.Write("    |");
      for (int i = 0; i < dinerCount; i++) Console.Write("F" + i + "|");
      Console.WriteLine();
 
      do
      {                                                 // display status
        Console.Write("|");
        foreach (Diner d in diners) Console.Write(d.state + "|");
        Console.Write("    |");
        foreach (Fork f in forks) Console.Write(
            (f.holder < 0 ? "  " : "D" + f.holder) + "|");
        Console.WriteLine();
        Thread.Sleep(1000);                           //   milliseconds
      } while (DateTime.Now < endTime);
 
      foreach (Diner d in diners) d.end = true;         // signal to quit
    }
  }
}

It effectively presents a solution to the given problem, and it does so in a reasonably small web-display-friendly manner.  But it seems kind of... messy.  Let's take a look at the problems found here:
  1. Functions are long.  Too long.
  2. Too much nesting.  Low-level concerns are polluting higher-level concerns.
  3. The comments are formatted any which way, and are small enough and obvious enough that they shouldn't even be needed if the code could just be a little more expressive.
  4. Function names and variable names are ineffective.  doStuff()?  Really?
  5. Classes inter-depend in odd ways.  Lots of tight coupling.
  6. Classes and functions have multiple responsibilities.
There may be more, but that's certainly enough to merit a good cleaning.  So I set about refactoring, but with a specific goal in mind... The resulting output should exactly match the current output.  This is often a very important goal in refactoring code, keeping the efforts transparent to the end user.  After all, the program does what it's supposed to do.  From a business perspective, it shouldn't change.  But from a code perspective, it certainly needs to be cleaned.

It took a couple hours of my time, but the process was actually a lot of fun.  I intentionally didn't document each step along the way, though.  I'd like to, and I will.  But first I need to verify my work.  I intend to do this by setting it aside for a while and then going back to the original and attempting the process again.  Then I'll compare my two results and see how consistent I was in the process.  Depending on the results of that, I may attempt it a third time.

Eventually, I'll be happy with the results and will step through it in a more documented and tracked way, isolating individual improvements to the code and describing what they mean to the overall process.  This will end up in the form of a slide deck and several code samples, perhaps also a screencast, which I intend to share here at a later time.  But for now at least I can present my initial output of refactoring and cleaning the code.  Let's take a look at each class...

using System;
using System.Threading;

namespace DiningPhilosophers
{
  class Program
  {
    private const int PHILOSOPHER_COUNT = 5;
    private const int SECONDS_TO_RUN = 60;

    private static Random rand;
    private static DateTime endTime;

    static void Main(string[] args)
    {
      InitializeObjects();
      DisplayFormatter.WriteOutputHeader();
      do
      {
        DisplayFormatter.WriteStatus();
        Thread.Sleep(1000);
      } while (DateTime.Now < endTime);
      Terminate();
    }

    private static void InitializeObjects()
    {
      rand = new Random();

      for (var i = 0; i < PHILOSOPHER_COUNT; i++)
        Fork.KnownForks.Add(new Fork());
            
      for (var i = 0; i < PHILOSOPHER_COUNT; i++)
        Philosopher.KnownPhilosophers.Add(
            new Philosopher(
                id: i,
                randomizer: rand,
                leftFork: Fork.KnownForks[i],
                rightFork: Fork.KnownForks[(i + 1) % PHILOSOPHER_COUNT]));
            
      endTime = DateTime.Now.AddSeconds(SECONDS_TO_RUN);
    }

    private static void Terminate()
    {
      foreach (var philosopher in Philosopher.KnownPhilosophers)
        philosopher.Dispose();
    }
  }
}

The Program class is just maintaining the program.  It initializes the global data, determines when to write output elements, and ends the program.  Simple.  Part of cleaning it up included moving the formatting of the output to a separate class to maintain that responsibility...

using System;

namespace DiningPhilosophers
{
  public static class DisplayFormatter
  {
    private const string PHILOSOPHER = "D";
    private const string FORK = "F";
    private const string ITEM_BORDER = "|";
    private const string LIST_DIVIDER = "    ";

    private const string PHILOSOPHER_HEADER_FORMAT = "{0} {1}";
    private const string FORK_HEADER_FORMAT = "{0}{1}";
    private const string PHILOSOPHER_ITEM_FORMAT = "{0}";
    private const string FORK_ITEM_FORMAT = "{0}{1}";

    public static void WriteOutputHeader()
    {
      Console.Write(ITEM_BORDER);

      foreach (var philosopher in Philosopher.KnownPhilosophers)
      {
        Console.Write(string.Format(PHILOSOPHER_HEADER_FORMAT, PHILOSOPHER, philosopher.ID));
        Console.Write(ITEM_BORDER);
      }

      Console.Write(LIST_DIVIDER + ITEM_BORDER);

      for (var i = 0; i < Philosopher.KnownPhilosophers.Count; i++)
      {
        Console.Write(string.Format(FORK_HEADER_FORMAT, FORK, i));
        Console.Write(ITEM_BORDER);
      }

      Console.WriteLine();
    }

    public static void WriteStatus()
    {
      Console.Write(ITEM_BORDER);

      foreach (var philosopher in Philosopher.KnownPhilosophers)
        WritePhilosopherStatus(philosopher);

      Console.Write(LIST_DIVIDER + ITEM_BORDER);

      foreach (var fork in Fork.KnownForks)
        WriteForkStatus(fork);

      Console.WriteLine();
    }

    private static void WritePhilosopherStatus(Philosopher philosopher)
    {
      var state = FormatDinerStateForDisplay(philosopher.State);
      Console.Write(string.Format(PHILOSOPHER_ITEM_FORMAT, state));
      Console.Write(ITEM_BORDER);
    }

    private static void WriteForkStatus(Fork fork)
    {
      var state = fork.IsHeld ?
                  string.Format(FORK_ITEM_FORMAT, PHILOSOPHER, fork.Holder) :
                  string.Format(FORK_ITEM_FORMAT, " ", " ");
      Console.Write(state);
      Console.Write(ITEM_BORDER);
    }

    private static string FormatDinerStateForDisplay(Philosopher.DinerState state)
    {
      switch (state)
      {
        case Philosopher.DinerState.Eating:
          return "Eat";
        case Philosopher.DinerState.GettingForks:
          return "Get";
        case Philosopher.DinerState.Pondering:
          return "Pon";
      }
      return string.Empty;
    }
  }
}

The DisplayFormatter class is responsible for maintaining the output format.  It primarily just writes the output header and each output line, following formatting specifications in its internal constants.  This and the previous Program class make up the application part of the whole system.  We also have two models within the system: Forks and Philosophers...

using System.Collections.Generic;

namespace DiningPhilosophers
{
  public class Fork
  {
    private static List<Fork> _knownForks;
    public static IList<Fork> KnownForks
    {
      get
      {
        if (_knownForks == null)
          _knownForks = new List<Fork>();
        return _knownForks;
      }
    }

    private const int NOT_HELD = -1;

    public int Holder { get; private set; }

    public bool IsHeld
    {
      get
      {
        return (Holder != NOT_HELD);
      }
    }

    public Fork()
    {
      Holder = NOT_HELD;
    }

    public void TakeFork(int holder)
    {
      Holder = holder;
    }

    public void DropFork()
    {
      Holder = NOT_HELD;
    }
  }
}

The fork is simple enough (as it should be, it's not a complex real-world concept).  It maintains the state of the fork, providing information on who is holding it as well as methods to pick it up and drop it.

using System;
using System.Collections.Generic;
using System.Threading;

namespace DiningPhilosophers
{
  public class Philosopher : IDisposable
  {
    private static List<Philosopher> _knownPhilosophers;
    public static IList<Philosopher> KnownPhilosophers
    {
      get
      {
        if (_knownPhilosophers == null)
          _knownPhilosophers = new List<Philosopher>();
        return _knownPhilosophers;
      }
    }

    public int ID { get; private set; }

    public DinerState State { get; private set; }

    private const int MAX_WAIT_TIME = 100;

    private Fork _leftFork;
    private Fork _rightFork;
    private bool _currentlyHoldingLeftFork;
    private bool _currentlyHoldingRightFork;
    private bool _doneEating;
    private Random _randomizer;

    private Philosopher()
    {
      SetDefaultValues();
    }

    public Philosopher(int id, Random randomizer, Fork leftFork, Fork rightFork) : this()
    {
      ID = id;
      _randomizer = randomizer;
      _leftFork = leftFork;
      _rightFork = rightFork;
      Dine();   // start the thread only after the forks and randomizer are assigned
    }

    private void SetDefaultValues()
    {
      _currentlyHoldingLeftFork = false;
      _currentlyHoldingRightFork = false;
      _doneEating = false;
      State = DinerState.GettingForks;
    }

    private void Dine()
    {
      var thread = new Thread(new ThreadStart(TryToEat));
      thread.Start();
    }

    private void TryToEat()
    {
      do
      {
        if (State == DinerState.GettingForks)
        {
          GetLeftFork();

          if (_currentlyHoldingLeftFork)
          {
            GetRightFork();

            if (_currentlyHoldingRightFork)
            {
              Eat();
              DropRightFork();
              DropLeftFork();
              Ponder();
            }
            else
            {
              DropLeftFork();
              Wait();
            }
          }
          else
          {
            Wait();
          }
        }
        else
        {
          State = DinerState.GettingForks;
        }
      } while (!_doneEating);
    }

    public void Dispose()
    {
      _doneEating = true;
    }

    private void GetLeftFork()
    {
      Monitor.TryEnter(_leftFork, ref _currentlyHoldingLeftFork);
      if (_currentlyHoldingLeftFork)
        _leftFork.TakeFork(ID);
    }

    private void GetRightFork()
    {
      Monitor.TryEnter(_rightFork, ref _currentlyHoldingRightFork);
      if (_currentlyHoldingRightFork)
        _rightFork.TakeFork(ID);
    }

    private void DropLeftFork()
    {
      _leftFork.DropFork();
      _currentlyHoldingLeftFork = false;
      Monitor.Exit(_leftFork);
    }

    private void DropRightFork()
    {
      _rightFork.DropFork();
      _currentlyHoldingRightFork = false;
      Monitor.Exit(_rightFork);
    }

    private void Eat()
    {
      State = DinerState.Eating;
      Wait();
    }

    private void Ponder()
    {
      State = DinerState.Pondering;
      Wait();
    }

    private void Wait()
    {
      Thread.Sleep(_randomizer.Next(MAX_WAIT_TIME));
    }

    public enum DinerState
    {
      Eating,
      GettingForks,
      Pondering
    }
  }
}

The Philosopher is where the bulk of the algorithm logic resides, naturally.  After all, it's the philosophers themselves who are trying to eat their meals.  So the decision-making process of what to do with forks and food exists in here.

This implementation probably isn't as clean as it could be.  (But then, what is?)  Looking at it now I see a few things I don't entirely like and may want to refactor:
  1. The Fork and Philosopher classes also contain their globals.  Those should probably be moved out into a class which manages the globals and nothing else.
  2. The TryToEat() method still has an uncomfortable amount of nesting.  It's the main algorithm of the whole thing, so one can expect it to be the most complex unit in the implementation.  But I'd like to re-think how I can write this.  It currently follows the "happy path" pattern of nesting, and I'd like to find a way to invert that.
  3. Forks know who their holders are.  That doesn't really model the real-life concept of a fork.  A fork doesn't know who is holding it; it doesn't know anything.  In a real-world scenario, the philosophers would instead know who their neighbors are and be able to perceive by looking at their neighbors whether their forks are in use.  So I'd like to move fork state management out of the Fork class and into the Philosopher class.
  4. Should the Program class initialize global state of the application?  Or should that be moved out into an initializer class of some sort?  I didn't think of it while writing the code.  But any time I find myself using the word "and" while describing a class's responsibilities, I feel like I may be violating Single Responsibility.
Again, I'm sure there's more.  And I welcome the input on what that may be.
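
For item 2, here's a rough sketch of what inverting the nesting might look like.  It isn't the code from this post: ForkAttempt and TryEatOnce are hypothetical names, the forks are plain lock objects, and the meal is passed in as a delegate so the example stands on its own.  Each failed attempt returns early, and the finally block releases whatever was actually taken.

```csharp
using System;
using System.Threading;

public static class ForkAttempt
{
  // Hypothetical helper: one attempt at a meal, written with guard clauses.
  // Returns true if both forks were acquired and the meal was eaten.
  public static bool TryEatOnce(object leftFork, object rightFork, Action eat)
  {
    bool holdingLeft = false;
    bool holdingRight = false;

    try
    {
      // Bail out immediately if the left fork is taken
      Monitor.TryEnter(leftFork, ref holdingLeft);
      if (!holdingLeft)
        return false;

      // Bail out if the right fork is taken; finally drops the left one
      Monitor.TryEnter(rightFork, ref holdingRight);
      if (!holdingRight)
        return false;

      eat();
      return true;
    }
    finally
    {
      // Release in reverse order, and only what this attempt holds
      if (holdingRight)
        Monitor.Exit(rightFork);
      if (holdingLeft)
        Monitor.Exit(leftFork);
    }
  }
}
```

With something like this in place, the body of the dining loop could shrink to a single branch: ponder if TryEatOnce returned true, otherwise wait and try again.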

It's interesting that the simple act of writing about the code has caused me to critically look at the code even further than when I was writing it.  At the start of this post I had figured that this would be a good initial implementation, and now I'm not so sure.

But then this is the nature of any creative work, is it not?  I said to a colleague yesterday that an ongoing theme in my career for the past couple of years has been that any code I wrote at least three months ago is crap.  This means that anything I write today will be crap three months from now.  This is a good problem to have.  It means that my cycle of learning new things and improving my skills is fairly short and constantly moving.  Now I'm finding that code I wrote yesterday isn't up to par.  Again, it's a good problem to have.

I'm still going to continue with the plan I'd laid out earlier in this post.  But it seems now that I have some more refactoring to do before I start on the second attempt.  It seems also that I may be refining this for a while before I find it to be presentation-ready.  Of course, I also run the risk of an endless loop whereby the time it takes to create the presentation and the act of creating it combine to form a result of not being happy with what I'm presenting.  So this will also be an opportunity to find that professional "good enough" sweet spot.  At the very least, this will be good practice on several levels.

Thursday, November 3, 2011

Validation Shouldn't Prevent Saving

I know, I've covered this, right?  But Martin Fowler just tweeted a link to something he wrote a while back and I think it adds some important context to the overall idea.  Specifically, the last paragraph of what he wrote:
"In About Face Alan Cooper advocated that we shouldn't let our ideas of valid states prevent a user from entering (and saving) incomplete information. I was reminded by this a few days ago when reading a draft of a book that Jimmy Nilsson is working on. He stated a principle that you should always be able to save an object, even if it has errors in it. While I'm not convinced that this should be an absolute rule, I do think people tend to prevent saving more than they ought. Thinking about the context for validation may help prevent that."
I'm reminded of a concept they had at a previous job of mine.  Their overall implementation wasn't good at all, but the business goal toward which they were striving was certainly a good one.  Data should always be saved.  The idea was that validity of an object depended heavily on state.  The system they had was mainly data entry.  There was some workflow to move records from group to group, department to department, all depending on state.

But in any given department there were very few input restrictions.  Users could always save the data, even if it didn't make a whole lot of sense.  Validation occurred in the workflow when the data was to be presented to another group/department.  At that point, it needed to be valid for the target users.  That is, the person who ships it off to another group needs to have completed everything for which they are responsible.  But when the data was within a single point on the entire workflow, it was very free-form and transient.  It was a scratch pad, able to be saved and returned to at a later time.

FogBugz takes a very similar approach to user input as well.  They don't put a lot of restrictions on user input.  If a user has noticed a bug, it needs to be entered into the system.  Maybe they can't accurately describe the steps to reproduce right now, maybe they don't entirely know what "module" they're in.  Those are details, and they can be added later.  But there are no barriers to simply starting a ticket.  Whatever information the user has right now is good enough.  Let them save.  It may not be "valid" enough for a developer to receive it as a work item, but it's always "valid" enough to be recorded in the system.

Users don't like barriers.  They end up either working around them or giving up.  In the former case, you have unknown things happening in your system.  In the latter case, you're losing data.  Neither is a very appealing outcome.  Reducing those barriers is important.  It makes things a little more complex in the sense that pure rigidity of validation isn't quite so black and white, but state-driven validation isn't difficult to implement either.

So yes, my previous model was overly rigid.  It served to demonstrate a point, that's all.  But adding state-driven validation within the object is just as easy, really.  Objects can have statuses, methods to move them from one status to another, etc.  Consider this model:

public class Band
{
  private string _name;
  public string Name
  {
    get
    {
      return _name;
    }
    set
    {
      // Perform state-driven validation
      _name = value;
    }
  }

  private string _genre;
  public string Genre
  {
    get
    {
      return _genre;
    }
    set
    {
      // Perform state-driven validation
      _genre = value;
    }
  }

  private Status _currentStatus;
  public Status CurrentStatus
  {
    get
    {
      return _currentStatus;
    }
  }

  private Band()
  {
    // Set the default status
  }

  public Band(string name, string genre) : this()
  {
    Name = name;
    Genre = genre;
  }

  public void MoveToNextStatus()
  {
    // Based on various states and the current status,
    // validate the object with custom logic and update
    // the _currentStatus to the next business state.
  }
}

As with any domain model, it's just business logic.  There is a business concept of a scratch pad and the validation of the model implements that concept.  More rigorous validation is then performed when an action is performed on the model, such as moving it to the next business status.  There could be various private methods to handle this internal validation, and the property setters just do some basic validation.  For example, maybe one department can't set certain fields.  Then the setters would check to see if the current status is in that department and throw an exception accordingly.
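
To make that a little more concrete, here's a minimal sketch with the placeholders filled in.  Everything specific in it is an assumption for illustration: the two statuses (Draft and Submitted), the rule that a name is required before moving on, and the rule that the name is locked once the record leaves Draft.  A real business would supply its own statuses and rules.

```csharp
using System;

// Hypothetical statuses; a real workflow would define its own
public enum Status
{
  Draft,
  Submitted
}

public class Band
{
  private string _name;
  public string Name
  {
    get { return _name; }
    set
    {
      // Assumed rule: the name can't change once the record leaves Draft
      if (CurrentStatus != Status.Draft)
        throw new InvalidOperationException("Name can only be changed in Draft.");
      _name = value;
    }
  }

  // No restrictions here: anything can be saved at any time
  public string Genre { get; set; }

  public Status CurrentStatus { get; private set; }

  public Band(string name, string genre)
  {
    CurrentStatus = Status.Draft;
    Name = name;
    Genre = genre;
  }

  public void MoveToNextStatus()
  {
    if (CurrentStatus != Status.Draft)
      throw new InvalidOperationException("Already at the final status.");

    // The stricter validation only happens at the hand-off
    if (string.IsNullOrWhiteSpace(Name))
      throw new InvalidOperationException("A name is required before submitting.");

    CurrentStatus = Status.Submitted;
  }
}
```

An incomplete Band (even one with a null name) can be created and held onto as a scratch pad.  It only has to be complete at the moment MoveToNextStatus() is called.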

It sounds like it's a little more complicated, but honestly it makes things a lot easier from a business perspective.  Remember, writing the software is the easy part.  Getting the business to actually define what it wants, so that the software can be validated against that definition... That's the hard part.  This kind of approach, in my experience, makes it a hell of a lot easier for the business.  Many times I've sat down with a business trying to define and model their concepts and objects, and when I ask deceptively simple questions such as "What fields are required?" we invariably end up in multi-hour discussions with several other business users and subject matter experts to try to find a balance between everyone's interpretation of that business concept.  And ultimately that one definition of what fields are required ends up causing a headache for another user somewhere.

Driving the validation by the workflow is just a more natural approach for many businesses.  And supporting that in the software makes for a more natural implementation of the business.

New England Code Camp 16

This past Saturday was another local software conference.  (I'll admit, I'm hooked on them.)  This time I headed back over to the Microsoft office in Waltham for New England Code Camp 16.  (Side note: I overheard someone saying that Microsoft will be pulling out of the Waltham location and running everything in the area from their NERD Center in Cambridge and another campus they have nearby.)

All in all a good event, as always.  This time my employer was a sponsor and we had more of a presence there, which is cool.  I hope we can step it up and have a proper table with swag and recruiting and all that next time.  One of our team members was also giving a presentation so it was good for us to show support.  I didn't end up going to his, though, because there was other interesting stuff for me to attend and learn something.

So, as per my usual style here, let's take a look at the sessions I attended...

9:10 - "Easy Async With .NET 4.5" with John Bowen

Not bad.  The guy definitely needs to work on his public speaking.  (And his public shutting up... He'd go off on tangents and just get lost for a while.)  He clearly knew the material, but had a bit of trouble explaining it.  Cool stuff though, and I'm seriously looking forward to using async and await on a regular basis.  If I could give this guy one suggestion it would be to have more concrete examples.  After all, this is Code Camp.  A slide deck doesn't cut it here :)

10:30 - "ASP.NET MVC 3 Introduction, Part 1" with Brock Allen from DevelopMentor

Very engaging speaker, I liked this guy a lot.  The material in this part of the presentation was mostly review for me, but there were certainly little nooks and crannies of the material that were new to me.  And it's always good to re-sync with others on material I already know, just to see how well my understanding of it jibes with the rest of the industry.

My favorite part was that I very much agree with how this guy develops.  At one point someone asked him about some of the widgets from WebForms and what their counterparts were here and he tried to explain that there aren't any.  Things like the login control and registration wizard and stuff like that.  He said things like, "It's really not hard to just write a couple of views and controller actions, so why not do that?  Sure, you could look for some widget that does all the work for you, but you lose the more fine-grained control over what you're doing.  And one of the biggest benefits of MVC views is that you have that much more control over the resulting markup."  It's always nice to see an industry partner who thinks like I do, because all too often I'm immersed in the Kool-Aid that I refuse to drink.

One interesting thing happened during this presentation.  Someone started asking about ViewState, and the presenter explained that it's not there anymore.  This apparently shocked and dismayed people.  (That fact, in turn, somewhat dismayed me.)  He explained that you can still use "ViewState" in the sense that you can still serialize a dictionary, base-64 encode it, and store it in a hidden field.  The fact that many people in the audience didn't understand what he meant was also a little disconcerting.

But then a frightening question was asked... "Without ViewState, where should I store sensitive information like Social Security Numbers?"  My hand shot up in the air to respond.  The presenter handled the question well, explaining that sensitive information should be avoided and statefulness should be avoided and all that good stuff.  But I still had a point that needed to be driven into the asker.  When I was called upon, I tried to explain to the asker that ViewState is completely open and readable by the user and should never be used to store any sensitive information.  Ever.  But I don't think he understood.  Sometimes I'm ashamed of my industry, I really am.
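
For anyone who didn't follow the presenter's point, here's a small sketch of the idea.  It is not the real WebForms ViewState format; HiddenFieldState is a hypothetical helper that flattens a dictionary into key=value pairs and base-64 encodes the result, which is roughly the do-it-yourself version he described.  The important part is Decode: it needs no key and no secret, so anyone who can view the page source can run the equivalent.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;

public static class HiddenFieldState
{
  // Flatten the dictionary and base-64 encode it for a hidden field.
  // This is encoding, not encryption.
  public static string Encode(IDictionary<string, string> state)
  {
    var pairs = state.Select(kv =>
      Uri.EscapeDataString(kv.Key) + "=" + Uri.EscapeDataString(kv.Value));
    var raw = string.Join("&", pairs);
    return Convert.ToBase64String(Encoding.UTF8.GetBytes(raw));
  }

  // Reverse the process.  Nothing here requires a secret, which is
  // exactly why sensitive values don't belong in the field.
  public static Dictionary<string, string> Decode(string encoded)
  {
    var raw = Encoding.UTF8.GetString(Convert.FromBase64String(encoded));
    return raw
      .Split(new[] { '&' }, StringSplitOptions.RemoveEmptyEntries)
      .Select(pair => pair.Split('='))
      .ToDictionary(
        parts => Uri.UnescapeDataString(parts[0]),
        parts => Uri.UnescapeDataString(parts[1]));
  }
}
```

Put a Social Security Number in that dictionary and the hidden field looks like gibberish, but a single call to Decode hands it right back.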

11:50 - "ASP.NET MVC 3 Introduction, Part 2" (same presenter)

This is where things got a lot more involved with the stuff specific to version 3.  There was a lot to take from this.  (Both talks were very fast-paced and involved lots of jumping around in Visual Studio.  The guy said that these are usually longer talks that he had to condense.)  It was mostly about the various tooling that's available to help with things, including Data Annotations :)

You know me, I have sort of a love-hate relationship with frameworks and tooling.  But this guy kept it together really well even for me.  Use the tooling where it's appropriate, don't rely on it where it's dangerous.  In fact, he even made it a point to use Data Annotations on his input models and not his domain models.  He had the domain models in a separate project and in his MVC project just had view models and input models.  (I bet he's a fan of FubuMVC.)

Honestly, what I liked most about this presenter was his approach to coding.  Clean code all the way.  He never left anything messy or unfinished.  No placeholders, no TODO statements, none of that.  He kept it clean and concise and to the point.  If something worked but needed to be cleaned up a little, he took a moment to clean it up.  Much respect, Brock.  Much respect.

1:00 - Lunch (pizza), provided by Telerik

1:30 - "Objective-C for C# and Other .NET Developers" with Chris Pels

I'm definitely interested in the material, and clearly so is the speaker.  But the presentation itself was kind of a bust.  His screen saver kept triggering, he kept having to fiddle with the MacOS environment to get things to work, etc.  Basically, he didn't seem to know his way around the presentation tools.  That kind of thing will kill a presentation, man.  If you have to stop presenting for a few minutes in order to fix something, you lose the audience.

I think the focus of the talk was lost.  He went into some Objective-C and general iOS development in Xcode, but he didn't really tailor it to the audience.  We were expecting something from the perspective of people who live and breathe within Visual Studio.  This wasn't it.  He'd occasionally try to tie things back to C# jargon, but it was kind of forced and ineffective.

And he didn't handle questions well.  As you'd expect, some people were asking about how to leverage the .NET code they already have and how much of it would need to be re-written for iOS devices.  He couldn't really say much more than "look into Mono."  Something more apt may have been along the lines of...

"Well, that depends a lot on how the existing .NET code is architected.  In an ideal situation, and if you can make use of MonoTouch, then you should only have to re-write the UI portions.  The views still need to be created in Xcode just like any other native iOS development, but all of the code that drives them (models and controllers, mainly) can be in .NET through MonoTouch.  Just as .NET is a framework layer sitting on top of the Win32 API (and now the WinRT API as well), MonoTouch is a .NET framework layer sitting on top of the iOS API.  Just like there are some differences between Win32 and WinRT, expect that there are also differences between Win32 and iOS.  Some things are available in one but not the other.  So you may have to adjust some of your code to be truly cross-platform, maybe abstract out some of the platform-specific stuff into services that you can swap between them.  But as long as the code is separated out away from the views, then you should be able to leverage much of what you have."

Was that so hard?  I just made that up on the spot.  I know, public speaking is hard.  But you've got to stay focused and give clear and engaging responses or you're going to lose the audience.  I doubt anybody in that room walked away excited about doing iOS development as a .NET developer.  Much of the murmuring seemed to indicate that they just won't bother and will wait until Microsoft saves the day with its own tablet.

2:50 - "Introduction to Windows Identity Foundation" with Brock Allen (same presenter as the MVC stuff earlier)

Definitely comprehensive, and as you know I like this speaker.  The material was a little tough to follow, though.  I haven't done a lot of authentication stuff myself, so he was moving across subject matter that was a little foreign to me in some cases.  But he kept it within reach and I'm definitely continuing to learn about it.

What I'd really like to see is a more complete end-to-end implementation, like he had with the MVC 3 talks he gave earlier.  This one involved more slides and less code.  He did tie it together very well at the end, which was great.  As before, very fast-paced and a lot to cover.  I'm sure that someone as familiar with the material as I was with MVC enjoyed this talk as much as I did the earlier one.

The juiciest nugget was when he told us to go to http://www.leastprivilege.com/ to learn more, and claimed that there's a free eBook on Windows Identity Foundation there somewhere which he highly recommends.  I haven't found it yet, but I'll look again later.

4:10 - I took a break during this session.  My eyes hurt from projectors and monitors.  I'm getting old.

5:30 - ".NET and MongoDB: A Code First Introduction to NoSQL" with John Zablocki

By this time of the evening it was very dark outside and the snow was really coming down, so a lot of people had left already and those of us still there were finding ourselves in a much more relaxed atmosphere.  So I think that contributed to the fact that this talk didn't stay on topic very effectively.

It was a great introduction to MongoDB, and covered some stuff I didn't already know about the database, which was great.  But we spent way too much time mucking around in the Mongo shell and hardly even looked at any .NET code.  Seriously, what's "code first" about mucking around in the database and then writing a little code to use it?

The guy's got NoSQL skills, no doubt about that.  But this audience isn't looking for a young ninja to show them something cool.  These are mostly Microsoft Kool-Aid drinking developers in their 30s and 40s who want to know how this can help them with the work they already do.  It's an older crowd and teaching them new tricks requires a little understanding of how they currently do things.  (And I'm no exception to that, though I like to think I'm further away from them on the overall scale.)

Basically, the guy got really side-tracked by his intro to MongoDB and barely even touched the agenda he set forth for the presentation.  There was almost no .NET code.

In Conclusion

Looking back at what I've written, I sound very critical of the presenters.  I guess that's just how I approach these things.  But don't get the wrong message.  This event and other events like it are fantastic.  I learned a lot, I networked with colleagues, I advanced my career ever so slightly.

In fact, I'm inspired to put together some presentations of my own.  I'd been toying with the idea, but couldn't think of something that would be worth everyone else's time.  The problem, as it turns out, was that I was trying to think of talks that I could give for my colleagues.  But that's just it... I took this job because my new colleagues are really good at this stuff.  I'm here to learn from them.  And I love it.

Sitting in the room during the MVC 3 talks cleared my head a little bit.  Not everyone is surrounded by the developers who surround me.  There are a lot of developers out there who just do the same stuff day in and day out.  They don't take for granted all of the things I do.  I can give a talk that introduces other developers to the concepts I take for granted in my development, and there's an audience for that.

I'm thinking I'd like to start with a simple introduction to dependency injection.  It's only an hour of time, I can certainly fill that.  I've come up with a basic agenda and have thrown together some code samples that I can walk through (or build on the fly, to engage the audience a little more).  I just need to remember to keep focus on the material.  Don't get lost on a tangent, don't focus on everything else being perfectly designed, etc.  Just put together a simple application and demonstrate the concepts of dependency injection.  That's all.

Another idea could be to take a look at MongoDB again and how it can be used in a .NET application.  Focus less on the internals of the database engine, there's plenty of material out there for that.  Just use it in a simple application to demonstrate how it would be used.

Any other ideas?  Man, I can't wait until the next Code Camp.