Friday, November 18, 2011

Math, Boys and Girls

Our CTO posted an interesting question on our Yammer feed yesterday:
In a mythical land, parents keep having children until a boy is born, and then they stop.
Without doing the math, does this result in more girls than boys on average, or more boys than girls?
Doing the math?  Man, I haven't "done the math" since school.  And there's a lot that I didn't retain, so I can't say for certain if I ever did this.  But one thing's for sure... I never approached math from this angle back in school.  You remember how it was, right?  There's a textbook, there are problems, you solve them, you're done.  Nothing really thought-provoking like this, at least for us non-math-majors.

So I couldn't really think of an equation or anything to approximate this.  Looking back at some of the responses now, it seems kind of obvious.  It's just not how I've approached problems in the past, so it's not how I think.  No, how I think is to write a program to simulate it:

using System;
using System.Collections.Generic;
using System.Threading.Tasks;

namespace RockysQuestion
{
    class Program
    {
        static void Main(string[] args)
        {
            while (true)
            {
                new Family();
                Console.WriteLine(
                    string.Format(
                        "Total Boys: {0} | Total Girls: {1}",
                        Globals.TotalBoys,
                        Globals.TotalGirls));
            }
        }
    }

    static class Globals
    {
        public static int TotalBoys { get; set; }
        public static int TotalGirls { get; set; }
    }

    class Family
    {
        private static Random _rand = new Random();  // NOTE: Random isn't thread-safe; sharing one across tasks can corrupt its state
        private Task _reproducing;
        
        public int Boys { get; set; }
        public int Girls { get; set; }

        public Family()
        {
            Boys = 0;
            Girls = 0;

            _reproducing = Task.Factory.StartNew(Reproduce);
        }

        private void Reproduce()
        {
            while (Boys == 0)
                HaveBaby();
            Globals.TotalBoys += Boys;
            Globals.TotalGirls += Girls;
        }

        private void HaveBaby()
        {
            if (_rand.Next(2) == 0)
                Boys++;
            else
                Girls++;
        }
    }
}

Watching the output, it would appear to be a 50/50 split with statistically acceptable error.  Note that I'm not 100% sure how thread-safe these lines are:

Globals.TotalBoys += Boys;
Globals.TotalGirls += Girls;

So until I do some more detailed analysis of the code, it's possible that some of the child counts may be lost.  (A += on a shared property is a read-modify-write operation, so two families updating at the same time can overwrite each other.)  But the numbers quickly become large enough that this effect likely becomes statistically insignificant.
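If I wanted to be safer about it, one option would be to expose the totals as fields and make each addition atomic with Interlocked (which lives in System.Threading).  This is just a sketch, since I haven't run it through the same eyeball-testing as the above:

static class Globals
{
    // Fields rather than properties, so they can be passed by ref.
    public static int TotalBoys;
    public static int TotalGirls;
}

// ...and in Family.Reproduce(), replace the += lines with:
Interlocked.Add(ref Globals.TotalBoys, Boys);
Interlocked.Add(ref Globals.TotalGirls, Girls);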

As I watch the numbers tick by, they seem to be neck-and-neck.  Sometimes the boys are ahead, sometimes the girls are ahead.  I'm sure I can make a better visualization of it, but I wanted to keep it as a simple console output for this post.  It looks like one gender can overtake the other by a significant margin for a significant amount of time, but if I re-start then it's just as likely that the other gender will do the same.  So testing seems to indicate that it's "close enough."

Of course, this is an over-simplification of the real-world implications.  Would a family with 4 girls continue to reproduce?  10 girls?  What if there are twins?  Does this account for the overall population of the species (mortality rates, etc.) or just the birth rate?  The question is just about the math, really.  It's not about the real-world scenario.  It is, after all, a mythical land.  I'm sure all of these families also live in perfect spheres resting on frictionless planes.

As the discussion went on, people naturally started to do the math.  After all, we're geeks.

Kevin presented a point indicating that it should be 50/50:
In 50% of the cases there will be one boy and no girls
In 25% of the cases there will be one boy and one girl
In 12.5% of the cases there will be one boy and two girls
In 6.25% of the cases there will be one boy and three girls
In 3.125% of the cases there will be one boy and four girls
Extending into infinity

Each iteration with more girls is less and less likely and offsets the 50% chance of having one boy and no girls. The case that they have ten girls before they have a boy has a probability of about 0.05%
He later continued:
Assuming that infinite couples are able to have kids forever until they have a boy, the ratio will be 1:1. Anything less than an infinite number of couples being able to keep having kids forever will necessarily result in slightly more boys than girls and there will never be more girls than boys (statistically, outliers aside). You have to get back to the idea of a fair coin toss in statistics to understand why this is so.

So let’s run through a few examples. Let's say we have 100 couples, what happens? (Using normal rounding rules)
50 couples will have one boy, no girls
25 couples will have one boy, one girl
13 couples will have one boy, two girls
6 couples will have one boy, three girls
3 couples will have one boy, four girls
2 couples will have one boy, five girls
1 couple will have one boy, six girls.

This is a total of 100 boys (always equal to the number of couples) and 97 girls or a ratio of 1:.97.

So how about 500 couples?
250 couples will have one boy, no girls
125 couples will have one boy, one girl
63 couples will have one boy, two girls
31 couples will have one boy, three girls
16 couples will have one boy, four girls
8 couples will have one boy, five girls
4 couples will have one boy, six girls
2 couples will have one boy, seven girls
1 couple will have one boy, eight girls

This is a total of 500 boys and 494 girls or a ratio of 1:.988. Closer to 1:1 but not there yet.

The key to understanding what is happening is that for every extra girl that is born in the progression, the likelihood of it happening is halved. As you stretch out into more and more couples having the chance to have more and more children the ratio will keep getting closer and closer in a straight progression but never quite getting to 1:1 until you reach out into infinity.
Eric offered another way to look at the numbers:
It might help to look at the problem from the perspective of the kids. If we were to divide them up into groups by their birth order (1st children, 2nd children, and so on) we know that each group has a 50/50 split. These groups also form a complete partition of the set of kids: no child is in more than one group, and every child is in a group. As long as we don't have infinite kids, that is enough to show that there are equal numbers.
This all made a lot of sense.  Statistically, it's 50/50 within an acceptable margin of error.  But nobody likes margins of error :)  So Sergey offered up an interesting point on the math:
From a pure statistical standpoint there will be more boys than girls because there are exactly 50% boys, while the number of girls is a progression toward 50% (1/2 of 50% + 1/2 of 25% + 1/2 of 12.5%, etc.) never quite reaching it. That assumes exactly 50% probability with no variations.
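For what it's worth, the expected values agree with Kevin and Eric.  Every family ends up with exactly one boy, and the expected number of girls per family is 1·(1/4) + 2·(1/8) + 3·(1/16) + ..., which sums to exactly 1.  A quick throwaway snippet (in the same spirit as the simulation above) to check that series numerically:

// The probability of k girls arriving before the first boy is (1/2)^(k+1).
double expectedGirls = 0;
for (int k = 1; k <= 60; k++)
    expectedGirls += k * Math.Pow(0.5, k + 1);
Console.WriteLine(expectedGirls);  // prints 1 (to within double precision)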
All in all, pretty interesting stuff.  I look forward to more of these.  In fact, while I was typing this, Jason posted a new one today:
Let's say you're in a game show where you have the choice to pick from one of three rooms. Two of the rooms contain nothing, and the other room has $1,000,000. You're allowed to pick one room, but you're not told what is in it just yet. After you make your choice, you're now told of one of the two remaining rooms that contains nothing. You're now allowed to either keep your original choice, or switch to the room that you don't know what it contains.

Here's a couple of examples:

Example 1
R1: nothing
R2: money
R3: nothing

You pick R3. Then you're told R1 doesn't have money. You can now stick with R3, or go with R2

Example 2
R1: money
R2: nothing
R3: nothing

You pick R1. Then you're told R2 doesn't have money. You can now stick with R1, or go with R3

Do you keep your choice, or switch? The real question is, on average is it more beneficial to keep your original choice, or switch? Why or why not?
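My instinct, of course, is the same as before: simulate it.  A quick sketch, with the usual disclaimer that I haven't analyzed it in any depth:

var rand = new Random();
int stayWins = 0, switchWins = 0;

for (int i = 0; i < 1000000; i++)
{
    int moneyRoom = rand.Next(3);   // which room has the money
    int firstPick = rand.Next(3);   // the contestant's initial choice

    // The host always reveals an empty room, so staying wins exactly when
    // the first pick was right, and switching wins in every other case.
    if (firstPick == moneyRoom) stayWins++;
    else switchWins++;
}

Console.WriteLine("Stay: {0} | Switch: {1}", stayWins, switchWins);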
This is fun.

Wednesday, November 9, 2011

The Infinitely Configurable System

It's a subject that comes up from time to time in many places where I've worked.  Business users want something they can control, without having to rely on developers for every change to the business.  (Are we really that scary?)  I suppose their intentions make sense on a theoretical level.  But what about in practice?  Well, what's the difference between theory and practice?  In theory, there is no difference.  In practice, there is.

So in theory the business wants something they can configure.  They don't want to have to work with the developers for every simple little change.  But how far do they want to take that?  I've seen it be as simple as having some look-up values which business users can edit.  That's fair enough.  But many times I've seen it be something so horribly complex that the theory and the practice are just too distant from one another to ever be realized.

I'm talking about The Infinitely Configurable System.  When business users want a rules engine that they can completely configure to suit their ever-changing needs.  They want to design their forms, define how those forms interact, define their fields, define the requirements for the inputs and the format for the outputs, and so on and so on.  They want a framework that allows them to design the business.  After all, they're the business people.  Defining the business is what they do.  Why would they want to involve a developer when a framework can do the job just fine?

It's a nice dream.  But do they really know what they're asking?  (We live in an age of frameworks now more than ever before, and the subject seems to be coming up for me a lot more these days.  I don't know if that correlation implies causation.)  Does the business user, who is an expert in that particular business, know what it takes to design a framework?

First of all, let's take a look at what this means to the up-front requirements of the software.  If you want to define a rules engine, you need to know up front all of the rules that the engine is going to support.  Consider an example.  Let's say this engine comes with a form designer.  Users can create input fields, label them, arrange them, and save them to the database.  That's exactly what they wanted, so that's what they got.  They use this engine to create a couple of simple forms.

Some time later, after the system is in production, they decide that they really need that form designer to support the ability to further define how fields interact with one another.  Field B should be defined as being conditionally required depending on the value of Field A.  Or Field B should be able to change its potential input values or input type based on the value of Field A.  The list goes on and on, and gets big fast.

If these were normal forms created by a developer, then the rules could be thus codified.  The models would have the new business logic added, the UI would be updated, and so on.  The forms as the business sees them can fairly quickly be adjusted to suit the new rules.  But we, the developers, aren't working on the forms.  We're working on the framework used to create the forms.  Suddenly the effort just got a lot bigger.  There are a lot more edge cases to consider, a lot more testing to be done, and the work is going to take a lot longer.

Had the up-front requirements been complete for what the business is going to need, this wouldn't have been such a big change.  But ever-changing needs defining an ever-changing framework are difficult to implement.  Just look at the frameworks out there today.  They have a defined scope.  They know what they do, they define what they do, and they make no pretense about their purpose and their limits.  But is the business willing to accept limits based on its original definition of what its software is going to do?  Up-front requirements are never complete.  The only constant is change.  But by going the route of a framework instead of a custom application, change becomes far more costly.

Now let's take a look at what this means to the subject matter experts.  As an example, let's say we're defining this software for a health services company.  They have subject matter experts who know all there is to know about health services.  Doctors, nurses, pharmacists, care managers, etc.  These subject matter experts are prepared to define what health services software needs to do.  They have a vision of how they're going to use it to help their patients, manage their business, and in general reduce their operational costs.

But what do health services experts know about software frameworks?  Remember, we're not designing an application for health services.  We're designing an application which a non-developer can use to design an application for health services.  This software isn't about helping patients, managing a business, or reducing operational costs.  This software is about providing a means for the business to define these things.  This is a framework.  It's a rules engine.  It's a logic system, not a health services system.

Are those subject matter experts prepared to define requirements for a rules engine or a logic system?  Are they experts in software design?  Because that's what the business wants.  They want something they can use to design their software.  This is going to cause a pretty significant disconnect between the business and the developers.

The developers, in this model, don't need to know anything about the business.  After all, the entire purpose behind this was that the developers should be removable from the equation, leaving the business users to maintain the application.  There's no domain-specific language here.  The business users are prepared to talk about health services, but the developers are prepared to talk about rules engines.  If they're not on the same page and not talking about the same thing, don't expect to get very good requirements out of it.  Without any knowledge of rules engines, the subject matter experts are going to define very poor requirements.  Without any knowledge of health services, the developers are going to deliver a very poor product.

Then, even after all of the requirements have been defined and the design has been approved and all the work has been finished on the application... It's still not done.  Not even close.  Is the business expecting a health services application at this point?  I sure hope not.  They wanted a framework, so they're getting a framework.  Now it's their job to define their business and create their application.  That sounds like it's going to be a pretty difficult pill to swallow.  All of the money that was spent, all of the time that was taken, and when the product is finally delivered to the business it's not ready to be used.  Now it's time for the business to design their actual software.

Finally, what will be the recourse for the business when something goes wrong?  Do they submit a bug to the developers?  Not so fast, buddy.  That all depends on what is wrong with the system.  Is it a defect in the rules engine, or is it a defect in the logical rules that the business created within the engine?

If it's a defect in the framework then that's fine.  The developers will update their tests, update the software, and deliver it back to the business with the defect fixed.  Of course, now the business needs to start adjusting their use of the framework.  The defect is fixed, but their application might not be.

But what if it's a defect in the rules defined by the business?  That's not the developers' problem.  It wasn't part of the requirements and has nothing to do with the product that the developers have built.  It's the business' problem.  The subject matter experts need to analyze their logic in their rules engine and fix it.  The business users need to debug their software.  Again, that sounds like a really tough pill to swallow.

I've been in this situation enough times to know that they won't swallow that pill.  They'll push it back to the developers to figure out the root cause of the problem and fix it.  After all, the business users aren't software developers.  It's not their job to design and maintain this stuff.  Even though the original intent of the project was for the business users to be able to design and maintain this stuff.

This leaves us in a situation where tons of effort has been put into creating a user-definable framework which isn't being defined by the users.  More cost went into creating it, more cost goes into maintaining it (because it's a lot more complicated than it needs to be), and developer morale has taken a serious hit because they're caught in the middle of this nonsense.  All of this could have been avoided by just writing a custom application that does what the business needs it to do.

Don't chase that holy grail.  The infinitely configurable system is a dream.  Let the business users stick to the business processes that they know and love, and let the developers stick to writing simple and maintainable software.

Friday, November 4, 2011

Clean Code

I've been reading Clean Code lately and have been finding it a bit inspiring.  It's interesting in that it's a fairly quick read, but at the same time it's taking me a while to get through it.  This is primarily because each new subject prompts me to think about my own work and my own development practices.  Many times I'll learn something new (well, not new per se, but a simpler way of looking at something that I always knew but never quite articulated); other times I'll find validation of something I've been trying to instill in other developers.  With each passing section I'll stop, put the book down for a moment, think about what it means, and form my own opinion of the related material.  Many times I'll find that opinion stated in the following paragraph (though far more eloquently than I've managed as of yet).

The book is even laid out in the manner of clean code.  Short chapters sticking to specific topics, broken up into very small sections each of which maintains a discrete piece of that topic.  The chapters and sections of the book are like classes and functions in code, each maintaining its specific responsibility and elegantly fitting into the larger picture to form a complex topic.  It's a complex thing made of many simple things.

In any event, I figured I'd run through a quick exercise to apply these techniques.  To see where I stand as a clean coder, as it were.  So I jumped on a short example of some code online.  Namely, the C# implementation of the Dining Philosophers Problem at Rosetta Code.  The working implementation, in its entirety, is here:

using System;
using System.Collections.Generic;
using System.Text;
using System.Threading;
 
namespace DiningPhilosophers
{
  enum DinerState { Eat, Get, Pon }
 
  class Fork                    //       Using an object for locking purposes
  {                             //                         instead of an int.
      public int holder = -1;   //    initialized without a holder (on table)
  }
 
  class Diner
  {
    private static int        instanceCount;
    public         Fork       left;
    public         Fork       right;
    public         int        id;
    public         DinerState state             = new DinerState();
    const          int        maxWaitMs         = 100;
    public         bool       end               = false;
 
    public Diner()
    {
             id     = instanceCount++;
             left   = Program.forks[id];
             right  = Program.forks[(id + 1) % Program.dinerCount];
             state  = DinerState.Get;
      Thread thread = new Thread(new ThreadStart(doStuff));
      thread.Start();             // start instance's thread to doStuff()
    }
 
    public void doStuff()
    {
      do
      {
        if (state == DinerState.Get)
        {
          bool lockedL = false;
          Monitor.TryEnter(left, ref lockedL);      //     try lock L
 
          if (lockedL)                              //  got left fork
          {
            left.holder = id;           //  left fork holder = this
            bool lockedR = false;
            Monitor.TryEnter(right, ref lockedR);     // try lock R
 
            if (lockedR)                //    lock R succeeded too.
            {                           //           got both forks
              right.holder = id;      //      list this as holder
              state = DinerState.Eat; //                     Eat.
              Thread.Sleep(Program.rand.Next(maxWaitMs));
              right.holder = -1;      // no right fork holder now
              Monitor.Exit(right);    //                 unlock R
              left.holder = -1;       //  no left fork holder now
              Monitor.Exit(left);     //                 unlock L
              state = DinerState.Pon; //                  Ponder.
              Thread.Sleep(Program.rand.Next(maxWaitMs));
            }
            else       // got left, but not right, so put left down
            {
              left.holder = -1;       //            no holder now
              System.Threading.Monitor.Exit(left);    // unlock L
              Thread.Sleep(Program.rand.Next(maxWaitMs));
            }
          }
          else           //                 could not get either fork
          {              //                                wait a bit
              Thread.Sleep(Program.rand.Next(maxWaitMs));
          }
        }
        else               //                   state == DinerState.Pon
        {                  //      done pondering, go back to get forks
          state = DinerState.Get;         //      trying to get forks
        }
      } while (!end);
    } 
  }
 
  class Program
  {
    public const  int         dinerCount = 5;
           const  int         runSeconds = 60;
    public static List<Diner> diners     = new List<Diner>();
    public static List<Fork>  forks      = new List<Fork>();
    public static Random      rand       = new Random();
 
    static void Main(string[] args)
    {
      for (int i = 0; i < dinerCount; i++) forks.Add(new Fork());
      for (int i = 0; i < dinerCount; i++) diners.Add(new Diner());
      DateTime endTime = DateTime.Now + new TimeSpan(0,0,runSeconds);
      Console.Write("|");                               //   write header
      foreach (Diner d in diners) Console.Write("D " + d.id + "|");
      Console.Write("    |");
      for (int i = 0; i < dinerCount; i++) Console.Write("F" + i + "|");
      Console.WriteLine();
 
      do
      {                                                 // display status
        Console.Write("|");
        foreach (Diner d in diners) Console.Write(d.state + "|");
        Console.Write("    |");
        foreach (Fork f in forks) Console.Write(
            (f.holder < 0 ? "  " : "D" + f.holder) + "|");
        Console.WriteLine();
        Thread.Sleep(1000);                           //   milliseconds
      } while (DateTime.Now < endTime);
 
      foreach (Diner d in diners) d.end = true;         // signal to quit
    }
  }
}

It effectively presents a solution to the given problem, and it does so in a reasonably small web-display-friendly manner.  But it seems kind of... messy.  Let's take a look at the problems found here:
  1. Functions are long.  Too long.
  2. Too much nesting.  Low-level concerns are polluting higher-level concerns.
  3. The comments are formatted any which way, and are small enough and obvious enough that they shouldn't even be needed if the code could just be a little more expressive.
  4. Function names and variable names are ineffective.  doStuff()?  Really?
  5. Classes inter-depend in odd ways.  Lots of tight coupling.
  6. Classes and functions have multiple responsibilities.
There may be more, but that's certainly enough to merit a good cleaning.  So I set about refactoring, but with a specific goal in mind... The resulting output should exactly match the current output.  This is often a very important goal in refactoring code, keeping the efforts transparent to the end user.  After all, the program does what it's supposed to do.  From a business perspective, it shouldn't change.  But from a code perspective, it certainly needs to be cleaned.

It took a couple hours of my time, but the process was actually a lot of fun.  I intentionally didn't document each step along the way, though.  I'd like to, and I will.  But first I need to verify my work.  I intend to do this by setting it aside for a while and then going back to the original and attempting the process again.  Then I'll compare my two results and see how consistent I was in the process.  Depending on the results of that, I may attempt it a third time.

Eventually, I'll be happy with the results and will step through it in a more documented and tracked way, isolating individual improvements to the code and describing what they mean to the overall process.  This will end up in the form of a slide deck and several code samples, perhaps also a screen cast, which I intend to share here at a later time.  But for now at least I can present my initial output of refactoring and cleaning the code.  Let's take a look at each class...

using System;
using System.Threading;

namespace DiningPhilosophers
{
  class Program
  {
    private const int PHILOSOPHER_COUNT = 5;
    private const int SECONDS_TO_RUN = 60;

    private static Random rand;
    private static DateTime endTime;

    static void Main(string[] args)
    {
      InitializeObjects();
      DisplayFormatter.WriteOutputHeader();
      do
      {
        DisplayFormatter.WriteStatus();
        Thread.Sleep(1000);
      } while (DateTime.Now < endTime);
      Terminate();
    }

    private static void InitializeObjects()
    {
      rand = new Random();

      for (var i = 0; i < PHILOSOPHER_COUNT; i++)
        Fork.KnownForks.Add(new Fork());
            
      for (var i = 0; i < PHILOSOPHER_COUNT; i++)
        Philosopher.KnownPhilosophers.Add(
            new Philosopher(
                id: i,
                randomizer: rand,
                leftFork: Fork.KnownForks[i],
                rightFork: Fork.KnownForks[(i + 1) % PHILOSOPHER_COUNT]));
            
      endTime = DateTime.Now.AddSeconds(SECONDS_TO_RUN);
    }

    private static void Terminate()
    {
      foreach (var philosopher in Philosopher.KnownPhilosophers)
        philosopher.Dispose();
    }
  }
}

The Program class is just maintaining the program.  It initializes the global data, determines when to write output elements, and ends the program.  Simple.  Part of cleaning it up included moving the formatting of the output to a separate class to maintain that responsibility...

using System;

namespace DiningPhilosophers
{
  public static class DisplayFormatter
  {
    private const string PHILOSOPHER = "D";
    private const string FORK = "F";
    private const string ITEM_BORDER = "|";
    private const string LIST_DIVIDER = "    ";

    private const string PHILOSOPHER_HEADER_FORMAT = "{0} {1}";
    private const string FORK_HEADER_FORMAT = "{0}{1}";
    private const string PHILOSOPHER_ITEM_FORMAT = "{0}";
    private const string FORK_ITEM_FORMAT = "{0}{1}";

    public static void WriteOutputHeader()
    {
      Console.Write(ITEM_BORDER);

      foreach (var philosopher in Philosopher.KnownPhilosophers)
      {
        Console.Write(string.Format(PHILOSOPHER_HEADER_FORMAT, PHILOSOPHER, philosopher.ID));
        Console.Write(ITEM_BORDER);
      }

      Console.Write(LIST_DIVIDER + ITEM_BORDER);

      for (var i = 0; i < Philosopher.KnownPhilosophers.Count; i++)
      {
        Console.Write(string.Format(FORK_HEADER_FORMAT, FORK, i));
        Console.Write(ITEM_BORDER);
      }

      Console.WriteLine();
    }

    public static void WriteStatus()
    {
      Console.Write(ITEM_BORDER);

      foreach (var philosopher in Philosopher.KnownPhilosophers)
        WritePhilospherStatus(philosopher);

      Console.Write(LIST_DIVIDER + ITEM_BORDER);

      foreach (var fork in Fork.KnownForks)
        WriteForkStatus(fork);

      Console.WriteLine();
    }

    private static void WritePhilospherStatus(Philosopher philosopher)
    {
      var state = FormatDinerStateForDisplay(philosopher.State);
      Console.Write(string.Format(PHILOSOPHER_ITEM_FORMAT, state));
      Console.Write(ITEM_BORDER);
    }

    private static void WriteForkStatus(Fork fork)
    {
      var state = fork.IsHeld ?
                  string.Format(FORK_ITEM_FORMAT, PHILOSOPHER, fork.Holder) :
                  string.Format(FORK_ITEM_FORMAT, " ", " ");
      Console.Write(state);
      Console.Write(ITEM_BORDER);
    }

    private static string FormatDinerStateForDisplay(Philosopher.DinerState state)
    {
      switch (state)
      {
        case Philosopher.DinerState.Eating:
          return "Eat";
        case Philosopher.DinerState.GettingForks:
          return "Get";
        case Philosopher.DinerState.Pondering:
          return "Pon";
      }
      return string.Empty;
    }
  }
}

The DisplayFormatter class is responsible for maintaining the output format.  It primarily just writes the output header and each output line, following the formatting specifications in its internal constants.  This and the previous Program class make up the application part of the whole system.  We also have two models within the system: Forks and Philosophers...

using System.Collections.Generic;

namespace DiningPhilosophers
{
  public class Fork
  {
    private static List<Fork> _knownForks;
    public static IList<Fork> KnownForks
    {
      get
      {
        if (_knownForks == null)
          _knownForks = new List<Fork>();
        return _knownForks;
      }
    }

    private const int NOT_HELD = -1;

    public int Holder { get; private set; }

    public bool IsHeld
    {
      get
      {
        return (Holder != NOT_HELD);
      }
    }

    public Fork()
    {
      Holder = NOT_HELD;
    }

    public void TakeFork(int holder)
    {
      Holder = holder;
    }

    public void DropFork()
    {
      Holder = NOT_HELD;
    }
  }
}

The fork is simple enough (as it should be, it's not a complex real-world concept).  It maintains the state of the fork, providing information on who is holding it as well as methods to pick it up and drop it.

using System;
using System.Collections.Generic;
using System.Threading;

namespace DiningPhilosophers
{
  public class Philosopher : IDisposable
  {
    private static List<Philosopher> _knownPhilosophers;
    public static IList<Philosopher> KnownPhilosophers
    {
      get
      {
        if (_knownPhilosophers == null)
          _knownPhilosophers = new List<Philosopher>();
        return _knownPhilosophers;
      }
    }

    public int ID { get; private set; }

    public DinerState State { get; private set; }

    private const int MAX_WAIT_TIME = 100;

    private Fork _leftFork;
    private Fork _rightFork;
    private bool _currentlyHoldingLeftFork;
    private bool _currentlyHoldingRightFork;
    private bool _doneEating;
    private Random _randomizer;

    private Philosopher()
    {
      SetDefaultValues();
    }

    public Philosopher(int id, Random randomizer, Fork leftFork, Fork rightFork) : this()
    {
      ID = id;
      _randomizer = randomizer;
      _leftFork = leftFork;
      _rightFork = rightFork;

      // Dine() starts the worker thread, so it must be called here, after
      // the dependencies are assigned; starting it in the chained
      // constructor would let the thread run before these fields are set.
      Dine();
    }

    private void SetDefaultValues()
    {
      _currentlyHoldingLeftFork = false;
      _currentlyHoldingRightFork = false;
      _doneEating = false;
      State = DinerState.GettingForks;
    }

    private void Dine()
    {
      var thread = new Thread(new ThreadStart(TryToEat));
      thread.Start();
    }

    private void TryToEat()
    {
      do
      {
        if (State == DinerState.GettingForks)
        {
          GetLeftFork();

          if (_currentlyHoldingLeftFork)
          {
            GetRightFork();

            if (_currentlyHoldingRightFork)
            {
              Eat();
              DropRightFork();
              DropLeftFork();
              Ponder();
            }
            else
            {
              DropLeftFork();
              Wait();
            }
          }
          else
          {
            Wait();
          }
        }
        else
        {
          State = DinerState.GettingForks;
        }
      } while (!_doneEating);
    }

    public void Dispose()
    {
      _doneEating = true;
    }

    private void GetLeftFork()
    {
      Monitor.TryEnter(_leftFork, ref _currentlyHoldingLeftFork);
      if (_currentlyHoldingLeftFork)
        _leftFork.TakeFork(ID);
    }

    private void GetRightFork()
    {
      Monitor.TryEnter(_rightFork, ref _currentlyHoldingRightFork);
      if (_currentlyHoldingRightFork)
        _rightFork.TakeFork(ID);
    }

    private void DropLeftFork()
    {
      _leftFork.DropFork();
      _currentlyHoldingLeftFork = false;
      Monitor.Exit(_leftFork);
    }

    private void DropRightFork()
    {
      _rightFork.DropFork();
      _currentlyHoldingRightFork = false;
      Monitor.Exit(_rightFork);
    }

    private void Eat()
    {
      State = DinerState.Eating;
      Wait();
    }

    private void Ponder()
    {
      State = DinerState.Pondering;
      Wait();
    }

    private void Wait()
    {
      Thread.Sleep(_randomizer.Next(MAX_WAIT_TIME));
    }

    public enum DinerState
    {
      Eating,
      GettingForks,
      Pondering
    }
  }
}

The Philosopher is where the bulk of the algorithm logic resides, naturally.  After all, it's the philosophers themselves who are trying to eat their meals.  So the decision-making process of what to do with forks and food exists in here.

This implementation probably isn't as clean as it could be.  (But then, what is?)  Looking at it now I see a few things I don't entirely like and may want to refactor:
  1. The Fork and Philosopher classes also contain their globals.  Those should probably be moved out into a class which manages the globals and nothing else.
  2. The TryToEat() method still has an uncomfortable amount of nesting.  It's the main algorithm of the whole thing, so one can expect it to be the most complex unit in the implementation.  But I'd like to re-think how I can write this.  It currently follows the "happy path" pattern of nesting, and I'd like to find a way to invert that (see the sketch after this list).
  3. Forks know who their holders are.  That doesn't really model the real-life concept of a fork.  A fork doesn't know who is holding it; it doesn't know anything.  In a real-world scenario, the philosophers would instead know who their neighbors are and be able to perceive by looking at their neighbors whether their forks are in use.  So I'd like to move fork state management out of the Fork class and into the Philosopher class.
  4. Should the Program class initialize global state of the application?  Or should that be moved out into an initializer class of some sort?  I didn't think of it while writing the code, but while describing the code any time I find myself using the word "and" in the description of a class' responsibilities I feel like I may be violating Single Responsibility.
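For the second point, one possible inversion would use guard clauses to flatten the nesting.  This is just a sketch I haven't yet verified against the required output:

private void TryToEat()
{
  do
  {
    if (State != DinerState.GettingForks)
    {
      State = DinerState.GettingForks;
      continue;
    }

    GetLeftFork();
    if (!_currentlyHoldingLeftFork)
    {
      Wait();              // couldn't get the left fork; try again later
      continue;
    }

    GetRightFork();
    if (!_currentlyHoldingRightFork)
    {
      DropLeftFork();      // got left but not right, so put it back
      Wait();
      continue;
    }

    Eat();
    DropRightFork();
    DropLeftFork();
    Ponder();
  } while (!_doneEating);
}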
Again, I'm sure there's more.  And I welcome the input on what that may be.

It's interesting that the simple act of writing about the code has caused me to critically look at the code even further than when I was writing it.  At the start of this post I had figured that this would be a good initial implementation, and now I'm not so sure.

But then this is the nature of any creative work, is it not?  I said to a colleague yesterday that an ongoing theme in my career for the past couple of years has been that any code I wrote at least three months ago is crap.  This means that anything I write today will be crap three months from now.  This is a good problem to have.  It means that my cycle of learning new things and improving my skills is fairly short and constantly moving.  Now I'm finding that code I wrote yesterday isn't up to par.  Again, it's a good problem to have.

I'm still going to continue with the plan I'd laid out earlier in this post.  But it seems now that I have some more refactoring to do before I start on the second attempt.  It seems also that I may be refining this for a while before I find it to be presentation-ready.  Of course, I also run the risk of an endless loop, whereby by the time I've finished creating the presentation I'm no longer happy with what I'm presenting.  So this will also be an opportunity to find that professional "good enough" sweet spot.  At the very least, this will be good practice on several levels.

Thursday, November 3, 2011

Validation Shouldn't Prevent Saving

I know, I've covered this, right?  But Martin Fowler just tweeted a link to something he wrote a while back and I think it adds some important context to the overall idea.  Specifically, the last paragraph of what he wrote:
"In About Face Alan Cooper advocated that we shouldn't let our ideas of valid states prevent a user from entering (and saving) incomplete information. I was reminded by this a few days ago when reading a draft of a book that Jimmy Nilsson is working on. He stated a principle that you should always be able to save an object, even if it has errors in it. While I'm not convinced that this should be an absolute rule, I do think people tend to prevent saving more than they ought. Thinking about the context for validation may help prevent that."
I'm reminded of a concept they had at a previous job of mine.  Their overall implementation wasn't good at all, but the business goal they were striving for was certainly a good one: data should always be saved.  The idea was that the validity of an object depended heavily on state.  The system they had was mainly data entry.  There was some workflow to move records from group to group, department to department, all depending on state.

But in any given department there were very few input restrictions.  Users could always save the data, even if it didn't make a whole lot of sense.  Validation occurred in the workflow when the data was to be presented to another group/department.  At that point, it needed to be valid for the target users.  That is, the person who ships it off to another group needs to have completed everything for which they are responsible.  But when the data was within a single point on the entire workflow, it was very free-form and transient.  It was a scratch pad, able to be saved and returned to at a later time.

FogBugz takes a very similar approach to user input.  They don't put a lot of restrictions on user input.  If a user has noticed a bug, it needs to be entered into the system.  Maybe they can't accurately describe the steps to reproduce right now, maybe they don't entirely know what "module" they're in.  Those are details, and they can be added later.  But there are no barriers to simply starting a ticket.  Whatever information the user has right now is good enough.  Let them save.  It may not be "valid" enough for a developer to receive it as a work item, but it's always "valid" enough to be recorded in the system.

Users don't like barriers.  They end up either working around them or giving up.  In the former case, you have unknown things happening in your system; in the latter case, you're losing data.  Neither is a very appealing outcome.  Reducing those barriers is important.  It makes things a little more complex in the sense that validation is no longer purely rigid and black and white, but state-driven validation isn't difficult to implement either.

So yes, my previous model was overly-rigid.  It served to demonstrate a point, that's all.  But adding state-driven validation within the object is just as easy, really.  Objects can have statuses, methods to move them from one status to another, etc.  Consider this model:

public class Band
{
  private string _name;
  public string Name
  {
    get
    {
      return _name;
    }
    set
    {
      // Perform state-driven validation
      _name = value;
    }
  }

  private string _genre;
  public string Genre
  {
    get
    {
      return _genre;
    }
    set
    {
      // Perform state-driven validation
      _genre = value;
    }
  }

  private Status _currentStatus;
  public Status CurrentStatus
  {
    get
    {
      return _currentStatus;
    }
  }

  private Band()
  {
    // Set the default status
  }

  public Band(string name, string genre) : this()
  {
    Name = name;
    Genre = genre;
  }

  public void MoveToNextStatus()
  {
    // Based on various states and the current status,
    // validate the object with custom logic and update
    // the _currentStatus to the next business state.
  }
}

As with any domain model, it's just business logic.  There is a business concept of a scratch pad and the validation of the model implements that concept.  More rigorous validation is then performed when an action is performed on the model, such as moving it to the next business status.  There could be various private methods to handle this internal validation, and the property setters just do some basic validation.  For example, maybe one department can't set certain fields.  Then the setters would check to see if the current status is in that department and throw an exception accordingly.
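To make that concrete, here is one rough sketch of how the status progression might be fleshed out.  The Status values and transition rules below are invented for illustration; the real ones would come out of the business's workflow definition:

public enum Status
{
  Draft,       // the scratch pad: almost anything can be saved
  InReview,    // must be complete enough for the next group
  Published
}

// Inside Band:
public void MoveToNextStatus()
{
  switch (_currentStatus)
  {
    case Status.Draft:
      // The rigorous validation happens here, at the workflow boundary,
      // rather than on every property set.
      if (string.IsNullOrWhiteSpace(_name) || string.IsNullOrWhiteSpace(_genre))
        throw new InvalidOperationException(
          "Name and Genre must be filled in before review.");
      _currentStatus = Status.InReview;
      break;
    case Status.InReview:
      _currentStatus = Status.Published;
      break;
    case Status.Published:
      throw new InvalidOperationException("Already at the final status.");
  }
}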

It sounds like it's a little more complicated, but honestly it makes things a lot easier from a business perspective.  Remember, writing the software is the easy part.  Getting the business to actually define what it wants, and validating the software against that definition... That's the hard part.  This kind of approach, in my experience, makes it a hell of a lot easier for the business.  Many times I've sat down with a business trying to define and model their concepts and objects, and when I ask deceptively simple questions such as "What fields are required?" we invariably end up in multi-hour discussions with several other business users and subject matter experts, trying to find a balance between everyone's interpretation of that business concept.  And ultimately that one definition of what fields are required ends up causing a headache for another user somewhere.

Driving the validation by the workflow is just a more natural approach for many businesses.  And supporting that in the software makes for a more natural implementation of the business.

New England Code Camp 16

This past Saturday was another local software conference.  (I'll admit, I'm hooked on them.)  This time I headed back over to the Microsoft office in Waltham for New England Code Camp 16.  (Side note: I overheard someone saying that Microsoft will be pulling out of the Waltham location and running everything in the area from their NERD Center in Cambridge and another campus they have nearby.)

All in all a good event, as always.  This time my employer was a sponsor and we had more of a presence there, which is cool.  I hope we can step it up and have a proper table with swag and recruiting and all that next time.  One of our team members was also giving a presentation so it was good for us to show support.  I didn't end up going to his, though, because there was other interesting stuff for me to attend and learn something.

So, as per my usual style here, let's take a look at the sessions I attended...

9:10 - "Easy Async With .NET 4.5" with John Bowen

Not bad.  The guy definitely needs to work on his public speaking.  (And his public shutting up... He'd go off on tangents and just get lost for a while.)  He clearly knew the material, but had a bit of trouble explaining it.  Cool stuff though, and I'm seriously looking forward to using async and await on a regular basis.  If I could give this guy one suggestion it would be to have more concrete examples.  After all, this is Code Camp.  A slide deck doesn't cut it here :)

10:30 - "ASP.NET MVC 3 Introduction, Part 1" with Brock Allen from DevelopMentor

Very engaging speaker, I liked this guy a lot.  The material in this part of the presentation was mostly review for me, but there were certainly little nooks and crannies of it that were new to me.  And it's always good to re-sync with others on material I already know, just to see how well my understanding of it jibes with the rest of the industry.

My favorite part was that I very much agree with how this guy develops.  At one point someone asked him about some of the widgets from WebForms and what their counterparts were here, and he tried to explain that there aren't any.  Things like the login control and registration wizard and stuff like that.  He said things like, "It's really not hard to just write a couple of views and controller actions, so why not do that?  Sure, you could look for some widget that does all the work for you, but you lose the more fine-grained control over what you're doing.  And one of the biggest benefits of MVC views is that you have that much more control over the resulting markup."  It's always nice to see an industry peer who thinks like I do, because all too often I'm immersed in the Kool-Aid that I refuse to drink.

One interesting thing happened during this presentation.  Someone started asking about ViewState, and the presenter explained that it's not there anymore.  This apparently shocked and dismayed people.  (That fact, in turn, somewhat dismayed me.)  He explained that you can still use "ViewState" in the sense that you can still serialize a dictionary, base-64 encode it, and store it in a hidden field.  The fact that many people in the audience didn't understand what he meant was also a little disconcerting.

But then a frightening question was asked... "Without ViewState, where should I store sensitive information like Social Security Numbers?"  My hand shot up in the air to respond.  The presenter handled the question well, explaining that sensitive information should be avoided, statefulness should be avoided, and all that good stuff.  But I still had a point that needed to be driven home.  When I was called upon, I tried to explain to the asker that ViewState is completely open and readable by the user and should never be used to store any sensitive information.  Ever.  But I don't think he understood.  Sometimes I'm ashamed of my industry, I really am.

11:50 - "ASP.NET MVC 3 Introduction, Part 2" (same presenter)

This is where things got a lot more involved with the stuff specific to version 3.  There was a lot to take from this.  (Both talks were very fast-paced and involved lots of jumping around in Visual Studio.  The guy said that these are usually longer talks that he had to condense.)  It was mostly about the various tooling that's available to help with things, including Data Annotations :)

You know me, I have sort of a love-hate relationship with frameworks and tooling.  But this guy kept it together really well even for me.  Use the tooling where it's appropriate, don't rely on it where it's dangerous.  In fact, he even made it a point to use Data Annotations on his input models and not his domain models.  He had the domain models in a separate project and in his MVC project just had view models and input models.  (I bet he's a fan of FubuMVC.)

Honestly, what I liked most about this presenter was his approach to coding.  Clean code all the way.  He never left anything messy or unfinished.  No placeholders, no TODO statements, none of that.  He kept it clean and concise and to the point.  If something worked but needed to be cleaned up a little, he took a moment to clean it up.  Much respect, Brock.  Much respect.

1:00 - Lunch (pizza), provided by Telerik

1:30 - "Objective-C for C# and Other .NET Developers" with Chris Pels

I'm definitely interested in the material, and clearly so is the speaker.  But the presentation itself was kind of a bust.  His screen saver kept triggering, he kept having to fiddle with the MacOS environment to get things to work, etc.  Basically, he didn't seem to know his way around the presentation tools.  That kind of thing will kill a presentation, man.  If you have to stop presenting for a few minutes in order to fix something, you lose the audience.

I think the focus of the talk was lost.  He went into some Objective-C and general iOS development in XCode, but he didn't really tailor it to the audience.  We were expecting something from the perspective of people who live and breathe within Visual Studio.  This wasn't it.  He'd occasionally try to tie things back to C# jargon, but it was kind of forced and ineffective.

And he didn't handle questions well.  As you'd expect, some people were asking about how to leverage the .NET code they already have and how much of it would need to be re-written for iOS devices.  He couldn't really say much more than "look into Mono."  Something more apt may have been along the lines of...

"Well, that depends a lot on how the existing .NET code is architected.  In an ideal situation, and if you can make use of MonoTouch, then you should only have to re-write the UI portions.  The views still need to be created in XCode just like any other native iOS development, but all of the code that drives them (models and controllers, mainly) can be in .NET through MonoTouch.  Just as .NET is a framework layer sitting on top of the Win32 API (and now the WinRT API as well), MonoTouch is a .NET framework layer sitting on top of the iOS API.  Just like there are some differences between Win32 and WinRT, expect that there are also differences between Win32 and iOS.  Some things are available in one but not the other.  So you may have to adjust some of your code to be truly cross-platform, maybe abstract out some of the platform-specific stuff into services that you can swap between them.  But as long as the code is separated out away from the views, then you should be able to leverage much of what you have."

Was that so hard?  I just made that up on the spot.  I know, public speaking is hard.  But you've got to stay focused and give clear and engaging responses or you're going to lose the audience.  I doubt anybody in that room walked away excited about doing iOS development as a .NET developer.  Much of the murmuring seemed to indicate that they just won't bother and will wait until Microsoft saves the day with its own tablet.

2:50 - "Introduction to Windows Identity Foundation" with Brock Allen (same presenter as the MVC stuff earlier)

Definitely comprehensive, and as you know I like this speaker.  The material was a little tough to follow, though.  I haven't done a lot of authentication stuff myself, so he was moving across subject matter that was a little foreign to me in some cases.  But he kept it within reach and I'm definitely continuing to learn about it.

What I'd really like to see is a more complete end-to-end implementation, like he had with the MVC 3 talks he gave earlier.  This one involved more slides and less code.  He did tie it together very well at the end, which was great.  As before, very fast-paced and a lot to cover.  I'm sure that someone equally as familiar with the material as I was with MVC enjoyed this talk as much as I did the earlier one.

The juiciest nugget was when he told us to go to http://www.leastprivilege.com/ to learn more, and claimed that there's a free eBook on Windows Identity Foundation there somewhere which he highly recommends.  I haven't found it yet, but I'll look again later.

4:10 - I took a break during this session.  My eyes hurt from projectors and monitors.  I'm getting old.

5:30 - ".NET and MongoDB: A Code First Introduction to NoSQL" with John Zablocki

By this time of the evening it was very dark outside and the snow was really coming down, so a lot of people had left already and those of us still there were finding ourselves in a much more relaxed atmosphere.  So I think that contributed to the fact that this talk didn't stay on topic very effectively.

It was a great introduction to MongoDB, and covered some stuff I didn't already know about the database, which was great.  But we spent way too much time mucking around in the Mongo shell and hardly even looked at any .NET code.  Seriously, what's "code first" about mucking around in the database and then writing a little code to use it?

The guy's got NoSQL skills, no doubt about that.  But this audience isn't looking for a young ninja to show them something cool.  These are mostly Microsoft Kool-Aid drinking developers in their 30s and 40s who want to know how this can help them with the work they already do.  It's an older crowd and teaching them new tricks requires a little understanding of how they currently do things.  (And I'm no exception to that, though I like to think I'm further away from them on the overall scale.)

Basically, the guy got really side-tracked by his intro into MongoDB and barely even touched the agenda he set forth for the presentation.  There was almost no .NET code.

In Conclusion

Looking back at what I've written, I sound very critical of the presenters.  I guess that's just how I approach these things.  But don't get the wrong message.  This event and other events like it are fantastic.  I learned a lot, I networked with colleagues, I advanced my career ever so slightly.

In fact, I'm inspired to put together some presentations of my own.  I'd been toying with the idea, but couldn't think of something that would be worth everyone else's time.  The problem, as it turns out, was that I was trying to think of talks that I could give for my colleagues.  But that's just it... I took this job because my new colleagues are really good at this stuff.  I'm here to learn from them.  And I love it.

Sitting in the room during the MVC 3 talks cleared my head a little bit.  Not everyone is surrounded by the developers who surround me.  There are a lot of developers out there who just do the same stuff day in and day out.  They don't take for granted all of the things I do.  I can give a talk that introduces other developers to the concepts I take for granted in my development, and there's an audience for that.

I'm thinking I'd like to start with a simple introduction into dependency injection.  It's only an hour of time, I can certainly fill that.  I've come up with a basic agenda and have thrown together some code samples that I can walk through (or build on the fly, to engage the audience a little more).  I just need to remember to keep focus on the material.  Don't get lost on a tangent, don't focus on everything else being perfectly designed, etc.  Just put together a simple application and demonstrate the concepts of dependency injection.  That's all.
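To give a flavor of it, the core of the demo would be something as small as this (the names are hypothetical; it's just the shape of the example):

public interface INotifier
{
  void Notify(string message);
}

public class EmailNotifier : INotifier
{
  public void Notify(string message)
  {
    // send an email here
  }
}

public class SignupService
{
  private readonly INotifier _notifier;

  // The dependency comes in through the constructor instead of being
  // new'd up inside the class, so the caller (or a container, or a test
  // with a fake INotifier) decides which implementation gets used.
  public SignupService(INotifier notifier)
  {
    _notifier = notifier;
  }

  public void Register(string user)
  {
    _notifier.Notify("Welcome, " + user);
  }
}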

Another idea could be to take a look at MongoDB again and how it can be used in a .NET application.  Focus less on the internals of the database engine, there's plenty of material out there for that.  Just use it in a simple application to demonstrate how it would be used.

Any other ideas?  Man, I can't wait until the next Code Camp.

Monday, October 31, 2011

The Various Flavors of Validation

I recently wrote a post about Microsoft's Data Annotations and how they can misdirect one to believe that one's data is being validated when it really isn't. Essentially, what I'm against is relying on these annotations within the domain models for validation. All they really do is tell certain UI implementations (MVC3 web applications, for example) about their data validation. And they only do this if the UI implementation accepts it. The UI has to check the ModelState, etc. They're not really validating the data, they're just providing suggestions that other components may or may not heed.

This led to the question among colleagues... Where should the validation live? There is as much debate over this as there are tools and frameworks to assist in the endeavor. So my opinion on the matter is just that, an opinion. But as far as I'm concerned the validation should live everywhere. Each part of the system should internally maintain its own validation.

The idea is simple. There should be multiple points of validation because there are multiple reasons for validating the data. Will this lead to code duplication? In some cases, quite possibly. And I wholeheartedly agree that code duplication is a bad thing. But just because two lines of code do the same thing, are they really duplicated? It depends. If they do the same thing for the same reason and are generally in the same context, then yes. But if they coincidentally do the same thing but for entirely different reasons, then no. They are not duplicated.

Let's take a look at an example. My domain has a model:

public class Band
{
  private string _name;
  public string Name
  {
    get
    {
      return _name;
    }
    set
    {
      if (string.IsNullOrWhiteSpace(value))
        throw new ArgumentException("Name cannot be empty.");
      _name = value;
    }
  }

  private string _genre;
  public string Genre
  {
    get
    {
      return _genre;
    }
    set
    {
      if (string.IsNullOrWhiteSpace(value))
        throw new ArgumentException("Genre cannot be empty.");
      _genre = value;
    }
  }

  private Band() { }

  public Band(string name, string genre)
  {
    Name = name;
    Genre = genre;
  }
}

As before, this model is internally guaranteeing its state. There are no annotations on bare auto-properties to suggest that a property shouldn't be null. There is actual code preventing a null or empty value from being used in the creation of the model. The model can't exist in an invalid state. Data validation exists here.

Now, it stands to reason that the UI shouldn't rely solely on this validation. Why? Because it would be a terrible user experience. Any invalid input would result in an exception for the application to handle. Even if the exception is handled well and the user is presented with a helpful message, the user would still see only the first error on any given attempt. Trying to submit a large form with lots of invalid fields would be an irritating trial-and-error process.

So we need to do more validation. The model is validating the business logic. But the UI also needs to validate the input. It's roughly the same thing, but it's for an entirely different reason. The UI doesn't care about business logic, and the domain doesn't care about UX. So even though both systems need to "make sure the Name isn't empty" they're doing it for entirely different reasons.

Naturally, because I don't want tight coupling and because my UI and my domain should have the freedom to change independently of one another, I don't want my UI to be bound to the domain models. So for my MVC3 UI I'll go ahead and create a view model:

public class BandViewModel
{
  [Required]
  public string Name { get; set; }
        
  [Required]
  public string Genre { get; set; }
}

There's a lot more validation that can (and probably should) be done, but you get the idea. This is where these data annotations can be useful, as long as the developers working on the MVC3 application are all aware of the standard being used. (And it's a pretty common standard, so it's not a lot to expect.) So the controller can make use of the annotations accordingly:

public class BandController : Controller
{
  // Assume the repository is supplied by the UI project's composition
  // (e.g., constructor injection); the IBandRepository name is illustrative.
  private readonly IBandRepository repository;

  public BandController(IBandRepository repository)
  {
    this.repository = repository;
  }

  public ActionResult Index()
  {
    return View(
        repository.Get()
                  .Select(b => new BandViewModel {
                      Name = b.Name,
                      Genre = b.Genre
                  })
    );
  }

  [HttpGet]
  public ActionResult Edit(string name)
  {
    // With only a string parameter there's little to validate here,
    // but the pattern is the same as the POST action below.
    if (ModelState.IsValid)
    {
      var band = repository.Get(name);
      return View(
        new BandViewModel
        {
          Name = band.Name,
          Genre = band.Genre
        }
      );
    }
    return View();
  }

  [HttpPost]
  public ActionResult Edit(BandViewModel band)
  {
    if (ModelState.IsValid)
    {
      repository.Save(
        new Domain.Band(band.Name, band.Genre)
      );
      return RedirectToAction("Index");
    }

    return View(band);
  }
}

The data annotations work fine on the view models because they're a UI concern. The controller makes use of them because the controller is a UI concern. (Well, ok, the View is the UI concern within the context of the MVC3 project. The entire project, however, is a UI concern within the context of the domain. It exists to handle interactions between users and the domain API.)

So the validation now exists in two places. But for two entirely different reasons. Additionally, the database is going to have validation on it as well. The column which stores the Name value is probably not going to allow NULL values. Indeed, it may go further and have a UNIQUE constraint on that column. This is data validation, too. And it's already happening in systems all over the place. So the "duplication" is already there. Makes sense, right? You wouldn't rely entirely on the UI input validation for your entire system and just store the values in unadorned columns in a flat table, would you? Of course not.

We're up to three places validating data now. The application validates the input and interactions with the system, the domain models validate the business logic and ensure correctness therein, and the database (and potentially data access layer, since at the very least each method should check its inputs) validates the integrity of the data at rest.
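
For the data access layer piece, I just mean plain guard clauses. A quick sketch (the repository shape here is hypothetical, just to illustrate):

using System;

public class BandRepository
{
  public void Save(Band band)
  {
    // Check the inputs before touching the data store. The database
    // constraints are the last line of defense, not the first.
    if (band == null)
      throw new ArgumentNullException("band");

    // ... persist the band ...
  }

  public Band Get(string name)
  {
    if (string.IsNullOrWhiteSpace(name))
      throw new ArgumentException("Name cannot be empty.", "name");

    // ... fetch the band from storage ...
    return null;
  }
}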

Well, since this is a web application, we have the potential for a fourth place to validate data: the client-side UI. It's not necessary, of course. It's just a matter of design preference. Weigh the cost of post-backs against the cost of maintaining a little more code, determine your bandwidth constraints, etc. But if you end up wanting that little extra UX goodness to assist the user in filling out your form before posting it, then you can add more validation:

@model BandViewModel
@* Assumes jQuery is already referenced in the layout. *@
<body>
  @Html.ValidationSummary("Please fix your errors.")
  <div>
    @using (Html.BeginForm("Edit", "Band"))
    {
      <fieldset>
        Name: @Html.TextBox("Name", Model.Name)<br />
        Genre: @Html.TextBox("Genre", Model.Genre)
      </fieldset>
      <button type="submit">Save</button>
    }
  </div>
  <script type="text/javascript">
    $(document).ready(function () {
      $('#Name').blur(function () {
        if ($(this).val() == '') {
          $(this).css('background', '#ffeeee');
        }
      });
      $('#Genre').blur(function () {
        if ($(this).val() == '') {
          $(this).css('background', '#ffeeee');
        }
      });
    });
  </script>
</body>

(You'll want prettier forms and friendlier validation, of course. But you get the idea.)

For a simple form like this it's not necessary, but for a larger and more interactive application it's often really nice to have client-side code to assist the user in interacting with the application. But once again, this is a duplication of logic. As before, however, it's duplicating the logic for an entirely different purpose. A purpose outside the scope of the other places where the logic exists. The end result is the same, but the contextual meaning is different.

So where does validation live?
  • In the domain models - Validate the state of the models at every step.  Don't allow invalid models to be created and then rely on some .IsValid() construct.  Prevent invalid state from ever existing on the data in motion.
  • In the database - Validate the integrity of the data at rest.  Don't allow invalid data to be persisted.  One application may be reliable enough to provide only valid data, but others may not.  And admins have a habit of directly editing data.  Make sure the data model rigidly maintains the integrity of the data.
  • In the application - Validate the user's input into the system.  Present friendly error messages, of course.  But this is the first line of defense for the rest of the system.  Armed with the knowledge that the business logic won't tolerate invalid data, make sure it's valid before even consulting with the business logic.
  • In the client-side UI - Provide data validation cues to assist the user and maintain a friendly UX.

That's a lot of validation. But notice how each one performs a distinctly different function for the overall system. It's repeating the same concept, but in a different context.

Now what happens if something needs to change? Well, then it'll need to be changed anywhere it's applicable. Is that duplicated effort? Again, it depends. Maybe all you need to change is the UI validation for some different visual cues. Maybe you need to make a more strict or less strict application without actually changing the business logic. Maybe you need to change some validation in the database because you're changing the data model and the new model needs to enforce the business logic differently.

If you change the core business logic then, yes, you will likely have to make changes to the other parts of the system. This isn't the end of the world. And it makes sense, doesn't it? If you change the validation logic in the domain models then you've changed the shape of your models. You've changed your fundamental business concepts in your domain specific language. So naturally you'll want to check the database implementation to make sure it can still persist them properly (and look for any data migration that needs to happen on existing data), and you'll want to check your applications and interfaces to make sure they're interacting with the models correctly (and look for any cases where the logical change presents a drastic UI change since users may need to be warned and/or re-trained).

It's not a lot of code. And it's not a lot to remember. There is a feeling of duplication which we as developers find distasteful, but it's keeping the concerns separated and atomic in their own implementations. Besides, if you miss one then your automated tests will catch it quickly. Won't they?
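
For what it's worth, that kind of test is nearly a one-liner. A minimal sketch, assuming NUnit, against the Band model above:

using System;
using NUnit.Framework;

[TestFixture]
public class BandTests
{
  [Test]
  public void Constructor_rejects_empty_name()
  {
    // The domain model should refuse to exist in an invalid state.
    Assert.Throws<ArgumentException>(() => new Band("", "Rock"));
  }
}

If that's in the suite, a missed change to the validation rules shows up on the next build.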

Monday, October 17, 2011

Startup Weekend

This weekend I was at Startup Weekend in Boston, and words fail me in an attempt to describe how awesome (and exhausting) the event was.  I'm definitely glad I went, and will be attending these events in the future.  The entrepreneurial spirit and drive throughout the weekend was inspiring, not to mention the friends and business contacts made while working together on a project like that.

The idea behind the weekend is simple.  People pitch their start-up business ideas the first night, everybody votes on which ones they want to implement, and the top projects (I think it was 16 of them) recruit teams among the rest of the attendees to develop the business plan, prototype, etc. throughout the weekend.  There was a wide variety of ideas being pitched, so there was easily something for everyone.

Being a web developer by trade, I was specifically looking for something where I could flex those particular muscles.  There was no shortage of mobile phone apps being developed, and maybe I'll do one of those next time.  But I wanted to stick to a web application this time so I chose a team in that space.  (I'm going to hold off on marketing the end result here until we've vetted the system a little more.  I'm not entirely sure why, and I don't have the wherewithal to explain it until the caffeine sets in, so I'm just going to go with my gut for now.)

The really interesting part for me was that I ended up not really doing a lot of development.  I knew that was going to be a challenge going into the event because I don't really do start-up work.  I'm not a cowboy who throws code at something to rapidly build a prototype.  I'm a methodical enterprise developer.  A whole product in a weekend?  It takes us at least a week just to talk about building a product, then a few more weeks for requirements and design.

So I ended up in a role where I sort of oversaw the development, making decisions about the technology so the developers could keep moving forward, and trying to drive it toward the business goal.  When I had no hands-on development tasks (such as the occasional jQuery widget to fit into the designer's vision), I helped with the business development and product envisioning.

It was kind of weird, really.  Not heads-down coding?  Standing around the designer's desk and helping the group envision their product presentation?  Helping to film a commercial?  These are not my normal tasks.  But, honestly, it worked.  I like to think I was pretty good at it, and I definitely enjoyed it.  Don't get me wrong, I wanted to code.  But the goal wasn't to have a fully-functional website by the end of the weekend.  (Although some of the other teams seem to have managed that, or something close to it, but those were different products with different business drivers.)  The goal was to make the final pitch at the end of the event.

So I imagine I'll be doing a lot more development as we continue with this endeavor.  (That's right, we weren't playing around.  Nobody there was just toying with the notion.  These people are really building start-ups and I really want to be a part of that.)  Though there's much to be discussed in that.  I mean, what we did write was in PHP.  I don't know if I want to do that again.  We'll see how things go with the team once we've decompressed from the weekend and re-grouped, perhaps next weekend.

We're not building "the next Facebook" or anything of that sort.  But the team leader had an idea and pitched the idea and we've come together to make it happen.  Whether or not it works is for the market to decide.

Wednesday, October 12, 2011

LINQ2GEDCOM

I haven't done much with that gNealogy project in a while, mostly because it's in a state where the next step is visualizations of the data and I'm just not much of a UI guy in that regard. Maybe a flash of inspiration will hit me someday and I'll try something there. But for now I'm content to wait a little longer until Adrian has some spare clock cycles to make some pretty visualizations for the project.

One thing never quite sat right with me, though. I'd encapsulated the data access behind some repositories to hide the mess, but it was still a mess. (And still is, until I replace it with my new thing.) I wanted to make something better. Something more streamlined. And, after reading C# in Depth (almost done with it), something more elegant and idiomatic of the language.

So I've set out to create a LINQ data context for GEDCOM files. Much of the data context and IQueryable stuff came from a tutorial I found online. There are several of them, but this one seemed just right for what I was doing. I've extended the functionality to add more entity types, made the tutorial-driven stuff generic for those types, etc.
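
To give a flavor of the goal, this is the rough shape of the usage I'm driving toward (hypothetical names, not the actual API as it stands today):

using System;
using System.Linq;

// Hypothetical usage sketch; the context and property names are illustrative.
class Demo
{
  static void Main()
  {
    using (var context = new GedcomContext(@"C:\Temp\family.ged"))
    {
      var smiths = from individual in context.Individuals
                   where individual.LastName == "Smith"
                   select individual;

      foreach (var person in smiths)
        Console.WriteLine(person.FirstName + " " + person.LastName);
    }
  }
}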

All in all, I'm really happy with where this project is going so far. There's still a lot to do, and there may still be better ways to do a lot of it. But what it supports already in terms of file data and interacting with the data source (including writing changes back to the file, something that gNealogy never had) is already pretty cool. There's a bit of clean-up left to do in the wake of new features, especially after today's changes, so it's not production-ready by any means. But the main thing is... it's a lot of fun.

So this leaves the current state of my GitHub repositories as:
  • LINQ2GEDCOM - Currently in active development.  Lots of features to add.  My favorite work yet.
  • gNealogy - Not actively being developed right now, but still good.  I need to add the use of LINQ2GEDCOM once that's ready, which would make this one a lot simpler.  I'd also like to change around some of the overall structure here.  The main thing is, it's ready for visualization development if anybody wants to do that.
  • FormGuard - That simple little jQuery plugin I wrote a while back.  Nothing special, certainly nothing to write home about, just a quick proof of concept for myself in terms of writing a jQuery plugin.  Hopefully a stepping stone to more as the need arises.
  • MarkovSmirnoff - A silly little Markov Chain text generator I wrote because Trevor and I wanted to generate random text for some stuff.  It's not elegant, it's not particularly great, it's just fun to play with.  (The core idea is small enough to sketch; see the snippet after this list.)
  • CommonUtilities - Nothing special here at all.  This is mostly a placeholder for a personal utility library I'd like to keep around.  I put some of my older stuff in here, but the implementations need to be improved quite a bit.  There's lots to be added here.
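
Since the Markov Chain idea comes up in that list, here's a minimal sketch of the concept (not the MarkovSmirnoff code itself): build a table mapping each word to the words observed after it, then walk the table picking random successors.

using System;
using System.Collections.Generic;
using System.Text;

class MarkovSketch
{
  static void Main()
  {
    var source = "the quick brown fox jumps over the lazy dog and the quick dog barks";
    var words = source.Split(' ');

    // Build the chain: each word maps to the list of words seen after it.
    var chain = new Dictionary<string, List<string>>();
    for (var i = 0; i < words.Length - 1; i++)
    {
      if (!chain.ContainsKey(words[i]))
        chain[words[i]] = new List<string>();
      chain[words[i]].Add(words[i + 1]);
    }

    // Walk the chain, picking a random successor at each step.
    var rand = new Random();
    var current = words[rand.Next(words.Length)];
    var output = new StringBuilder(current);
    for (var i = 0; i < 20 && chain.ContainsKey(current); i++)
    {
      current = chain[current][rand.Next(chain[current].Count)];
      output.Append(' ').Append(current);
    }

    Console.WriteLine(output);
  }
}
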
And just when I thought that was enough, Trevor had an awesome idea for a game that I want to implement in HTML5/JavaScript.  Hopefully we can get that project off the ground so I can share it here as it grows.  But it's the kind of project that's going to take a lot of planning and design work before any actual coding can begin, so it'll probably be a while.

I love it when a plan comes together.

Sunday, October 9, 2011

Boston Application Security Conference

Yesterday I attended BASC 2011 and had a pretty good time. There were some particularly interesting talks and demos, and I'd be lying if I said I didn't learn a thing or two about web and mobile application security. I'm far from an expert, of course, but it's always good to immerse myself in some knowledge from time to time in the hopes that some useful bits will stick.

In the effort to retain some of it, I figured I'd write a short post on what I experienced while I was there...

8:30 - Breakfast

Sponsored by Rapid7. Good selection of bagels and spreads, lots to drink. Good stuff.

9:00 - Keynote session by Rob Cheyne.

Great stuff. Not only was he an engaging speaker, but he had a lot of good insights on information security in the enterprise. Not just from a technical perspective, but more importantly from a corporate mindset perspective. (On a side note, he looked a lot like the bad guy from Hackers. Kind of funny.)

My favorite part was a story he told about groupthink. There was an experiment conducted a while back on primate social behavior involving 5 monkeys in a closed cage. (Not a small cage or anything like that, there was plenty of room.) In the center of the cage was a ladder and at the top was a banana. Naturally, the monkeys tried to climb the ladder to retrieve it. But any time one of them was just about to reach the banana, they were all sprayed with ice cold water. So very quickly the monkeys learned that trying to get the banana was bad.

One of the monkeys was removed and replaced with a new monkey. From this point on the water was never used again. Now, the new monkey naturally tried to reach the banana. But as soon as he began trying, the other monkeys started beating on him to stop him. He had no idea what was going on, but they had learned that if anybody went for the banana then they'd all be sprayed. The new monkey never knew about the water, but he did quickly learn that reaching for the banana was bad.

One by one, each monkey was replaced. And each time the results were the same. The new monkey would try to get the banana and the others would stop him. This continued until every monkey had been replaced. So, in the end, there was a cage full of monkeys who knew that trying to get the banana was bad. But none of them knew why. None of them had experienced the cold water. All they knew was that "this is just how things are done around here." (Reminds me of the groupthink at a financial institution for which I used to work. They've since gone bankrupt and no longer exist as a company.)

It's a great story for any corporate environment, really. How many places have you worked where that story can apply? From an information security perspective it's particularly applicable. Most enterprises approach security in a very reactive manner. They get burned by something once, they create some policy or procedure to address that one thing, and then they go on living with that. Even if that one thing isn't a threat anymore, even if they didn't properly address it in the first place, even if the threat went away on its own but they somehow believe that their actions prevent it... The new way of doing things gets baked into the corporate culture and becomes "just the way we do things here."

A number of great quotes and one-liners came from audience participation as well. One of the attendees said, "Security is an illusion, risk is real." I chimed in with, "The system is only as secure as the actions of its users." All in all it was a great talk. If for no other reason than to get in the right mindset, enterprise leaders (both technical and non-technical) should listen to this guy from time to time.

10:00 - "Reversing Web Applications" with Andrew Wilson from Trustwave SpiderLabs.

Pretty good. He talked about information gathering for web applications, reverse engineering to discern compromising information, etc. Not as engaging, but actually filled with useful content. I learned a few things here that seem obvious in terms of web application security in hindsight, but sometimes we just need someone to point out the obvious for us.

11:00 - "The Perils of JavaScript APIs" with Ming Chow from the Tufts University Department of Computer Science.

This one was... an interesting grab bag of misinformation. I don't know if it was a language barrier, though the guy seemed to have a pretty good command of English. But the information he was conveying was in many cases misleading and in some cases downright incorrect.

For example, at one point he was talking about web workers in JavaScript. On his PowerPoint slide he had some indication of the restriction that web workers will only run if the page is served over HTTP. That is, if you just open the file locally with "file://" then they won't run. Seems fair enough. But he said it as "web workers won't run on the local file system, they have to be run from the server." Somebody in the group asked, "Wait, do they really run on the server? That is, does the page actually kick off a server task for this? It doesn't run on the local computer?" He responded with "Um, well, I don't really know. But I do know that it won't run from the local file system, it runs from the server." Misleading at best, downright incorrect at worst.

He spent his time talking about how powerful JavaScript has become and that now with the introduction of HTML5 the in-browser experience is more powerful than ever. He said that it's come a long way from the days of simple little JavaScript animations and browser tricks and now we can have entire applications running just in the browser. During all of this, he kept indicating that it's become "too powerful" and that it introduces too many risks. Basically he was saying that client-side code can now become very dangerous because of these capabilities and if the application is client-side then anybody can hack into it.

At one point he said, "Now we can't trust data from the client." Now? Seriously? We've never been able to trust data from the client. This is nothing new. Is this guy saying that he's never felt a need to validate user input before? That's a little scary. Most of the insights he had on the state of JavaScript were insights from a couple years ago. Most of his opinions were misled. Most of his information was incorrect. I shudder to think what he's doing to his students at Tufts.

(By the way, the JavaScript image slideshow on the Tufts University Department of Computer Science site is currently broken at the time of this writing, at least in Safari. Loading the images blocks the loading of the rest of the page until they're complete; the images are cycled through very rapidly (as in, within about a second total) on the initial load, and then they no longer cycle at all. I wonder if this guy wrote it.)

12:00 - Lunch

Sponsored by Source. Sandwiches and wraps. Tons of them. Not bad, pretty standard. Good selection, which is important. And there were plenty left for people to snack on throughout the rest of the day.

1:00 - "OWASP Mobile Top 10 Risks" with Zach Lanier of Intrepidus Group.

Good speaker, good information. The talk basically covered the proposed OWASP top ten list of mobile security threats, which people are encouraged to evaluate and propose changes to. He explained the risks and the reasons behind them very well. I don't know much about mobile development, but this list is exactly the kind of thing that should be posted on the wall next to any mobile developer. Any time you write code that's going to run on a mobile platform, refer back to the list and make sure you're not making a mistake that's been made before.

2:00 - "Don't Be a Victim" with Jack Daniel from Tenable.

This man has an epic beard. So, you know, that's pretty awesome. He's also a fun speaker and makes the audience laugh and all that good stuff. But I don't know, I didn't really like this presentation. It was lacking any real content. To his credit, he warned us about this at the beginning. He told us the guy in the other room has more content and that this talk is going to be very light-hearted and fun. I understand that, I really do. But I think there should at least be some content. Something useful. Otherwise don't bother scheduling a room for this, just make it something off to the side for people to take the occasional break.

The whole talk was through metaphor. I can see what he was trying to get at, but he never wrapped it up. He never brought the metaphor back to something useful. Imagine if Mr. Miyagi never taught Daniel how to fight, he just kept having him paint fences and wax cars. The metaphor still holds, and maybe Daniel will one day understand it, but the lesson kind of sucks. The premise was stretched out to the point of being razor-thin. The entire hour felt like an introduction to an actual presentation.

It was mostly a slideshow of pictures he found on the internet, stories about his non-technical exploits (catching snakes as a kid in Texas, crap like that), references to geek humor, and the occasional reference to the fact that he was wearing a Star Trek uniform shirt during the presentation. Was he just showing off his general knowledge and his geekiness?

Don't get me wrong. He seemed like a great guy. He seemed fun. I'm sure he knows his stuff. I'm sure he has plenty of stories about how he used to wear an onion on his belt. But this seemed like a waste of time.

3:00 - "Binary Instrumentation of Programs" with Robert Cohn of Intel.

This was one of the coolest things I've ever seen. He was demonstrating for us the use of a tool he wrote called Pin which basically edits the instructions of running binaries. He didn't write it for security purposes, but its implications in that field are pretty far-reaching. (Its implications in aspect-oriented programming also came up, which is certainly of more interest to me. Though this is a bit more machine-level than my clients would care to muck with.)

A lot of the talk was over my head when talking about binaries, instruction sets (the guy is a senior engineer at Intel, I'm sure he knows CPU instruction sets like the back of his hand), and so on. But when he was showing some C++ code that uses Pin to inject instructions into running applications, that's where I was captivated. Take any existing program, re-define system calls (like malloc), add pre- and post-instruction commands, etc. Seriously bad-ass.

Like I said, the material doesn't entirely resonate with me. It's a lot closer to the metal than I generally work. But it was definitely impressive and at the very least showed me that even a compiled binary isn't entirely safe. Instructions can be placed in memory from anything. (Granted, I knew this of course, but you'd be surprised how many times a client will think otherwise. That once something is compiled it's effectively sealed and unreadable. This talk makes a great example and demonstration against that kind of thinking.)

4:00 - "Google & Search Hacking" with Francis Brown of Stach & Liu.

Wow. Just... wow. Great speaker, phenomenal material. Most of the time he was mucking around in a tool called Search Diggity. We've all seen Google hacking before, but not like this. In what can almost be described as "MetaSploit style" they aggregated all the useful tools of Google hacking and Bing hacking into one convenient package. And "convenient" doesn't even begin to describe it.

The first thing he demonstrated was hacking into someone's Amazon EC2 instance. In under 20 seconds. He searched for a specific regular expression via Google Code, which found tons of hits. Picking a random one, he was given a publicly available piece of source code (from Google Code or CodePlex or GitHub, etc.) which contained hard-coded authentication values for someone's Amazon EC2 instance. He then logged into their EC2 instance and started looking through their files. One of the files contained authentication information for every administrative user in that company.

Seriously, I can't make this stuff up. The crowd was in awe as he jumped around the internet randomly breaking into things that are just left wide open through sloppy coding practices. People kept asking if this is live, they just couldn't believe it.

One of the questions was kind of a moral one. Someone asked why he would help create something like this when its only use is for exploitation. He covered that very well. That's not its only use. The link above to the tool can also be used to find their "defense" tools, which use the same concepts. Together they provide a serious set of tools for someone to test their own domains, monitor the entire internet for exploits to their domains (for example if an employee or a contractor leaks authentication information, this would find it), monitor the entire internet for other leaked sensitive data, etc. By the end of the talk he was showing us an app on his iPhone which watches a filtered feed that their tool produces by monitoring Google/Bing searches and maintaining a database of every exploit it can find on the entire internet. Filter it for the domains you own and you've got an IT manager's dream app.

Another great thing about this tool is that it doesn't just look for direct means of attack. Accidental leaks are far more common and more of a problem. This finds them. He gave one example of a Fortune-100 company that had a gaping security hole that may otherwise have gone unnoticed. The company owned tons of domains within a large IP range, and he was monitoring it. One of the sites he found via Google for that IP range stuck out like a sore thumb. It wasn't business-related at all; it was a personal site for a high school reunion for some class from the 70s.

Apparently the CEO of this company (I think he said it was the CEO, it was definitely one of the top execs) was using the company infrastructure to host a page for his high school reunion. Who would have ever noticed that it was in the same IP range as the rest of the company's domains? Google notices, if you report on the data in the right way. Well, apparently this site had a SQL injection vulnerability (found by a Google search for SQL error message text indexed on that site). So, from this tiny unnoticeable little website, this guy was able to exploit that vulnerability and gain domain admin access to the core infrastructure of a Fortune-100 company. (Naturally he reported this and they fixed the problem.)

The demo was incredible. The tools demonstrated were incredible. The information available on the internet is downright frightening. Usually by this time in the day at an event like this people are just hanging around for the raffles and giveaways and are tuning out anything else. This presentation was the perfect way to end the day. It re-vitalized everyone's interest in why we were all at this event in the first place. It got people excited. Everybody should see this presentation. Technical, non-technical, business executives, home users, everybody.

5:00 - Social Time

Sponsored by Safelight. Every attendee got two free beers (or wine, or various soda beverages) while we continued to finish the leftovers from lunch. And not crappy beers either. A small but interesting assortment of decent stuff, including Wachusett Blueberry Ale, which tastes like blueberry pancake syrup but not as heavy.

5:30 - Wrap-Up

OWASP raffled off some random swag, which is always cool. One of the sponsors raffled off an iPad, which is definitely the highlight of the giveaways. For some reason, though, the woman who won it seemed thoroughly unenthused. What the hell? If she doesn't want it, I'll take it. My kids would love an iPad.

6:00 - Expert Panel

Admittedly, I didn't stay for this. I was tired and I wanted to get out of there before everybody was trying to use the bathroom, the elevators, and the one machine available for paying for garage parking. So I left.

All in all, a great little conference. I'm definitely going to this group's future events, and I'd love to work with my employer on developing a strategic focus on application security in our client projects. (Again, I'm no expert. But I can at least try to sell them on the need to hire experts and present it to clients as an additional feature, which also means more billable hours. So it's win-win.)

One thing I couldn't help but notice throughout the event was a constant series of issues with the projectors.  This is my third or fourth conference held at a Microsoft office and this seems to be a running theme.  It always takes a few minutes and some tweaking to get projectors in Microsoft offices to work with Windows laptops.  Someday (hopefully within a year or two) I'm going to be speaking at one of these local conferences (maybe Code Camp or something of that nature), and I'm going to use my Mac.  Or my iPad.  Or my iPhone.  And it's going to work flawlessly.  (Note that one person was using a Mac, and the projector worked fine, but he was using Microsoft PowerPoint on the Mac and that kept failing.  I'll be using Keynote, thank you very much.)

Tuesday, October 4, 2011

Disposable Resources in Closures

I've been reading Jon Skeet's C# in Depth and the chapter on delegates left me with an interesting question. (Maybe he answered this question and I missed it, it's kind of a lot to take in at times and I'm sure I'll go back through the book again afterward.)

Within the closure of a delegate, variables can be captured and references to them retained within the scope of the delegate. This can lead to some interesting behavior. Here's an example:

using System.Windows.Forms;  // for the MethodInvoker delegate

class Program
{
  static void Main(string[] args)
  {
    var del = CreateDelegate();
    del();
    del();
  }

  static MethodInvoker CreateDelegate()
  {
    var x = 0;
    MethodInvoker ret = delegate
    {
      x++;
    };
    return ret;
  }
}

Stepping through this code and calling "x" in the command window demonstrates what's happening here. When you're in Main(), "x" doesn't exist. The command window returns an error saying that it can't find it. But with each call to the delegate, internally there is a reference to a single "x" which increments each time (thus ending with a value of 2 here).
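
Under the hood, the compiler hoists the captured variable into a generated class; something roughly like this sketch (not the actual emitted names):

using System.Windows.Forms;

// Roughly what the compiler generates for the capture.
class DisplayClass
{
  public int x;

  public void Invoke()
  {
    x++;
  }
}

class SketchProgram
{
  static MethodInvoker CreateDelegate()
  {
    var closure = new DisplayClass { x = 0 };
    // The returned delegate holds a reference to 'closure', and that
    // reference is what keeps x alive on the heap between calls.
    return closure.Invoke;
  }
}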

So the closure has captured a reference to "x" and that data remains on the heap so that the delegate can use it later, even though there's no reference to it in scope when the delegate isn't being called. Pretty cool. But what happens if that reference is something more, well, IDisposable? Something like this:

using System.IO;
using System.Text;
using System.Windows.Forms;

class Program
{
  static void Main(string[] args)
  {
    var del = CreateDelegate();
    del();
    del();
  }

  static MethodInvoker CreateDelegate()
  {
    var x = 0;
    using (var txt = File.OpenWrite(@"C:\Temp\temp.txt"))
    {
      MethodInvoker ret = delegate
      {
        x++;
        txt.Write(Encoding.ASCII.GetBytes(x.ToString()), 0, x.ToString().Length);
      };
      return ret;
    }
  }
}

This looks a little scary, given the behavior we've already seen. Is that "using" block going to dispose of the file handle? When will it dispose of it? Will that reference continue to exist on the heap when it's not in scope?

Testing this produces an interesting result. The reference to "x" continues to work as it did previously. And the reference to "txt" seems to also be maintained. But the file handle is no longer open. It appears that when the CreateDelegate() method returns, that "using" block does properly dispose of the resource. The reference still exists, but the file is now closed, and attempting to write to it the first time the delegate is called results in the expected exception.

So let's try something a little messier:

class Program
{
  static void Main(string[] args)
  {
    var del = CreateDelegate();
    del();
    del();
  }

  static MethodInvoker CreateDelegate()
  {
    var x = 0;
    var txt = File.OpenWrite(@"C:\Temp\temp.txt");
    MethodInvoker ret = delegate
    {
      x++;
      txt.Write(Encoding.ASCII.GetBytes(x.ToString()), 0, x.ToString().Length);
    };
    return ret;
  }
}

Now we're not disposing of the file handle. Given the previous results, the results of this are no surprise. Once the delegate is created, the file handle is open and is left open. (While stepping through the debugger I'd go back out to the file system to see if I could rename the file, and indeed it wouldn't let me because the file was in use by another process. It wouldn't even let me open it in Notepad.) Each call to the delegate successfully writes to the file.

It's worth noting that the file handle was properly disposed by the framework when the application terminated. But what if this process doesn't terminate in an expected way? What if this is a web app or a Windows service? That file handle can get pretty ugly. It's worth testing those scenarios at a later time, but for now let's just look at what happens when the system fails in some way:

class Program
{
  static void Main(string[] args)
  {
    var del = CreateDelegate();
    del();
    del();
    Environment.FailFast("testing");
  }

  static MethodInvoker CreateDelegate()
  {
    var x = 0;
    var txt = File.OpenWrite(@"C:\Temp\temp.txt");
    MethodInvoker ret = delegate
    {
      x++;
      txt.Write(Encoding.ASCII.GetBytes(x.ToString()), 0, x.ToString().Length);
    };
    return ret;
  }
}

The behavior is the same, including the release of the file handle after the application terminates (which actually surprised me a little, but I'm glad it happened), except for one small difference. No text was written to the file this time. It would appear that the captured file handle in this case doesn't flush the buffer until either it's disposed or some other event causes it to flush. Indeed, this was observed in Windows Explorer as I noticed that the file continued to be 0 bytes in size while the delegate was being called. In this last test, it stayed 0 bytes because it was never written to. In the previous test, it went directly from 0 bytes to 2 bytes when the application exited.

I wonder if anybody has ever fallen into a trap like this in production code. I certainly hope not, but I guess it's possible. Imagine a home-grown logger which just has a delegate to a logging function that writes to a file. That log data (and, indeed, all the log data leading up to it) will be lost if the underlying stream is never flushed or disposed. And it may not be entirely intuitive to developers working in the system unless they really take a look at that logging code (which they shouldn't have to do, it should just work).
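
Something like this hypothetical logger, say:

using System;
using System.IO;
using System.Text;
using System.Windows.Forms;

static class HomegrownLogger
{
  // A hypothetical logger built the same way: the delegate captures
  // a FileStream that nothing ever disposes.
  public static MethodInvoker CreateLogAction(string path, string message)
  {
    var stream = File.OpenWrite(path);
    return delegate
    {
      var bytes = Encoding.ASCII.GetBytes(message + Environment.NewLine);
      stream.Write(bytes, 0, bytes.Length);
      // Without an explicit Flush() here (or a Dispose() somewhere),
      // everything written so far sits in the buffer. If the process
      // dies abruptly, the entries leading up to the failure (the ones
      // you need most) never hit the disk.
      stream.Flush();
    };
  }
}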

I kind of want to come across something like this in a legacy application someday, if for no other reason than to see what the real-world fallout would look like.