Messaging as a programming model – Revisited

Well, I know it hasn’t been long, but such was the feedback I received on my Messaging as a programming model posts that I thought I would quickly follow up with some revisions and extra background information: a) to clarify one or two decisions, and b) to incorporate one or two additions or modifications. So here goes.

Why are you doing it that way?

Why are you doing processing in the constructor and, while we’re at it, why the lambda in the Register method too?

A few comments brought this to light and really made me stop and think about it. Why did I choose to do it that way? As I sat and thought about it I realised that I hadn’t actively chosen it as such; I had basically arrived at the first solution and failed to keep iteratively questioning my own decisions. Sometimes you’re so close to what you’re doing that you forget to step back and look at things objectively. Having a code review performed by 18,000 eyes, though, soon highlights your deficiencies, and I’m thankful for that. I was basically treating the constructor as just a procedure call, and whilst it works, it’s not good. Taking on board various suggestions (thanks everybody, but particularly Christian Palmstierna and Ralf Westphal) I went back and refactored it to the better solution (which could no doubt be improved further still) that I present here.

First of all, create an interface that all filters will implement:

public interface IFilter<T>
{
    void Execute(T msg);
}

Change the PipeLine class to store a list of IFilter instances instead of Action of T:

public class PipeLine<T>
{
    private readonly List<IFilter<T>> _filters = new List<IFilter<T>>();

    public PipeLine<T> Register(IFilter<T> filter)
    {
        _filters.Add(filter);
        return this;
    }

    public void Execute(T input)
    {
        _filters.ForEach(f => f.Execute(input));
    }
}
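
It's worth noting that the filters set a Stop flag on the message when something goes wrong (as the credentials check further down shows), but the ForEach above carries on regardless. If you wanted the pipeline to short-circuit, a minimal sketch might look like this, assuming a hypothetical IStoppableMessage interface that exposes the flag:

// Hypothetical interface so the pipeline can read the message's Stop flag.
public interface IStoppableMessage
{
    bool Stop { get; }
}

public class StoppablePipeLine<T> where T : IStoppableMessage
{
    private readonly List<IFilter<T>> _filters = new List<IFilter<T>>();

    public StoppablePipeLine<T> Register(IFilter<T> filter)
    {
        _filters.Add(filter);
        return this;
    }

    public void Execute(T input)
    {
        foreach (var filter in _filters)
        {
            // A previous filter flagged a problem, so skip the remaining filters.
            if (input.Stop) break;
            filter.Execute(input);
        }
    }
}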

Now the registration of the filters looks like this:

var loginPipeline = new PipeLine<LogInMessage>();

loginPipeline.Register(new CheckUserSuppliedCredentials())
             .Register(new CheckApiKeyIsEnabledForClient(new ClientDetailsDao()))
             .Register(new IsUserLoginAllowed())
             .Register(new ValidateAgainstMembershipApi(new MembershipService()))
             .Register(new GetUserDetails(new UserDetailsDao()))
             .Register(new PublishUserLoggedInEvent(new GetBus()));

The filters themselves implement the interface as the following refactored example from the previous post shows:


public class CheckUserSuppliedCredentials : IFilter<LogInMessage>
{
    public void Execute(LogInMessage input)
    {
        if(string.IsNullOrEmpty(input.Username) || 
                  string.IsNullOrEmpty(input.Password))
        {
            input.Stop = true;
            input.Errors.Add("Invalid credentials");
        }
    }
}
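
The LogInMessage class itself isn't shown in these posts, but based on how the filter above uses it, a minimal sketch would look something like this (the constructor and property shapes are assumptions):

public class LogInMessage
{
    public LogInMessage()
    {
        Errors = new List<string>();
    }

    public string Username { get; set; }
    public string Password { get; set; }

    // Set by a filter to signal that processing should stop.
    public bool Stop { get; set; }

    // Collects error messages added by the filters.
    public List<string> Errors { get; private set; }
}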

The end result is that the processing is now done in the Execute method instead of the constructor, and registration of the filters no longer uses a lambda expression. This not only looks cleaner but is more efficient too: each filter is constructed once, at registration time, rather than on every message that passes through the pipeline.
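
Putting it all together, executing the login pipeline is just a case of building a message and passing it in (using the LogInMessage sketched above; the credential values are made up):

var message = new LogInMessage { Username = "jbloggs", Password = "secret" };
loginPipeline.Execute(message);

if (message.Errors.Count > 0)
{
    // One or more filters rejected the message; report the errors.
}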

PipeLine Composition

I had to kick myself for not even thinking of this one. My thanks again to Ralf Westphal (who has been developing a technique called Flow Design over the last few years) for highlighting it. The question was: why did I have such a long pipeline for the load file example?

As a reminder this is what it looked like:


load = new PipeLine<PartUploadMessage>();
load.Register(msg => new PreSpreadsheetLoad(msg))
    .Register(msg => new ImportExcelDataSet(msg))
    .Register(msg => new ValidateSheet(msg))
    .Register(msg => new ValidateColumnNames(msg))
    .Register(msg => new StripInvalidChars(msg))
    .Register(msg => new CopyToDataTable(msg))
    .Register(msg => new CreateExtraColumns(msg))
    .Register(msg => new CheckStockItemCodeColumnFormatting(msg))
    .Register(msg => new CheckForDuplicateStockItemCode(msg))
    .Register(msg => new ValidateUniqueClient(msg))
    .Register(msg => new ValidateRequiredFields(msg))
    .Register(msg => new ValidateColumnDataTypes(msg))
    .Register(msg => new ValidateContentOfFields(msg))
    .Register(msg => new ValidateCategories(msg))
    .Register(msg => new PostSpreadsheetLoad(msg));

validate = new PipeLine<PartUploadMessage>();
validate.Register(msg => new PreValidate(msg))
    .Register(msg => new CheckForDuplicateStockItemCode(msg))
    .Register(msg => new ValidateUniqueClient(msg))
    .Register(msg => new ValidateRequiredFields(msg))
    .Register(msg => new ValidateContentOfFields(msg))
    .Register(msg => new ValidateCategories(msg))
    .Register(msg => new PostValidate(msg));

The load pipeline instance reuses validation filters that are registered in the validation pipeline instance, but why do that? Why not instead remove them from the load pipeline instance and then compose a new pipeline made up of the two? What a great idea, so simple, and yet I never considered it. I am indeed an idiot.

The best part is that no new code is required, just a change in how the pipelines are registered:


load = new PipeLine<PartUploadMessage>();
load.Register(msg => new PreSpreadsheetLoad(msg))
    .Register(msg => new ImportExcelDataSet(msg))
    .Register(msg => new StripInvalidChars(msg))
    .Register(msg => new CopyToDataTable(msg))
    .Register(msg => new CreateExtraColumns(msg))
    .Register(msg => new PostSpreadsheetLoad(msg));

validate = new PipeLine<PartUploadMessage>();
validate.Register(msg => new PreValidate(msg))
    .Register(msg => new ValidateSheet(msg))
    .Register(msg => new ValidateColumnNames(msg))
    .Register(msg => new CheckForDuplicateStockItemCode(msg))
    .Register(msg => new ValidateUniqueClient(msg))
    .Register(msg => new ValidateRequiredFields(msg))
    .Register(msg => new ValidateContentOfFields(msg))
    .Register(msg => new ValidateCategories(msg))
    .Register(msg => new PostValidate(msg));

Thinking about it, in the original implementation the load pipeline instance had more than one responsibility: it both loaded and validated. By employing pipeline composition we can now create a new pipeline composed of the other two to achieve the same result:


import = new PipeLine<PartUploadMessage>();
import.Register(load.Execute)
    .Register(validate.Execute);

Now the new import pipeline instance executes both the load and validation pipelines sequentially. That's pretty cool!

Note that if we swap from the original pipeline implementation to the new IFilter implementation, the pipeline itself needs to implement the interface to allow composition, because the Register method now takes an IFilter of T instead of an Action of T:

public class PipeLine<T> : IFilter<T>
{
    private readonly List<IFilter<T>> _filters = new List<IFilter<T>>();

    public PipeLine<T> Register(IFilter<T> filter)
    {
        _filters.Add(filter);
        return this;
    }

    public void Execute(T input)
    {
        _filters.ForEach(f => f.Execute(input));
    }
}

Our composition now looks like this:


import = new PipeLine<PartUploadMessage>();
import.Register(load)
    .Register(validate);

which is even cleaner. Nice.
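
Executing the composed pipeline is then a single call (assuming PartUploadMessage has a parameterless constructor):

var message = new PartUploadMessage();

// Runs every load filter, then every validate filter, in order.
import.Execute(message);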

But what about immutability?

This I'm not so concerned about. C# is not a functional language (despite the addition of LINQ and its functional abilities) and therefore, unlike F#, does not support immutability natively. Attempting it would, I feel, be too expensive and not worth the effort or the extra complexity it would add. The overriding goal in my mind was simply to adopt the idea of an explicit message so that one could easily reason about the path taken through the system, not to try and bend the language to meet some theoretical purity. In C# it's not even a trade-off; it just doesn't make sense.
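
To make that trade-off concrete, here's a rough sketch of what an immutable message might look like (the names are hypothetical). Note that every filter would then have to return a new message rather than mutate its input, which ripples through the IFilter signature too:

public class ImmutableLogInMessage
{
    public ImmutableLogInMessage(string username, string password,
                                 bool stop, IEnumerable<string> errors)
    {
        Username = username;
        Password = password;
        Stop = stop;
        Errors = new ReadOnlyCollection<string>(errors.ToList());
    }

    public string Username { get; private set; }
    public string Password { get; private set; }
    public bool Stop { get; private set; }
    public ReadOnlyCollection<string> Errors { get; private set; }

    // Every "change" allocates a new message instead of mutating this one.
    public ImmutableLogInMessage WithError(string error)
    {
        return new ImmutableLogInMessage(
            Username, Password, true, Errors.Concat(new[] { error }));
    }
}

That's a lot of ceremony for little practical gain here, which is exactly the complexity I'd rather avoid.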

Different Strokes

Clearly there are a number of ways the implementation could have been different. Going through this process of questioning myself, and allowing others to question me, has confirmed that there are of course many ways to skin a cat. As I alluded to early on, I don't think OOP is sufficient on its own; you only have to look at the codebases produced every day by enterprise teams all over the world. The trouble is, despite me saying that, I still struggle to break free from the object-oriented shackles myself because they've been rammed into my head for the last twenty years. It's a hard habit to break, but one thing that gives me inspiration is this talk by Bret Victor on the future of programming, which is given as if the year is 1973 and looks ahead to what programming might look like in 40 years' time, i.e. today! At the end he talks about being open to new ways of thinking, and that's my goal here. I'm trying to explore other ways of tackling the problems I face at work every day, not just churning out the same old service layer, repository, DTO approach that frankly I think just fails miserably.

Object orientation isn't bad per se, but I think it becomes cumbersome when used at a "macro" level. In other words, I think projects, applications, codebases, whatever you want to call them, are too large. If we broke them down into smaller, more focused components we'd have a better chance of building systems that last and that are easily maintainable and extendable. Smaller components, though, have to communicate, and that's where messaging comes in again. I'm very interested in the micro-services architecture as I think it has a lot of benefits. If you're interested, there's a great talk by Fred George worth watching on this very topic.

Messaging is a different paradigm to object orientation in a number of ways, so you do have to think a little differently. I have to remind myself that an application written this way is not an object-oriented program in the traditional sense. Just as we know that design patterns have different forces and consequences within OOP, I think the same applies across programming paradigms. Sticking to the object-oriented paradigm seems so natural that you might never consider anything different, and yet I think messaging is different because it more closely adheres to the "tell, don't ask" principle, i.e. one-way communication. The cornerstones of object orientation are still useful of course, but messaging adds something extra with different trade-offs and benefits.

Flow Based Programming

It appears that there is a movement underway to bring something called Flow Based Programming (FBP) to the masses through a Kickstarter project called NoFlo, which looks very exciting. Until this last week I had never heard of either FBP or NoFlo, but at its core the key concept appears to be message passing between asynchronous components. There is apparently a major Canadian bank that has had an FBP system in production for the best part of 40 years, which truly goes to show that there are no new ideas in software. Clearly there is value in adopting a messaging approach, and I think it's another technique worth adding to your toolbox.

Finally...(again!)

Thanks to everyone who got in touch for all the feedback, especially to Ralf for the encouragement by email; he has since written a number of posts in response to these articles, which I link to here:

Messaging as a programming model - let's get real.
Flows – Visualizing the Messaging Programming Model
Messaging for More Decoupling