Nathan Evans' Nemesis of the Moment

Xamarin/MvvmCross + Autofac

Posted in .NET Framework, Software Design, Xamarin by Nathan B. Evans on February 17, 2014

Recently I’ve been doing some Xamarin development and, naturally, once a mobile app project reaches a certain size you need to refactor it away from a hacky prototype towards a sustainable design that ticks all the SOLID boxes.

I researched a handful of various approaches to adopting the MVVM pattern, which included:

  • MvvmCross (OSS, also known as “Mvx”)
  • ReactiveUI (OSS, built on top of the excellent Reactive Extensions / RX library)
  • Crosslight (commercial)

At the time, the project didn’t warrant spending $999 on Crosslight (and I can only assume it must be very good at that price), so I veered towards an OSS solution. ReactiveUI is by far the more elegantly designed of the two. However, its mobile platform support is relatively new and it focuses squarely on solving only the MVVM problem. MvvmCross is more of a framework that helps you in a number of different ways concerning mobile app development, including providing a dependency injection / IoC layer and numerous platform-specific extensions for Android, iOS, etc. MvvmCross is, however, in my opinion a little “rough around the edges” and has most definitely been put together more like a Panzer tank than a Lotus Elise.

Ultimately I adopted MvvmCross, though, as it has the greatest momentum in the Xamarin ecosystem, and I feel that is important.

One big issue with MvvmCross is that it takes on the responsibility of providing a really rather crap IoC container implementation. I’m not sure why it doesn’t just depend upon Autofac or TinyIoC or something like that, as it seems like 30% of the code in the MvvmCross codebase could be stripped out if it farmed that responsibility out to another OSS project. Everywhere you look in the MvvmCross codebase there are hand-rolled “factory”, “registry” and, uh, “singleton” components. Maybe Autofac has spoilt me over the years, but I honestly can’t remember the last time I had to hand roll such boilerplate components.

So I set about solving this problem by writing an Autofac adapter for MvvmCross. It turned out to be a lot simpler than I first thought, once I had worked through the various nuances of MvvmCross.

AutofacMvxIocProvider.cs

I chose to place this type in a separate assembly intended for all my “Autofac for MvvmCross” related extensions. It is a PCL assembly, since Autofac is fully PCL compatible, even with Xamarin.

using System;
using System.Linq;
using Autofac;
using Autofac.Core;
using Cirrious.CrossCore.Core; // MvxSingleton (MvvmCross v3 "Cirrious" namespaces)
using Cirrious.CrossCore.IoC;  // IMvxIoCProvider

public class AutofacMvxIocProvider : MvxSingleton<IMvxIoCProvider>, IMvxIoCProvider {
    private readonly IContainer _container;

    public AutofacMvxIocProvider(IContainer container) {
        if (container == null) throw new ArgumentNullException("container");
        _container = container;
    }

    public bool CanResolve<T>() where T : class {
        return _container.IsRegistered<T>();
    }

    public bool CanResolve(Type type) {
        return _container.IsRegistered(type);
    }

    public T Resolve<T>() where T : class {
        return (T)Resolve(typeof(T));
    }

    public object Resolve(Type type) {
        return _container.Resolve(type);
    }

    // MvvmCross distinguishes Create/GetSingleton from Resolve, but with Autofac
    // the registration's lifetime already governs this, so they all map to Resolve.
    public T Create<T>() where T : class {
        return Resolve<T>();
    }

    public object Create(Type type) {
        return Resolve(type);
    }

    public T GetSingleton<T>() where T : class {
        return Resolve<T>();
    }

    public object GetSingleton(Type type) {
        return Resolve(type);
    }

    public bool TryResolve<T>(out T resolved) where T : class {
        return _container.TryResolve(out resolved);
    }

    public bool TryResolve(Type type, out object resolved) {
        return _container.TryResolve(type, out resolved);
    }

    // Registrations made after the container is built are appended to it
    // via a fresh ContainerBuilder and ContainerBuilder.Update().
    public void RegisterType<TFrom, TTo>()
        where TFrom : class
        where TTo : class, TFrom {

        var cb = new ContainerBuilder();
        cb.RegisterType<TTo>().As<TFrom>().AsSelf();
        cb.Update(_container);
    }

    public void RegisterType(Type tFrom, Type tTo) {
        var cb = new ContainerBuilder();
        cb.RegisterType(tTo).As(tFrom).AsSelf();
        cb.Update(_container);
    }

    public void RegisterSingleton<TInterface>(TInterface theObject) where TInterface : class {
        var cb = new ContainerBuilder();
        cb.RegisterInstance(theObject).As<TInterface>().AsSelf().SingleInstance();
        cb.Update(_container);
    }

    public void RegisterSingleton(Type tInterface, object theObject) {
        var cb = new ContainerBuilder();
        cb.RegisterInstance(theObject).As(tInterface).AsSelf().SingleInstance();
        cb.Update(_container);
    }

    public void RegisterSingleton<TInterface>(Func<TInterface> theConstructor) where TInterface : class {
        var cb = new ContainerBuilder();
        cb.Register(cc => theConstructor()).As<TInterface>().AsSelf().SingleInstance();
        cb.Update(_container);
    }

    public void RegisterSingleton(Type tInterface, Func<object> theConstructor) {
        var cb = new ContainerBuilder();
        cb.Register(cc => theConstructor()).As(tInterface).AsSelf().SingleInstance();
        cb.Update(_container);
    }

    public T IoCConstruct<T>() where T : class {
        return (T)IoCConstruct(typeof(T));
    }

    public object IoCConstruct(Type type) {
        return Resolve(type);
    }

    public void CallbackWhenRegistered<T>(Action action) {
        CallbackWhenRegistered(typeof(T), action);
    }

    public void CallbackWhenRegistered(Type type, Action action) {
        // Bridge MvvmCross's registration callback onto Autofac's Registered event.
        _container.ComponentRegistry.Registered += (sender, args) => {
            if (args.ComponentRegistration.Services.OfType<TypedService>().Any(x => x.ServiceType == type)) {
                action();
            }
        };
    }
}

Setup.cs

I’m just showing my MvxAndroidSetup implementation here; your iOS, Windows Phone, etc. implementations would look basically the same.

public class Setup : MvxAndroidSetup {
    private static Assembly CoreAssembly { get { return typeof(App).Assembly; } }

    public Setup(Context applicationContext) : base(applicationContext) { }

    protected override IMvxApplication CreateApp() {
        return new App();
    }

    protected override IMvxIoCProvider CreateIocProvider() {
        var cb = new ContainerBuilder();

        // I like to structure my app using Autofac modules.
        // It keeps everything very DRY and SRP compliant.
        // Ideally, these Autofac modules would be held in a separate PCL so they can be used
        // by Android / iOS / WP platforms without violating DRY.
        cb.RegisterModule<InfrastructureModule>();
        cb.RegisterModule<FeaturesModule>();

        // This is an important step that ensures all the ViewModels are loaded into the container.
        // Without this, it was observed that MvvmCross wouldn't register them by itself; needs more investigation.
        cb.RegisterAssemblyTypes(CoreAssembly)
            .AssignableTo<MvxViewModel>()
            .As<IMvxViewModel, MvxViewModel>()
            .AsSelf();

        return new AutofacMvxIocProvider(cb.Build());
    }
}
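
With the provider registered, MvvmCross constructs ViewModels through Autofac, so they can take constructor-injected dependencies directly. A minimal sketch of the payoff (IMessagingService and its registration are hypothetical):

// Registered in one of the Autofac modules, e.g.:
//   cb.RegisterType<MessagingService>().As<IMessagingService>().SingleInstance();
public interface IMessagingService {
    void Send(string to, string body);
}

// MvvmCross asks the IMvxIoCProvider to construct ViewModels, so Autofac
// now satisfies their constructor dependencies automatically.
public class ComposeViewModel : MvxViewModel {
    private readonly IMessagingService _messaging;

    public ComposeViewModel(IMessagingService messaging) {
        if (messaging == null) throw new ArgumentNullException("messaging");
        _messaging = messaging;
    }

    public void Send(string to, string body) {
        _messaging.Send(to, body);
    }
}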

Conclusion

This enables me to use Autofac unhindered from my Xamarin mobile apps. It allows the codebase to remain consistent by using only one IoC container, which helps minimise complexity, encourages more DRY code and, in the future, will lower the barrier to getting more developers up to speed with the whole codebase. Autofac is by far the best IoC container available for .NET, and having it available for use with Xamarin, coupled with MvvmCross, is a major productivity improvement for me.

IoC containers: where you define the seams of applications

Posted in .NET Framework, Software Design by Nathan B. Evans on April 10, 2013

A few colleagues asked me to do a quick write-up about the proper use of an IoC container, particularly concerning which types you DO and DON’T register into the container. So here we go:

Things that you do and don’t wire up into an IoC container.

The big ones, the seams of the application

Components that are inherently cross-cutting concerns, and need to be “available everywhere” for possible injection. Things like:

  • Logging, tracing and instrumentation
  • Authentication and authorization
  • Configuration
  • Major application services (this includes things like the Controllers in an MVC web app)

Components that will be modularised as plug-ins / add-ins, things that get loaded dynamically. Consider using MEF as the discovery mechanism of these components.

Services with multiple implementations that can be “dynamically selected” through some means (app.config, differing registrations for DEBUG and RELEASE builds at compile-time, per-tenant configuration, etc.); a sketch follows.
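
A minimal sketch of the compile-time flavour of that selection (the IMailer implementations are hypothetical):

// In a container module: pick the implementation per build configuration.
#if DEBUG
builder.RegisterType<FakeMailer>().As<IMailer>();   // logs instead of sending
#else
builder.RegisterType<SmtpMailer>().As<IMailer>();
#endif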

The little ones, the stylistic ones and where you “lean” on the power of your container to provide infrastructure services or as a development aid

Components that require lifetime scoping or management (transactions, sessions, units of work) and other IDisposable-like things that are longer lived than just a one-off use.
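
For example, a unit-of-work might be scoped like this in Autofac (names hypothetical; the BeginLifetimeScope call belongs at some natural boundary in your root or framework integration, not sprinkled through the codebase):

// Registration, in a container module:
builder.RegisterType<NhUnitOfWork>()
       .As<IUnitOfWork>()
       .InstancePerLifetimeScope();

// Consumption, at a request/message/transaction boundary:
using (var scope = container.BeginLifetimeScope()) {
    var uow = scope.Resolve<IUnitOfWork>();
    // ... do work ...
} // the scope disposes the unit-of-work (and anything else IDisposable it owns)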

Components that are single instance. Never write “static” components.

Components that require testing / mocking out, etc. Note: I consider this to be a “development aid” and not at all mandatory.

When you want an “automatic factory” (Autofac isn’t called that for no reason!). A simple inline Func<ISomeService> expression is cleaner than going down the stereotypical Java “Enterprise” route of hand rolling a SomeServiceFactory class each time. Though that’s more a result of the sad fact that Java still doesn’t have lambdas.
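
Autofac supports the Func<T> relationship out of the box with no extra registration; a sketch (IReportJob is hypothetical):

public class ReportScheduler {
    private readonly Func<IReportJob> _newJob; // the "automatic factory"

    public ReportScheduler(Func<IReportJob> newJob) {
        _newJob = newJob;
    }

    public void RunNightly() {
        var job = _newJob(); // Autofac creates an instance per call,
                             // honouring IReportJob's registered lifetime
        job.Execute();
    }
}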

And now the things that you leave out of the container

Anything that is never, and never has any need to be, referenced outside of the module it is within.

Implementation details of a module. Your container registrations should be the facade that hides the complexities of how that module works.

Things that are essentially just DTOs, entities, POCOs, other dumb types, etc.

Little utility, helper functions.

Note: I refer to “module” a few times. This is in no way a direct reference to an assembly or package. It’s more a reference to a namespace, because components typically reside within a relatively self-contained namespace with a container registration module.

Cardinal rules

Never ever call a “static” Kernel.Get / Resolve (or whatever equivalent your container might expose) anywhere. This is not dependency injection; it is service location, which is a different pattern entirely. Autofac is quite neat in that it’s one of the very few containers that does not, out of the box, provide any sort of “static” resolution/service location function. And that is good.

Only call Get / Resolve methods in your bootstrap code at the root of your object graph. And even then, there should be fewer than a handful of such calls. If you can get it down to just one, then you’ve done well and you probably have an object graph that is very well expressed.

Always keep the object graph in the back of your mind. It’s a shame, in my opinion, that containers tend to keep this information hidden away in their internals. The only time you get a glimpse of it is in the exception message for when you’ve inadvertently introduced a circular dependency. Things could be so much better than this.

If you have a component that requires injection of more than about five dependencies, it should start coming onto your “radar of suspiciousness”. If it reaches eight or nine dependencies you should almost certainly consider refactoring it and, probably, the wider namespace or module as a whole. I often see this happen on Controllers in MVC applications; the so-called “fat controller” problem. Thankfully, because the dependencies are already well expressed (it’s just that there are too many of them), refactoring such problem areas of the codebase is normally a relatively straightforward task.

Nothing except your bootstrapper and container modules should reference the container, i.e. its namespaces. Arguably, your bootstrapper and container modules can live in a totally separate assembly by themselves, so that only that assembly holds references to your container’s assemblies. If you’re seeing namespace imports for your container all over your projects then something is very badly wrong.

Avoid the use of “service locator injections”, such as IComponentContext in Autofac. This is one of the very few ways that Autofac lets you shoot yourself in the foot. It’s not quite as bad as a “static” Kernel.Get style service locator, but it’s still pretty damn bad, as it implies you don’t actually know what possible dependencies your component has, which should be impossible. To avoid this, express your dependencies better. If there are multiple instances you wish to dynamically “select” from at runtime, you can roll your own resolution delegate and lean on your container to implement it. Autofac makes this very easy using its IIndex relationship. For example:

public delegate IMyService MyServiceResolver(string name);

// ... this stuff below goes in your container module ...
Func<IComponentContext, MyServiceResolver> tmp = c => {
    var indexedServices = c.Resolve<IIndex<string, IMyService>>();
    return name => indexedServices[name];
};

builder.Register(tmp)
       .As<MyServiceResolver>();

builder.Register(c => new MyService())
       .As<IMyService>()
       .Keyed<IMyService>("Fred");
       // The "keyed on" value is a string in this example.
       // But, usefully, it can be any object including value types such as an enum.

// ... any time I want to resolve a IMyService, I can just do this in a constructor:
class SomeOtherComponent {
    private readonly IMyService myService;
    public SomeOtherComponent(MyServiceResolver myServiceResolver) {
        if (myServiceResolver == null)
            throw new ArgumentNullException("myServiceResolver");
        
        this.myService = myServiceResolver("Fred");
        // Technically this is a form of service location.
        // However, because we have constrained the number of services that
        // can be resolved to a particular *type*; then this does not
        // introduce any bad practices to the codebase.
        //
        // Most importantly, we are not relying on any "static" magic.
        // (Which is the absolute hallmark of truly bad service location.)
        // Nor are we holding any references to the container.
    }
}

Example of a Bootstrapper, Container Module and general structure of your Program Root

This is a little snippet of a relatively well-structured IoC server application. I’ve added some relevant comments to it.

public class Program {
    private static IContainer Container { get; set; }
    private static ILog Log { get; set; }
    private static ProgramOptions Options { get; set; }
    private static Lazy<HostEntryPoint> Host { get; set; }

    public static void Main(string[] args) {
        try {
            Options = new ProgramOptions();
            if (!Parser.Default.ParseArguments(args, Options))
                return;

            // Root of the program.
            // Bootstraps the container then resolves two components.
            // One for logging services in the root (this) and the other
            // is the *actual* entry point of the application.
            Container = new Bootstrapper().CreateContainer();
            Log = Container.Resolve<ILog>(TypedParameter.From("boot"));
            Host = Container.Resolve<Lazy<HostEntryPoint>>();

            if (Options.RunAsService)
                RunAsService();
            else
                RunAsConsole();

        } catch (Exception x) {
            // Arguably one of the very few places catching a plain
            // Exception can make sense: at the root of the program.
            // (Note: if CreateContainer() itself throws, Log will still be
            // null here, so a real program should guard against that.)
            Log.FatalException("Unexpected error occurred whilst starting.", x);
            Environment.Exit(100);
        }

        Log.Info("Exiting...");
        Container.Dispose();
        Thread.Sleep(500);
    }
    
    // ... cut for brevity ...
}



internal class Bootstrapper {
    public virtual IEnumerable<IModule> GetModules() {
        yield return new LoggingModule {
            Console = LoggingMode.TraceOrAbove,
            File = LoggingMode.WarningOrAbove,
            Debug = LoggingMode.Off,
            RegisterNetFxTraceListener = true
            // Container Modules are an excellent place to pass in
            // certain configuration/runtime parameters and options.
            // I prefer to "hard code" things like this until there
            // is a *real* need to expose such things to a config file,
            // and hence the user of the application.
        };

        // These modules can be specified in any order.
        // Container will resolve the object graph at
        // build-time not at registration-time.

        yield return new QueuesModule() { ConcurrentReceivers = 4 };
        yield return new DispatchersModule();
        yield return new HciCommandsModule();
        yield return new MefModulesModule();
        yield return new AzureDataModule() { ConnectionString = "<goes here>" };
    }
    
    public virtual void RegisterCore(ContainerBuilder builder) {
        builder.Register(c =>
                         new HostEntryPoint(
                             c.Resolve<MefModuleLogging>(),
                             c.Resolve<IEnumerable<IQueueAgent>>(),
                             c.Resolve<IEnumerable<IDispatcher>>()))
               .SingleInstance();
               
        // As well as typically only ever using constructor injection...
        // I prefer to explicitly define the dependency resolutions here, each time.
        // That is, in my opinion, half the point of IoC. You're doing it to keep
        // very close tabs on your dependency graphs. So it should certainly not
        // be the norm that you let the container resolve them through its automagicness.
        // An exception to this rule is dynamically loaded modules (such as MEF assemblies)
        // where you cannot possibly know, at compile-time, what dependencies are required.
    }

    public IContainer CreateContainer() {
        var builder = new ContainerBuilder();

        foreach (var module in GetModules())
            builder.RegisterModule(module);

        RegisterCore(builder);

        return builder.Build();
    }
}
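
For completeness, here is a minimal sketch of what one of those container modules might look like; the QueueAgent type is hypothetical and the real QueuesModule is not shown in this post:

internal class QueuesModule : Module {
    public int ConcurrentReceivers { get; set; }

    protected override void Load(ContainerBuilder builder) {
        // Module properties are a tidy way to push configuration into registrations.
        builder.RegisterType<QueueAgent>()
               .As<IQueueAgent>()
               .WithParameter(TypedParameter.From(ConcurrentReceivers))
               .SingleInstance();
    }
}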

I’m open to feedback and discussion :)

Three gotchas with the Azure Service Bus

Posted in .NET Framework, Distributed Systems, Software Design by Nathan B. Evans on March 28, 2013

I’ve been writing some fresh code using Azure Service Bus Queues in recent weeks. Overall I’m very impressed. The platform is good and stable, and the Client APIs (at least in the form of Microsoft.ServiceBus.dll that I’ve used) are quite modern in design and layout. It’s only slightly annoying that they use the old-fashioned Begin/End async pattern that was perhaps more in vogue back in the .NET 1.0 to 2.0 days. Why not just return TPL Tasks?

However, there have been a few larger gotchas that I’ve discovered which can quite easily turn into non-trivial problems for a developer to safely work around. These are the sort of problems that can inherently change the way your application is designed.

Deferring messages via Defer()

I’m of the opinion that a Service Bus should take care of message redelivery mechanisms itself. For the most part, Azure Service Bus does this really well. But it supports a slightly bizarre type of return notification called deferral (invoked via a Defer() or BeginDefer() method). This basically sets a flag on the message internally so that it will never be implicitly redelivered by the queue to your application. The message will fundamentally still exist inside the queue, and you can even still Receive() it by asking for it by its SequenceNumber explicitly. That’s all good and everything, but it leaves your application with a bigger problem: where does it durably store those SequenceNumbers so that it knows which messages it has deferred? Sure, you could hold them in-memory. That would be the naive approach, and it seems to be the approach taken by the majority of Azure books and documentation. But it is, frankly, a ridiculous idea, and it’s insulting that authors in the fault-tolerant distributed systems space can even suggest such rubbish. The second problem is, of course, what sort of retry strategy you adopt for that queue of deferred SequenceNumbers. Then you have to think about the transaction costs (i.e. money!) of whatever retry strategy you employ. What if your system has deferred hundreds of thousands, or millions, of messages? Consider that those deferred messages were outbound e-mails and they were being deferred because your mail server is down for 2 hours. If you were to retry those messages every 5 seconds, that is a lot of Service Bus transactions that you’ll get billed for.
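
To make the mechanics concrete, here is a minimal sketch of deferral with the Microsoft.ServiceBus.Messaging API (queue name illustrative, error handling omitted):

var client = QueueClient.CreateFromConnectionString(connectionString, "email-outbox");

BrokeredMessage msg = client.Receive();
long seq = msg.SequenceNumber;  // the only handle you get back
msg.Defer();                    // stays in the queue, but is never implicitly redelivered

// ... 'seq' must now be stored durably somewhere, which is exactly the problem ...

BrokeredMessage deferred = client.Receive(seq); // explicit retrieval by sequence number
deferred.Complete();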

One wonders why the Defer() method doesn’t support some sort of time duration or absolute point in time as a parameter that could indicate to the Service Bus when you actually want that message to be redelivered. It would certainly be a great help and I can’t imagine it would require that much work in the back-end for the Azure guys.

So how do you actually solve this problem?

For now, I have completely avoided the use of Defer() in my system. When I need to defer a message I simply do not perform any return notification for the message and allow the PeekLock to expire of its own accord (which the Service Bus handles itself). This approach has the following application design side effects:

  • The deferral and retry logic is performed by the Service Bus entirely. My application does not need to worry about such things and the complexities involved.
  • The deferral retry time is constant and is defined at queue description level. It cannot be controlled dynamically on a per message basis.
  • Your queue’s MaxDeliveryCount, LockDuration and DefaultTimeToLive parameters will become inherently coupled and will need to be explicitly controlled; a sketch of such a queue description follows this list.
    (MaxDeliveryCount x LockDuration) will determine how long a message can be retried for and at what interval. If your LockDuration is 1.5 minutes and you want to retry the message for 1 day then MaxDeliveryCount = (1 day / 1.5 minutes) = 960.
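
A sketch of creating such a queue with those coupled parameters (the queue name and figures are the illustrative ones from above):

var ns = NamespaceManager.CreateFromConnectionString(connectionString);

var qd = new QueueDescription("email-outbox") {
    LockDuration = TimeSpan.FromMinutes(1.5),   // the effective retry interval
    MaxDeliveryCount = 960,                     // ~1 day of retries at 1.5 minute intervals
    DefaultMessageTimeToLive = TimeSpan.FromDays(1)
};

if (!ns.QueueExists(qd.Path))
    ns.CreateQueue(qd);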

This is a good stop-gap measure whilst I am iterating quickly. For small systems it can perhaps even be a permanent solution. But sooner or later it will cause problems for me and will need to be refactored.

I think the key to solving this problem is gaining a better understanding of the reason why the message is being deferred in the first place, thereby giving you more control. In my particular application it can only be caused when, for instance, an e-mail server is down or unreachable. So maybe I need some sort of watchdog in my application that (eventually) detects when the e-mail server is down and then actively stops trying to send messages, and maybe even puts the brakes on Receive()’ing messages from the queue in the first place.

For those messages that have been received already, there could be a separate queue called something like “email-outbox-deferred” (note the suffix). Messages queued on this would not be the real messages but simply pointer records that point back to the SequenceNumber of the real ones on the “email-outbox” queue.

When the watchdog detects that the e-mail server has come back up, it can start opening up the taps again. First it would perform a Receive() loop on the “email-outbox-deferred” queue and attempt to reprocess those messages by following the SequenceNumber pointers back to the real queue. If it manages to successfully send the e-mail then it can issue a Complete() on both the deferred pointer message and the real message, removing them entirely from the job queue. Otherwise it can Abandon() them both, and the watchdog starts from square one by waiting to regain confidence in the e-mail server’s health before retrying again.
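
A rough sketch of the drain half of that design (IMailSender, EmailMessage and the pared-down error handling are assumptions for illustration):

// Runs only once the watchdog has confidence the e-mail server is healthy again.
void DrainDeferredOutbox(QueueClient outbox, QueueClient deferredOutbox, IMailSender mailer) {
    BrokeredMessage pointer;
    while ((pointer = deferredOutbox.Receive(TimeSpan.FromSeconds(5))) != null) {
        // Follow the pointer back to the real, deferred message on "email-outbox".
        BrokeredMessage real = outbox.Receive(pointer.GetBody<long>());
        try {
            mailer.Send(real.GetBody<EmailMessage>());
            real.Complete();     // remove the real message from the job queue...
            pointer.Complete();  // ...and its pointer
        } catch (SmtpException) {
            real.Abandon();
            pointer.Abandon();
            break; // square one: let the watchdog regain confidence before retrying
        }
    }
}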

The key to this approach is the watchdog. The watchdog must act as a type of valve that can open and close the Receive() loops on the two queues. Without this component you are liable to create long unconstrained loops or even infinite-like loops that will cause you to have potentially massive Service Bus transaction costs on your next bill from Azure.

I believe what I have described here is considered to be a SEDA or “staged event-driven architecture”. Documentation of this pattern is a bit thin on the ground at the moment. Hopefully this will start to change as enterprise-level PaaS cloud applications gain more and more traction. But if anyone has any good book recommendations… ping me a message.

I’d be interested in learning more about message deferral and retry strategies, so please comment!

Transient fault retry logic is not built into the Client API

Transient faults are those that imply there is probably nothing inherently wrong with your message. It’s just that the Service Bus is perhaps too busy or network conditions dictate that it can’t be handled at this time. Fortunately the Client API includes a nice IsTransient boolean property on every MessagingException. Making good use of this property is harder than it first appears though.

All the Azure documentation that I’ve found makes use of (the rather hideous) Enterprise Library Transient Fault Block pattern. That’s all fine and good. But who honestly wants to be wrapping up every Client API action they do in that? Sure you can abstract it away again by yourself but where does it end?

It seems odd that the Client API doesn’t have this built in. Why when you invoke some operation like Receive() can’t you specify a transient fault retry strategy as an optional parameter? Or hell, why can’t you just specify this retry strategy at a QueueClient level?
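
In the meantime, here is a minimal sketch of the sort of helper you end up hand rolling around IsTransient (the back-off strategy is an arbitrary assumption):

static T WithRetry<T>(Func<T> action, int maxAttempts = 5) {
    for (var attempt = 1; ; attempt++) {
        try {
            return action();
        } catch (MessagingException x) {
            if (!x.IsTransient || attempt == maxAttempts) throw;
            Thread.Sleep(TimeSpan.FromSeconds(Math.Pow(2, attempt))); // crude exponential back-off
        }
    }
}

// usage:
var msg = WithRetry(() => client.Receive());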

I remain hopeful that this is something the Azure guys will fix soon.

Dead lettering is not the end

You may think that once you’ve dead lettered a message that you’ll not need to worry about it again from your application. Wrong.

When you dead letter a message it is actually just moved to a special sub-queue of your queue. If left untouched, it will remain in that sub-queue forever. Forever. Yes, forever. Yes, a memory leak. Eventually this will bring down your application, because your queue will run into its size limit (which can be a maximum of only 5GB). Annoyingly, most developers are simply not aware of the dead letter sub-queue’s existence because it does not show up as a queue in the Server Explorer pane in Visual Studio. Bit of an oversight, that one!

Having a human flush this queue out every now and then is not an acceptable solution for most systems. What if your system has a sudden spike in dead letters? Maybe a rogue system was submitting messages to your queues using an old serialization format, or something? What if there were millions of these messages? Your application is going to be offline quicker than any human can react. So you need to build this logic into your application itself. This can be done by a watchdog process that keeps track of how many messages are being put onto the dead letter queue and actively ensures it is frequently pruned. This is very much a non-trivial problem.
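
For what it’s worth, draining the sub-queue is straightforward once you know it exists; a sketch (the logging and the pruning policy are assumptions):

// The dead letter sub-queue is just another queue, with a derived path:
var dlqPath = QueueClient.FormatDeadLetterPath("email-outbox");
var dlqClient = QueueClient.CreateFromConnectionString(connectionString, dlqPath);

BrokeredMessage dead;
while ((dead = dlqClient.Receive(TimeSpan.FromSeconds(5))) != null) {
    log.Warn("Dead lettered: " + dead.Properties["DeadLetterReason"]);
    dead.Complete(); // prune it, so the queue never hits its size limit
}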

Alternatively you can avoid the use of dead lettering entirely. This seems drastic, but it may not actually be such a bad idea. You should consider whether you actually care enough about retaining that carbon-copy of a message to keep it around as-is. Ask yourself whether some simple, traditional trace/log output of the problem and the approximate message content would be sufficient. Dead lettering is inherently a human concept, analogous to “lost and found” or a “spam folder”. So perhaps for fully automated systems that desire as little human influence or administrative effort as possible, avoiding dead lettering entirely is the best choice.


WebSocket servers on Windows Server

Posted in .NET Framework, Software Design, Windows Environment by Nathan B. Evans on December 20, 2011

This is a slight continuation of the previous WebSockets versus REST… fight! post.

Buoyed with enthusiasm for WebSockets, I set about implementing a simple test harness of a WebSockets server in C#.NET using System.Net.HttpListener. Unfortunately, things did not go well. It turns out that HttpListener (and indeed the underlying HTTP Server API, a.k.a. http.sys) cannot be used at all to develop a WebSockets server on current versions of Windows. http.sys is simply too strict in its policing of what it believes to be correct HTTP protocol.

In an IETF discussion thread, a Microsoft fellow called Stefan Schackow was quoted as saying the following:

The current technical issue for our stack is that the low-level Windows HTTP driver that handles incoming HTTP request (http.sys) does not recognize the current websockets format as having a valid entity body. As you have noted, the lack of a content length header means that http.sys does not make the nonce bytes in the entity available to any upstack callers. That’s part of the work we will need to do to build websockets support into http.sys. Basically we need to tweak http.sys to recognize what is really a non-HTTP request, as an HTTP request.

Implementation-wise this boils down to how strictly a server-side HTTP listener interprets incoming requests as HTTP. For example a server stack that instead treats port 80 as a TCP/IP socket as opposed to an HTTP endpoint can readily do whatever it wants with the websockets initiation request.

For our server-side HTTP stack we do plan to make the necessary changes to support websockets since we want IIS and ASP.NET to handle websockets workloads in the future. We have folks keeping an eye on the websockets spec as it progresses and we do plan to make whatever changes are necessary in the future.

This is a damn shame. As it stands right now, Server 2008/R2 boxes cannot host WebSockets. At least, not whilst sharing ports 80 and 443 with the IIS web server. Sure, you could always write your WebSocket server to bind to those ports with a raw TCP socket, rewrite a ton of boilerplate HTTP code that http.sys can already handle, and then put up with the fact that you can’t share the port with IIS on the same box. But this is something that most people, me included, do not want to do.

Obviously it isn’t really anyone’s fault because back in the development time frame of Windows 7 and Server 2008/R2 (between 2006 to 2009) they could not have foreseen the WebSockets standard and the impact it might have on the design of APIs for HTTP servers.

The good thing is that Windows 8 Developer Preview seems to have this covered. According to the MSDN documentation, the HTTP Server API’s HttpSendHttpResponse function supports a new special flag called HTTP_SEND_RESPONSE_FLAG_OPAQUE that seems to suggest it will put the HTTP session into a sort of “dumb mode” whereby you can pass-thru pretty much whatever you want and http.sys won’t interfere:

HTTP_SEND_RESPONSE_FLAG_OPAQUE

Specifies that the request/response is not HTTP compliant and all subsequent bytes should be treated as entity-body. Applications specify this flag when it is accepting a Web Socket upgrade request and informing HTTP.sys to treat the connection data as opaque data.

This flag is only allowed when the StatusCode member of pHttpResponse is 101, switching protocols. HttpSendHttpResponse returns ERROR_INVALID_PARAMETER for all other HTTP response types if this flag is used.

Windows Developer Preview and later: This flag is supported.

Aside from the new System.Net.WebSockets namespace in .NET 4.5, there are also clear indications of this behaviour being exposed in the HttpListener of .NET 4.5, through a new HttpListenerContext.AcceptWebSocketAsync() method. The preliminary documentation seems to suggest that this method will support Server 2008 R2 and Windows 7, but this is almost certainly a misprint because I have inspected these areas of the .NET 4.5 libraries using Reflector and it is very clear that this is not the case:

The HttpListenerContext.AcceptWebSocketAsync() method directly calls into System.Net.WebSockets.WebSocketHelper (a static class) which has a corresponding AcceptWebSocketAsync() method of its own. This method then calls a sanity-check method tellingly named EnsureHttpSysSupportsWebSockets(), which evaluates an expression containing the words “ComNetOS.IsWin8orLater”. I need say no more.
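
For reference, this is roughly how the new API is consumed (a sketch; per the above, it will only run on Windows 8 / Server 2012 or later):

var listener = new HttpListener();
listener.Prefixes.Add("http://localhost:8080/ws/");
listener.Start();

// inside an async method:
while (true) {
    HttpListenerContext ctx = await listener.GetContextAsync();
    if (!ctx.Request.IsWebSocketRequest) {
        ctx.Response.StatusCode = 400;
        ctx.Response.Close();
        continue;
    }

    HttpListenerWebSocketContext wsCtx = await ctx.AcceptWebSocketAsync(subProtocol: null);
    WebSocket ws = wsCtx.WebSocket;
    // ... ws.ReceiveAsync / ws.SendAsync from here on ...
}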

It seems clear now that Microsoft has chosen not to back-port this minor HTTP Server API improvement to Server 2008 R2 / Windows 7. So now we must all hope that Windows Server 8 runs a beta programme in tandem with the Windows 8 client, and that they launch within a month of each other. Otherwise Metro app development possibilities are going to be severely limited whilst we all wait for the Windows Server product that can host our WebSocket server applications! Even so, it’s a shame that Server 2008 R2 won’t ever be able to host WebSockets.

It will be interesting to see whether Willy Tarreau (of HAProxy fame) comes up with some enhancements in his project that might benefit those determined enough to still want to host (albeit raw TCP-based) WebSockets on Server 2008 R2.


WebSockets versus REST… fight!

Posted in .NET Framework, Software Design by Nathan B. Evans on December 16, 2011

Where will WebSockets be on this InfoQ chart, in three years’ time?

On 8th December 2011, a little-known (but growing in awareness) standard called “WebSockets” was upgraded to W3C Candidate Recommendation status. That is one small step shy of becoming a fully ratified web standard. And just to remove any remaining doubt: next-gen platforms and frameworks such as Windows 8 and .NET 4.5 (at three levels: System.Net, WCF and ASP.NET) already have deeply embedded support, and they aren’t even in beta yet!

After reading up about the standard in detail and absorbing the various online discussion around it, it is becoming increasingly clear to me that this standard is going to steal a large chunk of mind share from RESTful web services. What I mean is that there will come a stage in product development where somebody will have to ask the question:

Right guys, shall we use WebSockets or REST for this project?

I expect that WebSockets will, within a year or two, begin stunting the growth of RESTful web services – at least as we know them today.

What are WebSockets?

They are an overdue and rather elegant protocol extension for HTTP 1.1 that allows what is fundamentally a bi-directional TCP data stream to be tunnelled over a HTTP session. They provide a built-in implementation of TCP message framing, so developers don’t need to worry about any boilerplate code stuff like that when designing their application protocol.

Why are WebSockets a threat to RESTful web services?

From the last few years of working on projects that expose RESTful web services, I have noticed a few shortcomings. I should probably make clear that I’m not claiming WebSockets answer all of those shortcomings. I’m merely suggesting that REST is not the silver-bullet solution it is often hyped up to be, and that there is definitely space for another player that can still operate at “web scale”. WebSockets have more scope to be a black-box or quick’n’dirty solution than REST, which requires a more design-up-front approach due to versioning and public visibility concerns. Always use the correct tool for the job, as they say.

Sub-par frameworks

They might claim REST support but they still haven’t truly “grokked” it yet, in my opinion. WCF REST is a good example of this. Admittedly, the WCF Web API for .NET is starting to get close to where things should be, but it is not yet production-ready.

Perhaps even more serious is the lack of widespread cross-platform RESTful clients that work in the way that Roy prescribed; of presenting an entry point resource that allows the client to automatically discover and autonomously navigate between further nested resources in a sort of state machine fashion. A single client framework that can operate with hundreds of totally different RESTful web services from different organisations. This does not exist yet, today. This is why so many big providers of RESTful web services end up seeding their own open source projects in various programming languages to provide the essential REST client.

Enterprise loves SOAP (and other RPCs)

Third-parties that want to use your web services often prefer SOAP over REST. Many haven’t even heard of REST! WebSockets are a message-based protocol allowing for SOAP-like RPC protocols that enterprise seem to adore so much. Hell there’s nothing stopping actual SOAP envelopes being transferred over a WebSocket!

This might not be the case if you’re operating in an extremely leading edge market such as perhaps cloud computing where everyone is speaking the same awesomesauce.

Complex domain models

Mapping complex domain models onto REST can be slow and laborious. You’ll find yourself constantly having to work around its architectural constraints. Transactions, for example, can be a particular problem. Of course, this is partly related to the first problem (sub-par frameworks), but one cannot reasonably expect transaction support from a REST framework. What is probably needed is a set of common design patterns for mapping domain models to REST, and then an extension library for the framework that provides reusable implementations of those patterns. But alas, none of this exists yet.

Text-based formats

JSON/XML (for reasons unknown) are commonly used with REST and these are of course text-based formats. This is great for interoperability and cross-platform characteristics. But it is not so great for memory and bandwidth usage. This especially has implications on mobile devices.

You’ll find yourself running into walls if you try to use something that isn’t JSON or XML, at least that is my experience with current frameworks.

Request-response architecture of HTTP

Fundamentally, REST is nothing more than a re-specification of the way HTTP works and a proposed design pattern for building applications on top of HTTP. This means it retains the same stateless and sessionless characteristics of HTTP. It therefore precludes REST from being bi-directional, where the server could act as the requester of some resource from the client, or the sender of some message to the client. As a result, “hacks” must be used to emulate server-side events, and these hacks have bad characteristics such as high latency (round-trip time) and wasteful battery usage.

Public visibility, versioning concerns

Sometimes having everything publicly visible is not what you want. People start using APIs that you don’t want them to use yet. You have to design everything to the nth degree much more. Have a proper versioning strategy in place. It encourages a more discerning approach to software development, that is for sure. Whilst these are usually good things, they can be a hindrance on early stage “lean agile” projects.

What can WebSockets do that is so amazing?

The fact that there will soon be a second player in this space suggests that there will be rebalancing of use-cases. WebSockets will prove to be disruptive for several reasons:

True bi-directional capability and server-side events, no hacks

Comet, push technology, long-polling, etc. in web apps are slow, inefficient, inelegant and have a higher potential magnitude for unreliability. They often work by requesting a resource from the server, causing the server to block until such a time that an event (or events) need to be transferred back to the client. They can be unreliable because the TCP connection could be torn down by an intermediate router during the time it is waiting for the response. Or worse, a proxy server might deliberately time out the long-running request. As such, many implementations of this hack use some kind of self-timeout mechanism so that perhaps every 60 seconds they reissue the request to the server anyway. This has implications for both bandwidth and battery usage.

The true bi-directional capability offered by WebSockets is a first for any HTTP-borne protocol. It is something that neither SOAP nor REST have. And which Comet/push/long-polling can only emulate, inefficiently. The bi-directional capability is inherently so good that you could tunnel a real-time TCP protocol such as Remote Desktop or VNC over a WebSocket, if you wanted.

Great firewall penetration characteristics

WebSockets can tunnel out of heavily firewalled or proxied environments far easier than many other RPC designs. I’m sure I’m not alone in observing that enterprise environments rarely operate their SOAP services on port 80 or 443.

If you can access the web on port 80 without a proxy, WebSockets will work.

If you can access the web on port 80 with a proxy, WebSockets should work as long as the proxy software isn’t in the 1% that are broken and incompatible.

If you can access the web on port 443 with or without a proxy, WebSockets will work.

I strongly suspect that there will be a whole raft of new Logmein/Remote Desktop and VPN solutions that are built on top of WebSockets, purely because of the great tunnelling characteristics.

Lightweight application protocols and TCP tunnelling

There is the potential for extremely lightweight application protocols, in respect of performance, bandwidth and battery usage. Like REST, the application schema/protocol isn’t defined by the standard; it is left completely wide open. WebSockets can transfer either text strings or binary data. It is clear that the text string support was included to aid in transferring JSON messages to JavaScript engines, which lack the concept of byte arrays; the binary support will be most useful for tunnelling TCP streams or for custom RPC implementations. After a WebSocket session is established, the overhead per message can be as small as just two bytes (!). Compare that to REST, which attaches a huge HTTP header to every single request and response.
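
As a taste of the API shape, a sketch using the then-forthcoming System.Net.WebSockets client in .NET 4.5 (the URL is illustrative):

var ws = new ClientWebSocket();
await ws.ConnectAsync(new Uri("ws://example.com/feed"), CancellationToken.None);

// Binary frame: suited to bespoke lightweight protocols or tunnelled TCP.
var payload = new ArraySegment<byte>(new byte[] { 0x01, 0x02, 0x03 });
await ws.SendAsync(payload, WebSocketMessageType.Binary, true, CancellationToken.None);

// Text frame: what you'd use for JSON destined for a JavaScript engine.
var json = new ArraySegment<byte>(Encoding.UTF8.GetBytes("{\"hello\":\"world\"}"));
await ws.SendAsync(json, WebSocketMessageType.Text, true, CancellationToken.None);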

How will the use-cases of REST change?

I believe that REST will lose a certain degree of its lustre. Project teams will less eagerly adopt it if they can get away with a bare bones WebSocket implementation. REST will probably remain the default choice for projects that need highly visible and cross-platform interoperable web services.

Projects without those requirements will probably opt for WebSockets instead and either run JSON over it, or use a bespoke wire protocol. They will particularly be used by web and mobile applications for their back-end communications i.e. data retrieval and push-events. Windows 8 “Metro” applications will need to use them extensively.

I suppose you could summarise that as:

  • REST will be (and remain) popular for publicly visible interfaces.
  • WebSockets will be popular for private, internal or “limited eyes only” interfaces.

Note: By “public” and “private” I am not referring literally to some form of paid/subscription/membership web service. I am referring to the programming API contract and its level of exposure to eyes outside of your development team or company.

Conclusion

Even though they are competing, the good thing is that REST and WebSockets can actually co-exist with one another. In fact, because they are both built upon HTTP fundamentals they will actually complement each other. A RESTful hypermedia resource could actually refer to a WebSocket as though it were another resource through a ws:// URL. This will pave the way for new RESTful design patterns and framework features. It will allow REST to remedy some of its shortcomings, such as with transaction support; because a WebSocket session could act as the transactional unit of work.

The next year is going to be very interesting on the web.

Building automated two-way applications on top of SMS text messaging

Posted in Software Design by Nathan B. Evans on November 30, 2011

For the past 8 years of my life I have been engrossed in the development of fully automated applications that use two-way SMS text messaging as their communication layer. SMS started life as nothing more than what was basically the “ICMP protocol” of GSM networks. It used to be fairly hidden away in the menus of those early Nokia phones. And even then it was very much akin to sending an “ICMP ping” message to your friend, and then he pinged you back. I guess that’s where modern services like “PingChat” got their name!

SMS is a very simple protocol; there are only three essential things you need to understand:

  1. It is limited to 160 characters per message, if you use the GSM 03.38 7-bit character set.
  2. It is limited to 70 characters per message, if you use the UCS-2 (a.k.a. Unicode, UTF-16) character set.
  3. Multiple messages can be joined together to form a multi-part message by including a special concatenation header, which eats into the payload (its 6 bytes leave roughly 153 GSM or 67 UCS-2 characters per part; see the sketch after this list). Most phones these days refer to this concept in their GUI as “pages”.
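
A sketch of the arithmetic (the caller decides whether the text requires UCS-2):

static int CountSmsParts(string text, bool ucs2) {
    int singleLimit = ucs2 ? 70 : 160;  // capacity of a standalone message
    int perPart     = ucs2 ? 67 : 153;  // capacity once the concatenation header is added
    if (text.Length <= singleLimit) return 1;
    return (text.Length + perPart - 1) / perPart; // ceiling division
}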

Unfortunately the protocol is severely handicapped when it comes to building automated two-way applications, and here’s why:

It does not provide any facility, not even an extension standard or extension point, for performing reply correlation.

What do I mean by “reply correlation”? It is a simple concept. Assume that you send a question to a buddy, and then he responds to you with the answer. One might expect that the message containing the answer carries some sort of ID code, token or cookie (hidden away in its header information, of course) that relates it to the original question message. Unfortunately it does not, and this is the problem: SMS does not include any such ID/token/cookie anywhere. It simply wasn’t included in the original standard, nor in any subsequent revisions or extensions of the standard.

It is not necessarily the creators’ fault, because clearly they couldn’t foresee how ubiquitous SMS would become. But there is evidence that they did recognise and respond to its popularity very quickly in the late 1990s, by publishing new standard extensions that built upon SMS, such as multi-part messages and WAP. So one can only wonder why they didn’t make an extension that would allow replies to be correlated with their original message. Unfortunately, the window of opportunity to get this sorted out closed over a decade ago, so we’re pretty much stuck with it and will have to make do.

This is a big problem for SMS. It makes the process of building two-way, fully automated applications much more difficult. Very, very few companies have actually managed to solve the problem, and those that have tend to be very small or operating in niche markets. I don’t understand this at all, because the possibilities and prospects for building two-way SMS applications are absolutely huge, almost endless.

One of my key responsibilities over the last 8 years has been in devising production-ready solutions that work around this problem and this blog post is going to summarise all of them.

An overview of the solution

The key to solving this problem lies with two fields contained within the header information of every SMS message: the source and destination address. Or what I call the “address pairing”.

By using the address pairing in an intelligent way we can find the right compromise for a particular two-way application. Essentially, whenever the application needs to send a question to a mobile phone number it must ensure that no existing question is already outstanding on the same address pairing.

There are several ways that an application can be designed around this basic concept.

Solution #1: My application only has one source address

The application must be designed to “serialise” the transmission of questions. It can use a mutual exclusion mechanism that will prevent itself sending a further question to the same mobile phone number if an outstanding question is still waiting for a response. It can be expected that some characteristics of a “transaction” or “transactional unit of work” would be adopted in the design of the application to model this mutual exclusion concept.

My past implementations of this pattern were based on a database table with a composite primary key across both the “source” and “destination” columns. The application would try to insert the address pairing into this table and, if successful, it would continue sending the message. But if the insertion were to fail, it would know that a question is already outstanding with that mobile phone number, prompting it to give up and retry later. Or, rather than retrying later on some timer mechanism, you might enqueue it as a job somewhere, so that when a response to the outstanding question is received the application can check the queue for further jobs for that address pairing and dequeue/execute the top job. A sketch of the locking insert follows.
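
A sketch of that locking insert (the table layout and the SQL Server error number are assumptions for illustration):

// The composite primary key on (Source, Destination) acts as the mutex.
bool TryAcquireAddressPairing(SqlConnection conn, string source, string destination) {
    const string sql =
        "INSERT INTO OutstandingQuestion (Source, Destination, SentUtc) " +
        "VALUES (@src, @dst, @now)";
    using (var cmd = new SqlCommand(sql, conn)) {
        cmd.Parameters.AddWithValue("@src", source);
        cmd.Parameters.AddWithValue("@dst", destination);
        cmd.Parameters.AddWithValue("@now", DateTime.UtcNow);
        try {
            cmd.ExecuteNonQuery();
            return true;  // lock acquired: safe to send the question
        } catch (SqlException x) {
            if (x.Number == 2627) return false; // PK violation: question already outstanding
            throw;
        }
    }
}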

There is a caveat with this solution however, and it comes as a side effect of “serialising” the questions one after another. What if it takes days or weeks for the person to respond to the question? The questions that are queued up waiting to acquire a lock on the address pairing are going to get pushed further and further back. They could get pushed back so far that the premise of the question has been entirely voided (e.g. an appointment reminder/confirmation).

The solution to this problem is to introduce a further concept of a “timeout” value. This will ensure that any question sent to the mobile phone can only be outstanding for up to a designated time period. You would probably typically set this to around 24-48 hours, but some questions that contain more time sensitive content may use a lower value of between 1-4 hours.

It is important (though not essential) when implementing the timeout value concept that you use the “Validity Period” field that is available on every outbound SMS message. You should set the validity period to roughly match the timeout value for that question. This will help ensure messaging integrity in the event that, for example, the mobile phone is turned off for a week: when it is turned back on, you don’t want your “expired” questions to be delivered when your back-end application has already timed out the workflow that was running for that question.

Solution #2: My application can have multiple source addresses

The idea is that you would have a relatively large pool of source addresses, perhaps as many as 50 or 100. Your application would, as with Solution #1, maintain some kind of database table or data structure that prevents duplicate address pairings. The application would then have some logic that enables it to “select” a free source address i.e. a source address that is not “in use” for the destination mobile phone number.

It would still be advisable to implement some kind of “timeout” mechanism, as with Solution #1, but the advantage would be that you would be able to have substantially greater timeout periods. Possibly in the order of weeks or months. Really the timeout mechanism here would be acting more as a type of garbage collector, than as a question expedite governor as in Solution #1.

I’ve always considered that this solution is better suited to applications that provide a “shared” or cloud service of some kind, simply because setting up a large pool of dedicated source addresses for each of your application’s customers is surely going to get painful.

This solution does have the disadvantage that end-users on their mobile phone will, potentially, be communicating with lots of different source addresses even though it is really the same company/application at the other end. It can mess up the user’s normal “texting” experience: it robs them of their iPhone’s “chat bubble” style of presentation and the ability to create a Contacts entry for a regular correspondent. Obviously there are things you can do to try to minimise this risk, such as always trying to select the free source address with the lowest index. But really, I think that would just make things worse. At some point you WILL want to send multiple questions to a mobile phone number, and there’s no getting around that fact. If you’ve got a large pool of source addresses then you’re going to want to use them.

Solution #3: My application only has one source address, but I need to send concurrent questions to the same mobile phone

You can’t. Well you can, but I don’t recommend it at all. I tried it once, on an early version of our system, and our customers didn’t like it.

Essentially you combine the concepts detailed in Solution #1 and then rely on some text processing logic in your response handling code. So rather than phrasing your question as “Are you attending the meeting tomorrow? Reply with A=Yes or B=No.”, you’d phrase it as “… Reply with A1=Yes or B1=No”. Notice the “1” digit in there? That’s the key bit. That digit is a transaction code that will be used for correlation. My implementation of this went from zero to nine, so you could have a total of 10 concurrent questions open with the same mobile number.

I don’t like this solution for the following reasons:

  • Many end-users forget to include the essential digit in their reply. They might reply “A” instead of “A1”. I’ve seen this happen in the wild.
  • Accessing digits on mobile phones when typing an SMS message is often an unintuitive process. Even an iPhone needs you to access a sub-keyboard screen; BlackBerrys need you to hit the ALT key.
  • It prohibits your application from accepting literal text responses. Many users would simply reply “Yes” rather than “A” or “A1”. If they do this, your application would be screwed because it wouldn’t have the essential digit to correlate the reply with the original question. I’ve seen this happen in the wild.
  • It prohibits your application from accepting “freeform” text responses. You might want to send a question like “What is your full name?”. There’s no way you can tag on the end of that a list of options. It simply doesn’t make sense.
  • It reveals implementation details onto the user interface of your application. Not good.
  • It compromises messaging integrity. An end-user might inadvertently reply (or possibly even deliberately!) with an incorrect digit.
  • It requires both the “reply analysis/text processing” and “reply correlation” concerns of your application to be interdependent on each other, when really they should not be – at least not to perform something so simple.

Experimental solutions

On the last bullet point of Solution #3 I suggested that your application’s “reply analysis” and “reply correlation” concerns shouldn’t be linked together. This I believe is true for something as simple as what was described in that solution. However, there is plenty of mileage to be explored in adopting this approach for more advanced designs.

When you send a question with a constrained set of response options such as “Yes, No, Maybe”, you might want to record these as part of your address pairing in the database or data structure (as described in Solution #1). Then if you need to send a further question (to the same mobile phone, whilst the first question is still outstanding) you can check if the set of response options are different. This question might be looking for a “Good, Bad, Ugly” response. In which case there is no conflict, is there? So a lock on the address pairing, based upon those expected response options, can be allowed to be acquired. Obviously this wouldn’t be possible (or at least would have ramifications on your overall design) if you were expecting a “freeform” response.

Another possible avenue to be explored is an area of computer science called “natural language processing”. The idea is that when you ask a question like “What is your name?” then you would prime your NLP engine to be expecting a reply that looks like somebody’s name. Anything that arrives from that mobile phone that doesn’t look like a person’s name can be assumed to not be related to the outstanding question. Obviously if you want to ask a concurrent question like “What is your wife’s name?” then you’re back to square one, because that would be a conflict and you’d need to serialise the questions as described in Solution #1. This (NLP and SMS applications) is an active area of research for me, so I may blog about it in more detail at a later time.

Conclusion

Solution #1 is the best, for now. It strikes the right compromise without sacrificing either messaging integrity or user friendliness. If you desperately need to send multiple concurrent questions to a mobile phone then I would suggest you rethink your approach. Perhaps logically separating your business departments and/or workflow concerns onto different source addresses would be a solution in this case. That way you can send out an urgent question, perhaps relating to a missed bill payment, on a source address dedicated for that purpose.

Solution #2 is usable, and I can think of several use-cases for it. But I feel it is not as good for frequent one-to-one contact between a company and its customers, as it has serious disadvantages in user friendliness. It is best suited to a hosted cloud service of some kind, where everyone shares the same pool of source addresses and contact is expected to be infrequent.

Automated builds and versioning, with Mercurial

Posted in .NET Framework, Automation, Software Design, Source Control by Nathan B. Evans on May 16, 2011

Yes, this blog post is about that seemingly perennial need to set up some sort of automatically incrementing version-number system for your latest project.

As it happens the “trigger” for me this time around was not a new project but more as a result of our switch from TFS to Mercurial for our source control. It took some time for Mercurial to “bed in”, but it definitely has now. So then you reach that point where you start asking “Okay, so what else can we do with this VCS to improve the way we work?”.

Our first pass at automated versioning was to simply copy the status quo that had worked with TFS. This was rather crude at best. Basically we had an MSBuild task that would increment (based on our own strategy) the AssemblyFileVersionAttribute contained inside GlobalAssemblyInfo.cs. The usual hoo-hah really, involving a simple regular expression and so on. This was fine. However, we did not really like it because it was, effectively, storing versioning information inside a file held in the repository. Separation of concerns and all that. It also caused a small amount of workflow overhead when merging named branches – with the occasional, albeit easily resolvable, conflict. Not a major issue, but not ideal either. Of course, not all projects use named branches. But we do, as they’re totally awesome for maintaining many concurrently supported backward releases.
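
For illustration, that old approach amounted to little more than this sort of regex find-and-replace (a simplified sketch, not our actual task):

// Simplified sketch of the TFS-era approach: bump the final part of the
// AssemblyFileVersion inside GlobalAssemblyInfo.cs via a regex replace.
using System;
using System.IO;
using System.Text.RegularExpressions;

public static class CrudeVersionBumper {
    public static void IncrementRevision(string globalAssemblyInfoPath) {
        var pattern = new Regex(@"AssemblyFileVersion\(""(\d+)\.(\d+)\.(\d+)\.(\d+)""\)");

        var source = File.ReadAllText(globalAssemblyInfoPath);
        var updated = pattern.Replace(source, m => string.Format(
            "AssemblyFileVersion(\"{0}.{1}.{2}.{3}\")",
            m.Groups[1].Value, m.Groups[2].Value, m.Groups[3].Value,
            int.Parse(m.Groups[4].Value) + 1)); // naively bump the revision part

        File.WriteAllText(globalAssemblyInfoPath, updated);
    }
}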

Versioning strategies

The way I see it, there are only a small number of routes you can go down with project versioning:

  1. Some sort of yyyy.mm.dd timestamp strategy.
    This is great for “forward only” projects that don’t need to maintain supported backward releases. So for web applications, cloud apps etc – this is, I suspect, quite popular. For other projects, it simply doesn’t make sense. Because if you want to release a minor bug fix for a release from over a year ago it wouldn’t make any sense for the version to suddenly jump to the current date. How would you differentiate between major product releases?
  2. The typical major.minor.build.revision strategy.
    The Major and Minor would be “fixed” for each production branch in the repository. And then you’d increment the build and/or revision parts based on some additional strategy.
  3. A DVCS-only strategy where you use the global changeset hash.
    Unfortunately this is of limited use today on .NET projects, because neither the AssemblyVersionAttribute nor the AssemblyFileVersionAttribute will accept an arbitrary string (such as a hash) or a byte array. Of course there is nothing stopping you coming up with your own attribute (we called ours DvcsIdentificationAttribute) and including/updating that in your GlobalAssemblyInfo.cs (or equivalent) whenever you run a build. But it is of zero use to the .NET framework itself.
  4. Some sort of hybrid between #1 and #2 (and possibly even #3!).
    This is what we do. We use a major.minor.yymm.revision strategy, to be precise.

We like our #4 hybrid strategy because it brings us the following useful characteristics (a minimal sketch of how such a version is composed follows the list):

  • It has “fixed” Major and Minor parts. Great for projects with multiple concurrent versions.
  • It contains a cursory Year and Month that can be read at a glance. When chasing down a problem in a customer environment, it is simple things like this that speed up diagnosis times.
  • An incremental Revision part that ensures each build in the same month has a unique index.
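
Composing such a version is trivial; here is a minimal sketch (a hypothetical helper, not part of our actual build tasks):

// Sketch: compose a major.minor.yymm.revision version, where Major/Minor are
// fixed per branch and Revision is unique within the month.
using System;

public static class HybridVersion {
    public static Version Compose(int major, int minor, DateTime buildDate, int revision) {
        // e.g. May 2011 => 1105, readable at a glance during diagnosis
        int yymm = (buildDate.Year % 100) * 100 + buildDate.Month;
        return new Version(major, minor, yymm, revision);
    }
}

// Compose(5, 4, new DateTime(2011, 5, 16), 5) => 5.4.1105.5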

So then, how did we implement this strategy on the Microsoft stack and with Mercurial?

Implementation

The key to implementing this strategy is, first and foremost, retrieving from the repository the “most recent” tag for the current branch. Originally I had big plans here to write some .NET library to walk the Mercurial revlog file structure. It would have been a cool project to learn some of the nitty-gritty details of how Mercurial works under the hood. Unfortunately, I soon discovered that Mercurial already has a template keyword available that does what I need. It’s called “latesttag”. It’s really simple to use as well, for example:

$ hg parents --template {latesttag}
> v5.4.1105.5-build

There is also a related and potentially useful template keyword called “latesttagdistance”. As Mercurial walks the revlog tree in search of the latest tag, this counts the number of changesets it passes. You could use this value as the increment for the Revision part of the strategy.
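
Both keywords can be emitted in one hit, for example (the output shown here is against a hypothetical repository state):

$ hg parents --template "{latesttag}.{latesttagdistance}"
> v5.4.1105.5-build.3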

At this point most of the Mercurial fan base would go off and write a Bash or Python script to do the job. Unfortunately in .NET land it’s not quite that simple, as we all know. I could have written a, erm, “posh” PowerShell script to do it, for sure. But then I’d have to wire that into the MSBuild script – which I suspect would be a bit difficult and have all sorts of gotchas involved.

So I wrote a couple of MSBuild tasks to do it, with the interesting one aptly named MercurialAssemblyFileVersionUpdate:

public class MercurialAssemblyFileVersionUpdate : AssemblyFileVersionUpdate {

    private Version _latest;

    public override bool Execute() {
        // Ask Mercurial for the most recent tag reachable from the working
        // directory's parent changeset.
        var cmd = new MercurialCommand {
            Repository = Path.GetDirectoryName(BuildEngine.ProjectFileOfTaskNode),
            Arguments = "parents --template {latesttag}"
        };

        if (!cmd.Execute()) {
            Log.LogMessagesFromStream(cmd.StandardOutput, MessageImportance.High);
            Log.LogError("The Mercurial Execution task has encountered an error.");
            return false;
        }

        _latest = ParseOutput(cmd.StandardOutput.ReadToEnd());

        return base.Execute();
    }

    protected override Version GetAssemblyFileVersion() {
        return _latest;
    }

    private Version ParseOutput(string value) {
        // Mercurial prints "null" when no tag is reachable from the current
        // changeset; in that case fall back to the base class's version.
        return string.IsNullOrEmpty(value) || value.Equals("null", StringComparison.InvariantCultureIgnoreCase)
                   ? base.GetAssemblyFileVersion()
                   : new Version(ParseVersionNumber(value));
    }

    private string ParseVersionNumber(string value) {
        var ver_trim = new Regex(@"(\d+\.\d+\.\d+\.\d+)", RegexOptions.Singleline | RegexOptions.CultureInvariant);

        var m = ver_trim.Match(value);
        if (m.Success)
            return m.Groups[0].Value;

        throw new InvalidOperationException(
            string.Format("The latest tag in the repository ({0}) is not a parsable version number.", value));
    }
}

Click here to view the full source, including a couple of dependency classes that you’ll need for the full solution.

With that done, it was just a case of updating our MSBuild script to use the new bits:

<Target Name="incr-version">
  <MercurialExec Arguments='revert --no-backup "$(GlobalAssemblyInfoCsFileName)"' />
  <Message Text="Updating '$(GlobalAssemblyInfoCsFileName)' with new version number..." />
  <MercurialAssemblyFileVersionUpdate FileName="$(GlobalAssemblyInfoCsFileName)">
    <Output TaskParameter="VersionNumber" PropertyName="VersionNumber" />
    <Output TaskParameter="MajorNumber" PropertyName="MajorNumber" />
    <Output TaskParameter="MinorNumber" PropertyName="MinorNumber" />
    <Output TaskParameter="BuildNumber" PropertyName="BuildNumber" />
    <Output TaskParameter="RevisionNumber" PropertyName="RevisionNumber" />
  </MercurialAssemblyFileVersionUpdate>
  <Message Text="Done update to '$(GlobalAssemblyInfoCsFileName)'." />
  <Message Text="Tagging current changeset in local repo." />
  <MercurialExec Arguments="tag --force v$(VersionNumber)-build" />
  <Message Text="Pushing commit to master server." />
  <MercurialExec Arguments='push' />
  <Message Text="All done." />
</Target>

Of course, don’t forget to import your tasks into the script, à la:

<UsingTask AssemblyFile="Build Tools\AcmeCorp.MsBuildTasks.dll"
           TaskName="AcmeCorp.MsBuildTasks.MercurialAssemblyFileVersionUpdate" />

<UsingTask AssemblyFile="Build Tools\AcmeCorp.MsBuildTasks.dll"
           TaskName="AcmeCorp.MsBuildTasks.MercurialExec" />

You’ll notice that the parsing implementation is quite forgiving: the regular expression will extract anything that looks like a parsable System.Version string. This is cool because it means the tags themselves don’t have to be in the exact form System.Version requires. You can leave a “v” in front, or add a suffix like “-build”. Whatever your convention is, it just makes things a bit more convenient.
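
A trivial standalone demonstration of that forgiveness (not part of the build tasks themselves):

using System;
using System.Text.RegularExpressions;

class ForgivingParseDemo {
    static void Main() {
        var ver_trim = new Regex(@"(\d+\.\d+\.\d+\.\d+)");

        // The "v" prefix and "-build" suffix are simply ignored by the match.
        Console.WriteLine(ver_trim.Match("v5.4.1105.5-build").Groups[0].Value);
        // prints: 5.4.1105.5
    }
}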

Remaining notes

The implementation above will generate global tags that are revision-controlled in the repository. I did consider using “local tags”, as those wouldn’t need to be held in the repository at all and could sit only on the automated build server. Unfortunately, however, the “latesttag” template does not appear to work with local tags, only global ones. Local tags would also mean that developers wouldn’t benefit from the tags at all, which would be a shame.

Mercurial stores global tags inside a .hgtags file in the root of the working directory. The file is revision-controlled, but only for auditing purposes; it does not matter which version of it you have in your working directory at any given time.
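
Each line of the file simply pairs a full changeset hash with a tag name, something like this (hashes invented for illustration):

0f1a2b3c4d5e6f708192a3b4c5d6e7f801234567 v5.4.1105.4-build
89abcdef0123456789abcdef0123456789abcdef v5.4.1105.5-build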

You may still require a workflow step to merge between your branches after running a build, in order to propagate the .hgtags file immediately (and not leave it to someone else!). If any merge conflicts arise you should take the changes from both sides, as the .hgtags file can be considered “cumulative”.

Related reading

http://kiln.stackexchange.com/questions/2194/best-practice-generating-build-numbers
http://stackoverflow.com/questions/4726257/most-recent-tag-before-tip-in-mercurial
http://www.jaharmi.com/2010/03/24/generate_version_numbers_for_mac_os_x_package_installers_with_mercurial_and_semantic_vers
http://mercurial.selenic.com/wiki/NearestExtension


Memory leaks with an infinite-lifetime instance of MarshalByRefObject

Posted in .NET Framework, Software Design by Nathan B. Evans on April 17, 2011

Recently we discovered an issue with the way our product performs AppDomain sandboxing. We were leaking small amounts of memory, quite badly, from every sandbox and sandbox operation created over the lifetime of the process. After much investigation using CLR Profiler, it transpired that our subclasses of MarshalByRefObject which return null from their InitializeLifetimeService() override cause a permanent reference to be held open by a fairly low-level area of the .NET remoting stack, specifically System.Runtime.Remoting.ServerIdentity (which is marked internal). This was preventing any of our objects derived from MarshalByRefObject from being garbage collected, and thus the memory footprint would keep growing and growing. Fortunately we run our servers with rather huge page files, so it never got to a point where customers were affected, but obviously it was something we needed to fix fairly urgently.

After playing around a lot with all the lifetime services stuff, like using the ISponsor interface to do sponsorship with nasty hacky timeout values (which would have had side-effects for our product, but which we were about to resign ourselves to!), we came across a much better alternative in the form of RemotingServices.Disconnect(). Hoorah. Is it just me or does the whole remoting story in .NET need a damn good overhaul? Cross-AppDomain communication deserves something better.

With this discovery, I came up with a useful class that improves on MarshalByRefObject by adding deterministic disposal of both itself and any nested marshalled objects (the nesting is more an implementation detail for us, but I’m sure it could be useful to anyone).

/// <summary>
/// Enables access to objects across application domain boundaries.
/// This type differs from <see cref="MarshalByRefObject"/> by ensuring that the
/// service lifetime is managed deterministically by the consumer.
/// </summary>
public abstract class CrossAppDomainObject : MarshalByRefObject, IDisposable {

    private bool _disposed; 

    /// <summary>
    /// Gets an enumeration of nested <see cref="MarshalByRefObject"/> objects.
    /// </summary>
    protected virtual IEnumerable<MarshalByRefObject> NestedMarshalByRefObjects {
        get { yield break; }
    }

    ~CrossAppDomainObject() {
        Dispose(false);
    }

    /// <summary>
    /// Disconnects the remoting channel(s) of this object and all nested objects.
    /// </summary>
    private void Disconnect() {
        RemotingServices.Disconnect(this);

        foreach (var tmp in NestedMarshalByRefObjects)
            RemotingServices.Disconnect(tmp);
    }

    public sealed override object InitializeLifetimeService() {
        //
        // Returning null designates an infinite non-expiring lease.
        // We must therefore ensure that RemotingServices.Disconnect() is called when
        // it's no longer needed otherwise there will be a memory leak.
        //
        return null;
    }

    public void Dispose() {
        GC.SuppressFinalize(this);
        Dispose(true);
    }

    protected virtual void Dispose(bool disposing) {
        if (_disposed)
            return;

        Disconnect();
        _disposed = true;
    }

}

It was then just a case of modifying a few of our classes to derive from this instead of MarshalByRefObject, and then updating a couple of other places in our codebase to ensure that Dispose() gets called during clean-up.
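
Usage ends up looking something like this (SandboxHost is an invented example class, not something from our codebase):

// Hypothetical example of consuming CrossAppDomainObject from another AppDomain.
using System;

public class SandboxHost : CrossAppDomainObject {
    public string DoWork() {
        return "done in " + AppDomain.CurrentDomain.FriendlyName;
    }
}

public static class Example {
    public static void Main() {
        var sandboxDomain = AppDomain.CreateDomain("Sandbox");

        // Dispose() disconnects the remoting channel deterministically instead of
        // leaving a ServerIdentity rooted for the lifetime of the process.
        using (var host = (SandboxHost)sandboxDomain.CreateInstanceAndUnwrap(
                typeof(SandboxHost).Assembly.FullName, typeof(SandboxHost).FullName)) {
            Console.WriteLine(host.DoWork());
        }

        AppDomain.Unload(sandboxDomain);
    }
}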
