Mike Hadlow is a Brighton, UK based developer, blogger and author of a number of open source frameworks and applications. Mike is a DZone MVB and is not an employee of DZone and has posted 88 posts at DZone.

RabbitMQ, Subscription, and Bouncing Servers in EasyNetQ

01.01.2013
If you are a regular reader of my blog, you’ll know that I’m currently working on EasyNetQ, a .NET-friendly API for RabbitMQ. EasyNetQ is opinionated software. It takes away much of the complexity of AMQP and replaces it with a simple interface that relies on the .NET type system for routing messages.

One of the things that I want to remove from the ‘application space’ and push down into the API is all the plumbing for reporting and handling error conditions. One side of this is to provide infrastructure to record and handle exceptions thrown by applications that use EasyNetQ. I’ll be covering this in a future post. The other consideration, and the one I want to address in this post, is how EasyNetQ should gracefully handle network connection or server failure.

The Fallacies of Distributed Computing tell us that, no matter how reliable RabbitMQ and the Erlang platform might be, there will still be times when a RabbitMQ server will go away for whatever reason.

One of the challenges of programming against a messaging system, as compared with a relational database, is the length of time that the application holds connections open. A typical database connection is opened, some operation is run over it (select, insert, update, etc.) and then it’s closed. Messaging system subscriptions, however, require that the client, or subscriber, holds an open connection for the lifetime of the application.

If you simply program against the low level C# AMQP API provided by RabbitMQ to create a simple subscription, you’ll notice that after a RabbitMQ server bounce, the subscription no longer works. This is because the channel you opened to subscribe to the queue, and the consumption loop attached to it, are no longer valid. You need to detect the closed channel and then attempt to rebuild the subscription once the server is available again.

The excellent RabbitMQ in Action by Videla and Williams describes how to do this in chapter 6, ‘Writing code that survives failure’. Here’s their Python code example:

[Image of code listing: rabbit_mq_in_action_failure_detecting_subscriber]

EasyNetQ needs to do something similar, but as a generic solution so that all subscribers automatically get re-subscribed after a server bounce.

Here’s how it works.

Firstly, all subscriptions are created in a closure:

public void Subscribe<T>(string subscriptionId, Action<T> onMessage)
{
    if (onMessage == null)
    {
        throw new ArgumentNullException("onMessage");
    }

    var typeName = serializeType(typeof(T));
    var subscriptionQueue = string.Format("{0}_{1}", subscriptionId, typeName);

    Action subscribeAction = () =>
    {
        var channel = connection.CreateModel();
        DeclarePublishExchange(channel, typeName);

        var queue = channel.QueueDeclare(
            subscriptionQueue,  // queue
            true,               // durable
            false,              // exclusive
            false,              // autoDelete
            null);              // arguments

        channel.QueueBind(queue, typeName, typeName);  

        var consumer = consumerFactory.CreateConsumer(channel, 
            (consumerTag, deliveryTag, redelivered, exchange, routingKey, properties, body) =>
            {
                var message = serializer.BytesToMessage<T>(body);
                onMessage(message);
            });

        channel.BasicConsume(
            subscriptionQueue,      // queue
            true,                   // noAck 
            consumer.ConsumerTag,   // consumerTag
            consumer);              // consumer
    };

    connection.AddSubscriptionAction(subscribeAction);
}
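To make the queue naming in Subscribe<T> concrete, here is a minimal sketch of how the name is derived. The post does not show serializeType, so SerializeType below is a hypothetical stand-in that just uses the short .NET type name, and OrderPlaced is an invented message type. What it illustrates is that the same subscription id and message type always give the same queue name, so a subscribe closure that runs a second time declares and consumes from the same durable queue.

```csharp
using System;

// Invented message type, used only for this illustration.
public class OrderPlaced { }

public static class QueueNaming
{
    // Hypothetical stand-in for the serializeType delegate, which the post
    // does not show. Assumption: the short type name is good enough here.
    public static string SerializeType(Type type)
    {
        return type.Name;
    }

    // Mirrors the string.Format call at the top of Subscribe<T>.
    public static string SubscriptionQueue<T>(string subscriptionId)
    {
        return string.Format("{0}_{1}", subscriptionId, SerializeType(typeof(T)));
    }
}
```

Calling QueueNaming.SubscriptionQueue<OrderPlaced>("billing") yields the same string on every call, which is what lets a replayed closure find its original queue.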

The connection.AddSubscriptionAction(subscribeAction) line passes the closure to a PersistentConnection class that wraps an AMQP connection and provides all the disconnect detection and re-subscription code. Here’s AddSubscriptionAction:

public void AddSubscriptionAction(Action subscriptionAction)
{
    if (IsConnected) subscriptionAction();
    subscribeActions.Add(subscriptionAction);
}

If there’s an open connection, it runs the subscription straight away. It also stores the subscription closure in a List<Action>.
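The "run now if connected, always remember for later" rule can be tried out without a broker. The sketch below is a simplified model, not the EasyNetQ source: SubscriptionRegistry and ReplayAll are hypothetical names, IsConnected is a plain settable flag, and the lock is my own assumption, added because the replay may happen on a different thread from the one that registers subscriptions.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical, broker-free model of the bookkeeping described above.
public class SubscriptionRegistry
{
    private readonly object sync = new object();
    private readonly List<Action> subscribeActions = new List<Action>();

    // In this sketch the flag is set by hand; the real class would derive it
    // from the state of the underlying AMQP connection.
    public bool IsConnected { get; set; }

    public void AddSubscriptionAction(Action subscriptionAction)
    {
        if (subscriptionAction == null)
        {
            throw new ArgumentNullException("subscriptionAction");
        }

        // Run immediately when a connection is available...
        if (IsConnected) subscriptionAction();

        // ...and always store the closure so it can be run again later.
        lock (sync) { subscribeActions.Add(subscriptionAction); }
    }

    // Runs every stored closure once, in registration order.
    public void ReplayAll()
    {
        Action[] snapshot;
        lock (sync) { snapshot = subscribeActions.ToArray(); }
        foreach (var subscribeAction in snapshot) subscribeAction();
    }
}
```

A closure registered while disconnected does nothing until ReplayAll is called; one registered while connected runs once straight away and again on every later replay.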

When the connection gets closed for whatever reason, the AMQP ConnectionShutdown event fires, which runs the OnConnectionShutdown method:

void OnConnectionShutdown(IConnection _, ShutdownEventArgs reason)
{
    if (disposed) return;
    if (Disconnected != null) Disconnected();

    Thread.Sleep(100);
    TryToConnect();
}

We wait for a little while, and then try to reconnect:

void TryToConnect()
{
    ThreadPool.QueueUserWorkItem(state =>
    {
        while (connection == null || !connection.IsOpen)
        {
            try
            {
                connection = connectionFactory.CreateConnection();
                connection.ConnectionShutdown += OnConnectionShutdown;

                if (Connected != null) Connected();
            }
            catch (RabbitMQ.Client.Exceptions.BrokerUnreachableException)
            {
                Thread.Sleep(100);
            }
        }
        foreach (var subscribeAction in subscribeActions)
        {
            subscribeAction();
        }
    });
}

This queues a work item on the thread pool that simply loops, trying to connect back to the server. Once the connection is established, it runs all the stored subscribe closures (subscribeActions).
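The retry-then-replay shape of that loop can be exercised on its own, without RabbitMQ. The following is a sketch under stated assumptions rather than the EasyNetQ code: ReconnectLoop and BrokerDownException are hypothetical names, the open delegate stands in for connectionFactory.CreateConnection(), and Run is synchronous so that it is easy to test.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

// Hypothetical stand-in for BrokerUnreachableException.
public class BrokerDownException : Exception { }

// Hypothetical, broker-free model of the reconnect loop.
public class ReconnectLoop
{
    private readonly Action open;                          // throws while the "server" is down
    private readonly IEnumerable<Action> subscribeActions; // stored subscribe closures
    private readonly TimeSpan retryDelay;

    public int Attempts { get; private set; }

    public ReconnectLoop(Action open, IEnumerable<Action> subscribeActions, TimeSpan retryDelay)
    {
        this.open = open;
        this.subscribeActions = subscribeActions;
        this.retryDelay = retryDelay;
    }

    // Blocks until open() succeeds, then runs each stored closure once.
    public void Run()
    {
        var connected = false;
        while (!connected)
        {
            try
            {
                Attempts++;
                open();
                connected = true;
            }
            catch (BrokerDownException)
            {
                // Server still unavailable: wait a little and try again.
                Thread.Sleep(retryDelay);
            }
        }

        foreach (var subscribeAction in subscribeActions)
        {
            subscribeAction();
        }
    }
}
```

To get the non-blocking behaviour described above, a caller could hand Run to the thread pool, for example with ThreadPool.QueueUserWorkItem(_ => loop.Run()). Note that only the hypothetical BrokerDownException is retried; any other exception thrown by open propagates to the caller.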

In my tests, this solution has worked very nicely. My clients automatically re-subscribe to the same queues and continue to receive messages. One of the main motivations for writing this post, however, was to try to elicit feedback, so if you’ve used RabbitMQ with .NET, I’d love to hear about your experiences, and especially any comments about my code or how you solved this problem.

The EasyNetQ code is up on GitHub. It’s still very early days and is in no way production ready. You have been warned.



Published at DZone with permission of Mike Hadlow, author and DZone MVB. (source)
