nServiceBus NServiceBus training - opens a new tab
 
  Overview Documentation Downloads Community Roadmap License
 

Load Balancing with the Distributor

The distributor

Similar in behavior to standard load balancers
the NServiceBus Distributor is the key to scaling out
message processing over many machines transparently.

As a standard NServiceBus process, the distributor maintains
all the fault-tolerant and performance characteristics of
NServiceBus but also is designed never to overwhelm any
of the worker nodes configured to receive work from it.

Why use it

When starting to use NServiceBus, you'll see that you can easily run multiple instances of the same process with the same input queue. This may look like scaling-out at first, but is really no different than running multiple threads within the same process. If you try to do this with multiple machines, you'll see that you can't share a single input queue across multiple machines.

The distributor gets around this limitation.

What about MSMQ v4

In version 4 of MSMQ, made available with Vista and Server 2008, there is now the ability to perform 'remote transactional receive' which wasn't possible in previous versions.

What this means is that processes on other machines can transactionally pull work from a queue on a different machine. If the machine processing the message were to crash, the message would roll back to the queue and other machines could then process it.

Even though the distributor provided similar functionality even before Vista was released, there are other reasons to use it even on the newer OS.

The problem with 'remote transactional receive' is that it gets proportionally slower as more worker nodes are added. This is due to the overhead of managing more transactions, as well as the longer period of time that these transactions are open.

In short, the scale-out benefits of MSMQ v4 by itself are quite limited.

How does it work

Worker nodes send messages to the distributor, telling it when they're ready for work. These messages arrive at the distributor via a separate 'control' queue. The distributor stores this information.

When applicative messages arrive at the distributor, it uses the previously stored information to find a free worker node, and sends the message on to it.

If no worker nodes are free, the distributor waits a bit and then repeats the previous step.

This way, all pending work stays in the distributor's queue (rather than building up in each of the workers' queues) giving visibility into how long messages are actually waiting. This is important for complying with time-based service level agreements (SLAs).

For more information on monitoring, see Monitoring NServiceBus Endpoints.

Where is it

You can find the distributor under the /processes/distributor directory in the download. In there you'll see NServiceBus.Host.exe - the generic NServiceBus host process used to run the distributor.

where to find the distributor

If you run the exe by double-clicking on it, you'll see a great deal of debug information. This is standard NServiceBus host behavior. The common use of the distributor in production will be as a Windows Service and not as a console application. In order to see how to install the distributor as a Windows Service, see here.

Configuration

The distributor can be configured via the NServiceBus.Distributor.dll.config file:

<?xml version="1.0" encoding="utf-8" ?>
<configuration>
  <appSettings>
    <add key="NumberOfWorkerThreads" value="1"/>
    <add key="ErrorQueue" value="error"/>
    
    <add key="Serialization" value="xml"/>
    <!-- can be either "xml", or "binary" -->
    
    <add key="NameSpace" value="http://www.UdiDahan.com"/> 
    <!-- relevant for a Serialization value of "xml" only -->

    <add key="StorageQueue" value="distributorStorage"/>

    <add key="DataInputQueue" value="distributorDataBus"/>
    <add key="ControlInputQueue" value="distributorControlBus"/>
  </appSettings>
</configuration>

Like any NServiceBus process, you can control the number of threads it runs with the 'NumberOfWorkerThreads' property, and the error queue to which it sends messages it wasn't able to successfully process.

Similar to the way serialization is controlled in code with NServiceBus, the distributor exposes these properties as configuration - the NameSpace value is only applicable when the value of the Serialization property is set to "xml". Administrators should set these values to be the same as those used in the rest of the application.

The StorageQueue property indicates which queue will be used by the distributor to store the state it needs for its operation. This queue should be local - on the same machine as the distributor process.

The last two properties are the most relevant to the use of a distributor by other processes.

Routing with the Distributor

The distributor makes use of two queues for its runtime operation. The DataInputQueue is the queue that client processes will be sending their applicative messages to. The ControlInputQueue is the queue that worker nodes will be sending their control messages to.

Configuring a worker node to work with a distributor is done by setting the values of the following attributes of the UnicastBusConfig section:

  <UnicastBusConfig
    DistributorControlAddress="distributorControlBus@Cluster1"
    DistributorDataAddress="distributorDataBus@Cluster1">

    <MessageEndpointMappings>
      <!-- regular entries -->
    </MessageEndpointMappings>
  </UnicastBusConfig>

Similar to standard NServiceBus routing, you wouldn't want high priority messages to get stuck behind lower priority messages, so just as you would have separate NServiceBus processes for different message types, you would also set up different distributor instances (with separate queues) for different message types.

In this case, you could name the queues just like the messages - for example, SubmitPurchaseOrder.StrategicCustomers.Sales. This would be the name of the distributor's data queue and that of the input queues of each of the workers. The distributor's control queue would best be named with a prefix of 'control' as follows: Control.SubmitPurchaseOrder.StrategicCustomers.Sales.

When using the distributor in a full publish/subscribe deployment, what we'll see is a distributor within each subscriber balancing the load of events being published as follows:

logical pub/sub and physical distribution 3

Keep in mind that the distributor is designed for load balancing within a single site and should not be used between sites. In the image above, all publishers and subscribers are within a single physical site.

For information on using NServiceBus across multiple physical sites, see the gateway.

High Availability

If the distributor goes down, even if its worker nodes remain running they will not receive any messages. Therefore, it is important to run the distributor on a cluster configuring its queues as clustered resources.

Since the distributor doesn't do CPU or memory intensive work, you can often put several distributor processes on the same clustered server. Be aware the network IO may end up being the bottleneck for the distributor so take into account message sizes and throughput when sizing your infrastructure.



NServiceBus is managed and run by Udi Dahan.
Committers include Andreas Öhlund and Matt Burton.

All content on this site is licensed under the Creative Commons Attribution License.