Choosing the Right Transport Protocol

By: Mark Mahowald

Copyright 2009, 29West, Inc.

May 1, 2009

Abstract

The messaging market has benefitted from a lot of innovation over the last five years. With new technologies like 10-gigabit Ethernet and InfiniBand, kernel bypass, and fast, inexpensive multi-core servers being deployed, there are many permutations to consider. Then you have to consider the requirements of your messaging application: LAN or WAN delivery, target message rates, desired messaging behavior (receiver-paced or source-paced delivery), the fan-out (number of receivers that want the same data), and the delivery model (queuing, persistence, or streaming), among many others. The choice of transport protocol can have a large effect on performance. This paper explores some of the tradeoffs.

Messaging Architectures: Things to Consider

To a Man with a Hammer, Everything is a Nail

Some messaging products support only one transport protocol, for example, only TCP or only UDP multicast. In these designs, you are forced to live with the limitations of that one choice and design around them. In an ideal world, developers would have access to a messaging product that supports a wide variety of transport protocols, each well suited to a different type of application. For example, shared memory transports for the lowest-possible latency communication within one machine, UDP multicast for low-latency high fan-out, and TCP or UDP unicast for applications where unicast addressing is required.

It would also be useful to be able to leverage all these transport protocols through the same messaging API, to be able to fully monitor all underlying behavior, and to use any network technology and topology desired.

Latency Matters Everywhere

If you want to see where the latency is in any messaging solution, you need only follow the data path of the messages end-to-end. The typical path, and therefore the total latency, is made up of:

1)    The latency leaving the sender and getting to the wire. This typically includes the kernel and network stack, but kernel bypass and DMA technology can reduce the latency in this step.

2)    The latency through any messaging intermediaries, daemons, servers, hardware messaging appliances, etc., many of which will copy data en route.

3)    The latency up the network stack on the receiver side. Just like on the sender, this can be reduced by kernel and stack bypass technology.

To minimize latency, you want to remove as many of these steps as possible.

The Right Tool for the Job

In this section, we take a closer look at the characteristics of each transport protocol and where each has its own unique strengths.

Shared Memory or IPC Transports

The shared memory transport will be hard to beat on overall latency since the network latency and all intermediate steps have been removed. Developers using 29West’s IPC transport can achieve latencies as low as 3 µsecs between applications on the same machine, and can publish the same data, using the same API, via TCP or UDP multicast to reach LAN or WAN based receivers.

Good Old TCP

TCP has many advantages as a messaging transport protocol, but a few drawbacks if latency, scalability, or fairness are concerns. Some of the key advantages and disadvantages for its use in messaging are:

     Goes everywhere: The ability to be seamlessly routed through a wide range of LAN/WAN and firewall designs.

     Receiver paced: TCP slows down as the receiver backs up, so you do not overrun the receiver (you slow down the source, add latency, and may encounter other issues, but you do not overrun the receiver). It is this feature that makes it a natural for raw throughput testing. Just turn the sender to full output and see what you get. It is self-throttling. 29West has quoted throughput numbers based on TCP delivery for this reason. The tests are easy to run, and we see over 2.4 million messages per second using TCP and low-end commodity hardware.

     “Polite” operation: In the face of network congestion, TCP backs off, adding latency of course, to allow other traffic to flow. This is great when “fairness” of network use is the goal. However, in many trading environments, it may not be desirable to see quotes delayed to allow some non-critical application to do a file transfer.

     Fan-out 1: TCP is a point-to-point protocol that provides its own recovery logic for lost traffic and in-order message delivery. It is limited to one-to-one communication, so one-to-many data streams have to be replicated either in software at the source and sent many times, or by some intermediary application. In either case, replication outside the network means added load on the network as messages are sent multiple times, and increases latency at the replication points.

     Fairness: Unlike UDP multicast, in the TCP multiple delivery case, multiple sends are required and they must be done in some order. The last receiver to have its copy sent is at an inherent disadvantage relative to the other receivers. Some receiver gets the message first and some receiver gets the message last. TCP generally isn’t the best choice when business requirements dictate fairness among receivers.
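
The fan-out and fairness points above can be made concrete with a minimal sketch of software replication at the source. The names fan_out, recv_exact and demo are hypothetical, and local socket pairs stand in for established TCP connections so the example runs on one machine; the sample message is invented for illustration.

```python
import socket


def recv_exact(sock: socket.socket, size: int) -> bytes:
    """Read exactly `size` bytes from a stream socket."""
    chunks = []
    remaining = size
    while remaining > 0:
        chunk = sock.recv(remaining)
        if not chunk:
            raise ConnectionError("peer closed before the full message arrived")
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


def fan_out(message: bytes, connections: list) -> int:
    """Send one copy per connection, in list order; return total bytes sent."""
    total = 0
    for conn in connections:  # the last connection in the list is served last
        conn.sendall(message)
        total += len(message)
    return total


def demo(receiver_count: int = 3, message: bytes = b"QUOTE XYZ 101.25"):
    # Assumption: socketpair() stands in for already-connected TCP sockets.
    pairs = [socket.socketpair() for _ in range(receiver_count)]
    try:
        total = fan_out(message, [sender for sender, _ in pairs])
        received = [recv_exact(receiver, len(message)) for _, receiver in pairs]
    finally:
        for sender, receiver in pairs:
            sender.close()
            receiver.close()
    return total, received
```

The source writes the same bytes once per receiver, so the bytes sent grow linearly with the fan-out, and the order of the connection list decides which receiver is served first and which is served last.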

Reliable Unicast UDP Transport

If every OS already has TCP built in, why would you consider using a unicast UDP based transport protocol? Reasons include avoiding “fairness” policies and providing more control.

There are many use cases where you would like to see certain traffic jump in front of TCP traffic on a congested network link. Point-to-point UDP protocols will do that. 29West has seen this to be effective in gateway-style applications where you want traffic to traverse a WAN link with higher priority over competing traffic.

Another advantage of unicast UDP is control. If you want more control over the delivery order, want to limit send rates, etc., then a well-designed UDP unicast transport protocol may be well suited for you.
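
As one illustration of limiting send rates, the sketch below places a token-bucket limiter in front of a UDP socket. It is a generic technique shown under stated assumptions, not a description of any 29West transport; the names TokenBucket and send_paced are hypothetical.

```python
import socket
import time


class TokenBucket:
    """Allow `rate_per_sec` sends per second, with bursts of up to `burst`."""

    def __init__(self, rate_per_sec: float, burst: float, clock=time.monotonic):
        self.rate = rate_per_sec
        self.capacity = burst
        self.tokens = burst          # start with a full bucket
        self.clock = clock
        self.last = clock()

    def try_consume(self, amount: float = 1.0) -> bool:
        now = self.clock()
        # Refill in proportion to elapsed time, never above capacity.
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= amount:
            self.tokens -= amount
            return True
        return False


def send_paced(sock: socket.socket, addr, payload: bytes,
               bucket: TokenBucket) -> bool:
    """Send one datagram if the bucket allows it; report whether it was sent."""
    if not bucket.try_consume():
        return False                 # caller decides: queue, drop, or retry
    sock.sendto(payload, addr)
    return True
```

With TCP, pacing decisions of this kind are made by the protocol stack. With a UDP-based transport, the sending side decides what happens to a message that exceeds the configured rate.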

Reliable Multicast UDP Transport

Reliable multicast has been the gold standard in high-performance messaging for many years where there is a need to deliver copies of one message to many recipients. It has been used in financial markets since 1985, and though its power requires some care to deploy properly, the win in latency and throughput can be incredible. What are some of the key strengths and weaknesses of multicast?

     Fairness: True one-to-many delivery with fairness across all receivers, minimal network and application loads, and the lowest latency. In a well-designed reliable multicast system, all receivers get data at the same time since switches will replicate it automatically at wire-speed.

     Network Traffic: The network load is minimized as the same message is not being sent many (sometimes hundreds) of times by the source or an intermediary, and therefore network input traffic is greatly reduced. This can reduce the need to upgrade network infrastructure for bandwidth reasons, and helps save on deployment costs.

     Network Copying: Network replication inherent with multicast means there are no intermediate latency points (like daemons or servers) and this greatly increases scaling and reduces latency.
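
The network traffic point can be checked with simple arithmetic. The sketch below compares the payload bandwidth leaving the source for unicast replication and for multicast. The function name is hypothetical, the figures in the usage note are illustrative inputs rather than measurements, and protocol headers and retransmissions are ignored.

```python
def source_egress_bits_per_sec(msgs_per_sec: int, msg_bytes: int,
                               receivers: int, multicast: bool) -> int:
    """Payload bits per second leaving the source (headers ignored)."""
    # With multicast the source sends each message once and the switches
    # replicate it; with unicast the source sends one copy per receiver.
    copies = 1 if multicast else receivers
    return msgs_per_sec * msg_bytes * 8 * copies
```

For example, 100,000 messages per second of 100 bytes each to 200 receivers works out to 80 Mbit/s of payload with multicast, against 16 Gbit/s if the source sends every copy itself.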

In cases where all receivers are fast and they all want every message sent, it’s hard to argue against a UDP multicast transport; there is simply no comparison. But what about cases when not every receiver wants every message, or where there are receivers who cannot keep up? Well, that brings us to the tradeoffs.

     Source Pacing: UDP multicast is inherently source paced. There is no protocol level back-pressure (“flow control”) to slow down sources when receivers are slow.

     Reliability: UDP multicast does not have built-in reliability, so some recovery protocol has to be provided if lossless delivery is required.

These are the main reasons that a good reliable multicast transport protocol is necessary to properly leverage UDP multicast.
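
To show what such a recovery protocol has to do at a minimum, the sketch below detects loss on the receiver by tracking sequence numbers. It assumes the sender stamps each datagram with a consecutive integer; the class name GapDetector is hypothetical, and this is a generic illustration rather than any specific product's protocol.

```python
class GapDetector:
    """Track sequence numbers and report which ones need a NAK."""

    def __init__(self, first_expected: int = 0):
        self.next_expected = first_expected
        self.missing = set()          # sequence numbers still outstanding

    def on_packet(self, seq: int) -> list:
        """Record an arrival; return sequence numbers newly found missing."""
        if seq in self.missing:
            self.missing.discard(seq)  # a retransmission filled the gap
            return []
        if seq < self.next_expected:
            return []                  # duplicate of data already received
        new_gaps = list(range(self.next_expected, seq))
        self.missing.update(new_gaps)
        self.next_expected = seq + 1
        return new_gaps
```

The returned list is what a receiver would request in a negative acknowledgement (NAK). When and how often to send that request is a separate policy decision.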

There is no single solution that’s right for every case where you have a very fast sender and a very slow receiver. Versatile messaging products can give you many tools, but in all cases, you have a design decision to make: Do I want to slow all receivers down to the pace of the slowest, or do I want the slow receiver to lose data, reset and rejoin the stream? With a powerful tool like multicast, you want to be sure you have provided a means to handle loss recovery and slow receivers.
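
The design decision described above can be sketched as a bounded buffer with two policies. The class ReceiverBuffer and the policy names "pace_source" and "drop_oldest" are hypothetical labels chosen for this illustration.

```python
from collections import deque


class ReceiverBuffer:
    """Bounded buffer in front of a slow receiver."""

    def __init__(self, capacity: int, policy: str):
        if policy not in ("pace_source", "drop_oldest"):
            raise ValueError("unknown policy: " + policy)
        self.capacity = capacity
        self.policy = policy
        self.queue = deque()
        self.dropped = 0              # messages this receiver has lost

    def offer(self, message) -> bool:
        """Try to enqueue; False tells the source it must wait and retry."""
        if len(self.queue) < self.capacity:
            self.queue.append(message)
            return True
        if self.policy == "drop_oldest":
            self.queue.popleft()      # the slow receiver loses old data
            self.dropped += 1
            self.queue.append(message)
            return True
        return False                  # "pace_source": slow everyone down
```

With "pace_source", a full buffer pushes back on the source, so every receiver sees data at the pace of the slowest. With "drop_oldest", the source keeps its pace and only the slow receiver pays, through lost messages it must then recover from or skip.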

It is important to note that these same issues exist with TCP delivery; however, the developer’s ability to control behavior and design tradeoffs is severely reduced. TCP will force a slowdown (added latency) in data delivery, whether or not you would rather have the transport throw away old data.

Another design issue for multicast transport protocols is loss recovery. Poorly designed reliable multicast protocols can have problems with retransmission storms or negative acknowledgement (NAK) storms. These problems are mainly associated with legacy daemon-based designs that were prone to degradation under heavy packet loss and the associated NAK traffic. Modern application-to-application designs, like those 29West pioneered in 2003, provide detailed monitoring information, rate limits on both new and retransmitted data, and flexible NAK timer intervals and recovery policies. These prevent both spike-induced overruns and “crybaby receivers” from degrading the network.
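
One common way to keep NAK traffic from turning into a retransmission storm is for the source to ignore repeated requests for the same data within a short hold-off window. The sketch below shows that general idea only; it is not a description of 29West’s implementation, and the name RetransmitScheduler and its parameters are hypothetical.

```python
import time


class RetransmitScheduler:
    """Source-side filter: retransmit a sequence at most once per hold-off."""

    def __init__(self, holdoff_sec: float, clock=time.monotonic):
        self.holdoff = holdoff_sec
        self.clock = clock
        self.last_sent = {}           # sequence number -> last retransmit time

    def on_nak(self, seq: int) -> bool:
        """Return True if this NAK should trigger a retransmission."""
        now = self.clock()
        last = self.last_sent.get(seq)
        if last is not None and now - last < self.holdoff:
            return False              # already resent recently; suppress
        self.last_sent[seq] = now
        return True
```

If a hundred receivers miss the same multicast packet and all send a NAK within the window, the source retransmits once rather than a hundred times. A single receiver that keeps requesting the same data is limited in the same way.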

There is a good deal of detail on this that is outside the scope of this paper. Even though we support all the transports described here, in more than 120 production use cases worldwide to date the vast majority of our customers deploy our reliable multicast in key portions of their networks for the stability and efficiency it provides.

What is the Right Choice?

A fanatical devotion to any single transport protocol or delivery model can never meet the diverse needs of a large enterprise. We feel the winning answer is ensuring a low total cost of ownership by leveraging your existing infrastructure and providing the best performance, stability and control.

We believe that the right choice is best left to the application designer. By providing all transport protocols seamlessly within 29West’s powerful and flexible API, we let you pick the transport that best meets your needs and network design.
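
As a rough summary of the tradeoffs discussed in this paper, the sketch below encodes them as a small decision helper. It is a simplified set of rules of thumb under stated assumptions, not a 29West API; the names Requirements and choose_transport and the returned labels are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Requirements:
    same_host: bool               # all receivers run on the sender's machine
    fan_out: int                  # receivers that want the same data
    multicast_available: bool     # the network can carry UDP multicast
    yield_to_other_traffic: bool  # "polite" back-off under congestion is wanted


def choose_transport(req: Requirements) -> str:
    """Return a transport label using this paper's rules of thumb."""
    if req.same_host:
        return "ipc"                     # no network hop at all
    if req.fan_out > 1 and req.multicast_available:
        return "reliable-multicast-udp"  # the network replicates the data
    if req.yield_to_other_traffic:
        return "tcp"                     # receiver paced, backs off when congested
    return "reliable-unicast-udp"        # more control over rate and priority
```

A real design would weigh more factors than these four, such as persistence needs and the behavior wanted for slow receivers, which is why the final choice belongs with the application designer.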

To us, enterprise-strength, enterprise-wide messaging means:

     Mature products that minimize latency by removing all intermediaries

     Powerful multi-transport, multi-platform APIs (C, .NET, Java)

     Multi-paradigm delivery models: streaming (IPC and LAN/WAN), persistence, queuing, TCP fan-out and caching

     The ability to leverage any network technology and topology

     Detailed network and application level monitoring so you can be sure you are getting the performance out of your system that you designed into it

29West: Enterprise Strength, Enterprise Wide Messaging. We look forward to the opportunity to answer any questions you may have.

If you would like to learn more about 29West Messaging, please visit http://www.29West.com/, or contact us via e-mail at info@29West.com. We have offices in Chicago, New York, London and Tokyo and would welcome a chance to discuss your needs and see how 29West can help.