LBM Messaging Performance

We often benchmark LBM with simple test programs that send and receive messages as fast as possible. Although this pushes the underlying hardware and OS harder than any real messaging application normally would, it does establish an upper bound on performance attainable when all hardware and OS resources are available to LBM.

LBM performance depends heavily on hardware capabilities, OS tuning, and application behavior, so it is impossible to accurately predict how it will perform in an untested environment.

29West continues to run performance tests in its own labs, and more comprehensive results are available from testing on both 1GigE and 10GigE networks. If you are interested in obtaining copies of these reports, please email info@29west.com.

Message and Payload Rates vs. Payload Size

These tests were run on a pair of Dell Precision Workstations (model 390n) connected by a Gigabit LAN. Each machine had an Intel™ Core® Duo E6600 processor operating at 2.40 GHz, 2 GB of RAM, and a Broadcom Corporation NetXtreme BCM5754 Gigabit Ethernet PCI Express network adapter, and ran Red Hat Enterprise Linux WS v4 for the 64-bit EM64T instruction set. Tests were run with LBM version 3.0.

Figure 5: Message Rate and Payload Rate vs. Payload Size

The graph shows that CPU power is the limiting factor at small message sizes (left side of graph) and that network bandwidth is the limiting factor at large message sizes (right side). The cross-over point, where the bottleneck moves from CPU to network, is at a message payload size of about 100 bytes. Of special interest is the effect of using two threads. With small message sizes, the message rate almost doubles, which indicates a CPU-bound activity. At large message sizes, where the network itself is the bottleneck, the second thread provides no improvement at all. (Note that the improvement from multi-threading would not be seen on a single-CPU machine.)

In absolute numbers, the tested machines can generate and consume over 2,000,000 messages/second for message sizes under 100 bytes, and they can saturate a Gigabit LAN for message sizes over 100 bytes. To go faster, additional NICs would be required at large message sizes, while more or faster processors would be required at smaller message sizes.
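
The two regimes described above can be illustrated with a rough back-of-the-envelope model. The Python sketch below is not part of LBM. It assumes an idealized 1 Gbit/s link with no protocol or header overhead and a fixed CPU-bound ceiling of 2,000,000 messages/second (the approximate figure quoted above). The function names and default values are hypothetical, chosen only for illustration. The model shows the shape of the curve and is not expected to reproduce the measured numbers or the exact cross-over point.

```python
def network_bound_rate(payload_bytes, link_bits_per_sec=1_000_000_000):
    """Messages/second an idealized link can carry.

    Assumption: no header or protocol overhead is counted, so this is
    an optimistic upper bound rather than a measured value.
    """
    return link_bits_per_sec / (8 * payload_bytes)


def estimated_rate(payload_bytes, cpu_bound_rate=2_000_000,
                   link_bits_per_sec=1_000_000_000):
    """Estimated message rate: the lower of the CPU and network ceilings.

    Assumption: cpu_bound_rate is a fixed ceiling that does not vary
    with payload size.
    """
    return min(cpu_bound_rate,
               network_bound_rate(payload_bytes, link_bits_per_sec))


def bottleneck(payload_bytes, cpu_bound_rate=2_000_000,
               link_bits_per_sec=1_000_000_000):
    """Return which resource limits throughput in this simple model."""
    net = network_bound_rate(payload_bytes, link_bits_per_sec)
    return "cpu" if net > cpu_bound_rate else "network"


if __name__ == "__main__":
    for size in (10, 50, 100, 500, 1000, 8000):
        print(size, bottleneck(size), round(estimated_rate(size)))
```

In this model, adding a second sending thread corresponds to raising cpu_bound_rate, which only helps at payload sizes where the result is "cpu". Raising link_bits_per_sec (for example, by adding NICs) only helps where the result is "network".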

The receiving machine ran the command:

lbmmrcv -C 2 -R 2

The sending machine ran either:

lbmmsrc -T 1 -S 1 -c perf.conf

or

lbmmsrc -T 2 -S 2 -c perf.conf

The LBM configuration file, perf.conf, contained a single option:

source implicit_batching_minimum_length 8192

This option tells LBM to batch up to 8192 bytes of messages before sending them.
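
To make the effect of batching concrete, here is a toy Python simulation of threshold-based batching. It does not use or reproduce the LBM API. It assumes that a batch is flushed as soon as the accumulated payload reaches the configured threshold, and that any leftover messages are flushed at the end. The function name batch_messages and its parameters are hypothetical.

```python
def batch_messages(message_sizes, min_batch_bytes=8192):
    """Group message sizes into batches (toy model, not LBM code).

    Assumption: a batch is flushed once its accumulated size reaches
    min_batch_bytes; whatever remains at the end is flushed as a final,
    smaller batch.
    """
    batches = []
    current = []
    current_bytes = 0
    for size in message_sizes:
        current.append(size)
        current_bytes += size
        if current_bytes >= min_batch_bytes:
            batches.append(current)
            current = []
            current_bytes = 0
    if current:
        batches.append(current)  # final partial batch
    return batches


if __name__ == "__main__":
    batches = batch_messages([100] * 1000)
    print(len(batches), "sends for 1000 messages")
    print("messages in first batch:", len(batches[0]))
```

Under these assumptions, 1000 messages of 100 bytes each are sent in far fewer network operations than 1000 individual sends. Amortizing per-send cost in this way is why batching matters most at small message sizes, where the CPU is the bottleneck.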

The programs used to perform these tests are included as part of a 29West Free Software Evaluation. Both pre-compiled binaries and source code are included for all of our example applications.