Small GigE Switch Benchmark Page (update 10/22/08)

I started out posting results for Open-MX over GigE on my Limulus Cluster, using NetPIPE MPI/TCP (2.4) for most of the tests. Because Open-MX required Jumbo packets, I tested with them and noticed that using Jumbo packets actually reduced the throughput! I'm still in the process of collecting data, so I cannot draw any definite conclusions yet. I also just noticed that the latest version of Open-MX (0.9.1) can run over standard frame sizes (MTU 1500). More results coming soon.
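
For anyone who wants to repeat the frame-size comparison, here is a minimal sketch of an MTU sweep. It only prints the commands so they can be reviewed before being run as root. The helper name mtu_sweep_cmds is made up for illustration, eth2 is the interface used later on this page, and the 9000 in the usage comment is an assumed "typical" jumbo size, not a value taken from my tests. Both hosts (and the switch) have to accept the chosen MTU.

```shell
#!/bin/sh
# mtu_sweep_cmds is a hypothetical helper: it prints one
# "ip link set" command per MTU value instead of running it.
mtu_sweep_cmds() {
    iface=$1
    shift
    for mtu in "$@"; do
        echo "ip link set dev $iface mtu $mtu"
    done
}

# Example (prints three commands; 9000 is an assumed jumbo size):
#   mtu_sweep_cmds eth2 1500 3000 9000
```

Printing the commands first makes it easy to paste them one at a time between benchmark runs.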

Kernel: 2.6.23
CPU: Intel(R) Core(TM)2 Duo CPU E6550 @ 2.33GHz
Interconnect: Intel 82572EI Gigabit Ethernet PCIe 1X
Switches: SMC 8505T (5 port), SMC 8508T (8 port), SMC GS16 (16 port), and a Cross-over Cable (Note: I tested a 5 port 3Com and it would only negotiate 100BT, so it went back. I also tested an 8 port ProCurve and it worked similarly to the SMC switch; more tests are needed.)
MPI: LAM, MPICH-MX (Open MX 0.6)

Summary (8505T, 8505T, X-over):

Interesting Comparisons (8505T, 8505T, X-over)

SMC GS16 Switch LAM MTU (note: Jumbo frames still reduce throughput, except at MTU 3000!)

Mystery Is Solved

Turn off Flow Control! I turned off Flow Control using Ethtool, and the results for jumbo packets at the high end got much better, but the variability got much worse (see results below). Also, the kernel is now 2.6.26.2 (I am using Fedora 8 now). More tests are needed. Here is the Ethtool sequence I used to turn off Flow Control (check the Ethtool man page for a full description of the options).
# ethtool -a eth2
Pause parameters for eth2:
Autonegotiate:  on
RX:             on
TX:             on

# ethtool -A eth2 autoneg off rx off tx off

# ethtool -a eth2
Pause parameters for eth2:
Autonegotiate:  off
RX:             off
TX:             off
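
When several nodes need the same change, it can help to check the result in a script rather than by eye. The following is a minimal sketch that parses the "ethtool -a" text shown above. The helper name pause_all_off is made up for illustration, and it assumes the output has the same three-line layout as on my system.

```shell
#!/bin/sh
# pause_all_off is a hypothetical helper, not part of ethtool.
# It reads "ethtool -a <iface>" output on stdin and succeeds only if
# the Autonegotiate, RX and TX lines are all present and read "off".
pause_all_off() {
    awk '
        /^(Autonegotiate|RX|TX):/ { seen++; if ($2 != "off") bad = 1 }
        END { exit ((seen == 3 && !bad) ? 0 : 1) }
    '
}

# Example on a live system (needs root):
#   ethtool -A eth2 autoneg off rx off tx off
#   ethtool -a eth2 | pause_all_off && echo "flow control is off"
```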
I also set the InterruptThrottleRate to "dynamic". See ~/Documentation/networking/e1000.txt in the kernel source directory for an explanation of this. From the e1000.txt file:

For situations where low latency is vital such as cluster or grid computing, the algorithm can reduce latency even more when InterruptThrottleRate is set to mode 1. In this mode, which operates the same as mode 3, the InterruptThrottleRate will be increased stepwise to 70000 for traffic in class "Lowest latency".

The sequence below sets the InterruptThrottleRate (which Ethtool exposes as rx-usecs) to 1 (dynamic):

# ethtool -c eth2
Coalesce parameters for eth2:
Adaptive RX: off  TX: off
stats-block-usecs: 0
sample-interval: 0
pkt-rate-low: 0
pkt-rate-high: 0

rx-usecs: 3
rx-frames: 0
rx-usecs-irq: 0
rx-frames-irq: 0

tx-usecs: 0
tx-frames: 0
tx-usecs-irq: 0
tx-frames-irq: 0

rx-usecs-low: 0
rx-frame-low: 0
tx-usecs-low: 0
tx-frame-low: 0

rx-usecs-high: 0
rx-frame-high: 0
tx-usecs-high: 0
tx-frame-high: 0

# ethtool -C eth2 rx-usecs 1

# ethtool -c eth2
Coalesce parameters for eth2:
Adaptive RX: off  TX: off
stats-block-usecs: 0
sample-interval: 0
pkt-rate-low: 0
pkt-rate-high: 0

rx-usecs: 1
rx-frames: 0
rx-usecs-irq: 0
rx-frames-irq: 0

tx-usecs: 0
tx-frames: 0
tx-usecs-irq: 0
tx-frames-irq: 0

rx-usecs-low: 0
rx-frame-low: 0
tx-usecs-low: 0
tx-frame-low: 0

rx-usecs-high: 0
rx-frame-high: 0
tx-usecs-high: 0
tx-frame-high: 0
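
The same kind of check can be scripted for the coalesce setting. This is only a sketch: rx_usecs and itr_mode are made-up helper names, and the mode names in itr_mode (0 = off, 1 = dynamic, 3 = dynamic conservative) are my reading of e1000.txt, so verify them against the copy that ships with your kernel.

```shell
#!/bin/sh
# rx_usecs and itr_mode are hypothetical helpers, not ethtool features.

# Print the rx-usecs value from "ethtool -c <iface>" output on stdin.
# The exact field match skips rx-usecs-irq, rx-usecs-low and so on.
rx_usecs() {
    awk -F': *' '$1 == "rx-usecs" { print $2 }'
}

# Describe an InterruptThrottleRate setting. The mapping is an
# assumption based on e1000.txt; anything else is reported as-is.
itr_mode() {
    case "$1" in
        0) echo "off" ;;
        1) echo "dynamic" ;;
        3) echo "dynamic conservative" ;;
        *) echo "other ($1)" ;;
    esac
}

# Example on a live system (needs root):
#   ethtool -C eth2 rx-usecs 1
#   itr_mode "$(ethtool -c eth2 | rx_usecs)"
```

On my system, the first "ethtool -c" listing above (rx-usecs: 3) corresponds to the "dynamic conservative" mode and the second (rx-usecs: 1) to "dynamic".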
Comments or questions: