
Discussion

Before we draw any general conclusions, keep in mind that the performance of your application(s) will almost certainly differ from that of the programs tested above. Indeed, virtually all performance issues stem from the specific nature of the application run on the cluster. It is therefore important that you test your application before committing to a specific hardware design.

The results are very interesting. For the GNU compiler suite, the speed-up ranges from 1.1 to 1.99 with an average of 1.52. The results for the Intel compiler follow the same trends and range from 1.05 to 2.00 with an average speed-up of 1.38. Neglecting the difference between compilers, we see that the best average speed-up is barely over 1.5 times. In the case of CG, there is no benefit to running two copies of the program on a dual SMP node. In the case of EP, two copies run virtually as fast as one copy, indicating perfect speed-up. Our first general conclusion is:

In general, the Intel compiler produced a lower speed-up for most tests. We attribute this result to the Intel compiler generating better-optimized code for the CPU, which in turn increases the contention for memory access. This result leads to our second general conclusion:

While the cost of adding a second CPU seems minimal, it may lead to a false sense of efficiency for the cluster. Indeed, the tests indicate that a program that normally takes 20 minutes on a single CPU node may take as long as 40 minutes on an active dual CPU node. This situation may be further compounded by the fact that the batch scheduler may place a program on the first or second processor of any node and thus provide the program with a very heterogeneous memory contention environment (i.e., some nodes may have large memory contention and others may have none).
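The 20-versus-40-minute figure can be checked with a small calculation. The sketch below is illustrative only and assumes that speed-up is defined as the throughput of two copies sharing a dual CPU node relative to one copy on a single CPU, so that a speed-up of 2.00 means no slowdown and 1.00 means each copy takes twice as long. The function name `contended_runtime` is hypothetical and not part of the test script.

```python
def contended_runtime(single_cpu_minutes, speedup):
    """Estimate the wall-clock minutes for each of two copies sharing a dual CPU node.

    Assumption (not stated in this section): speedup = 2 * T_single / T_dual,
    where T_dual is the time for two concurrent copies to finish.
    """
    if not 1.0 <= speedup <= 2.0:
        raise ValueError("speedup is expected to lie between 1.0 and 2.0")
    return 2.0 * single_cpu_minutes / speedup


# Speed-up values quoted in the discussion: the two minima, the two averages,
# and the best case.
for s in (1.05, 1.10, 1.38, 1.52, 2.00):
    print(f"speed-up {s:.2f}: a 20 minute job takes about "
          f"{contended_runtime(20, s):.1f} minutes")
```

Under this assumed definition, the lowest measured speed-up (1.05) stretches a 20 minute job to roughly 38 minutes, while the best case (2.00) leaves it at 20 minutes.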

In these tests, we have not considered communication issues or the mix of different programs on the same node. These issues will be addressed in upcoming reports. In addition, we have run the tests on only a single hardware platform with two compilers.


Douglas Eadline 2003-03-24