SPECjbb2005 simulates the role a server would play executing the "business logic" in the middle of a three-tier system, with clients at the front end and a database server at the back end. The logic executed by the test is written in Java and runs in a JVM. This benchmark tests scaling from one to many threads, although its main score is largely a measure of peak throughput.
As you may know, system vendors spend tremendous effort attempting to achieve peak scores in benchmarks like this one, which they then publish via SPEC. We have used a relatively fast JVM, the 64-bit version of Oracle's JRockit JRE, and we've tuned each system reasonably well. Still, it was not our intention to match the best published scores, a feat we probably couldn't accomplish without access to the IBM JVM, which looks to be the fastest option at present. Similarly, although we've worked to be compliant with the SPEC run rules for this benchmark, we have not done the necessary work to prepare these results for publication via SPEC, nor do we intend to do so. Thus, these scores should be considered experimental, research-mode results only.
We've documented the command-line options used for most of the test systems in our Xeon 5600 review. For the Dell R810, we used the following command-line options:
Xeons 16 core/32 thread/128GB/8 instances:
start /AFFINITY [F0000000, 0F000000, 00F00000, 000F0000, 0000F000, 00000F00, 000000F0, 0000000F] JAVAOPTIONS=-Xms3900m -Xmx3900m -Xns3260m -XXaggressive -Xlargepages:exitOnFailure=true -Xgc:genpar -XXgcthreads:8 -XXcallprofiling -XXtlasize:min=4k,preferred=1024k
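Each of the eight affinity masks listed above is a hexadecimal bitmask with four adjacent bits set, so each JVM instance is confined to four of the system's 32 logical processors. As an illustration only (this is not part of our test scripts, and the function name and parameters are hypothetical), here is a minimal Python sketch that derives the same list of masks:

```python
def affinity_masks(total_threads=32, instances=8):
    """Return one hex affinity mask per instance, highest bits first.

    Assumes total_threads divides evenly among instances and that
    total_threads is a multiple of 4 (one hex digit covers 4 bits).
    """
    per_instance = total_threads // instances   # logical CPUs per JVM
    block = (1 << per_instance) - 1             # e.g. 0b1111 for 4 CPUs
    width = total_threads // 4                  # hex digits in the full mask
    masks = [
        format(block << (i * per_instance), "0{}X".format(width))
        for i in range(instances)
    ]
    # Reverse so the order matches the list in the command line above.
    return masks[::-1]


if __name__ == "__main__":
    for mask in affinity_masks():
        print(mask)
```

With the defaults, the first mask is F0000000 and the last is 0000000F, matching the values passed to /AFFINITY.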
In keeping with the SPECjbb run rules, we tested at up to twice the optimal number of warehouses per system, with the optimal count being the total number of hardware threads.
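For the R810 configuration described above, that arithmetic works out to 32 warehouses at the optimal point and 64 at the top of the range. The short Python sketch below spells it out; the function name is hypothetical, and it assumes warehouses are split evenly across instances and stepped one per instance at a time, which is a simplification rather than a description of the benchmark harness itself.

```python
def warehouse_ramp(hardware_threads=32, instances=8):
    """Return (steps, optimal_per_instance) for a multi-instance run.

    steps is a list of (warehouses_per_instance, total_warehouses)
    pairs, ramping up to twice the optimal total.
    """
    optimal_total = hardware_threads            # one warehouse per hardware thread
    max_total = 2 * optimal_total               # test up to twice the optimal count
    optimal_per_instance = optimal_total // instances
    max_per_instance = max_total // instances
    steps = [(w, w * instances) for w in range(1, max_per_instance + 1)]
    return steps, optimal_per_instance


if __name__ == "__main__":
    steps, optimal = warehouse_ramp()
    for per_instance, total in steps:
        marker = " (optimal)" if per_instance == optimal else ""
        print("{} per instance, {} total{}".format(per_instance, total, marker))
```

With eight instances on 32 hardware threads, the ramp runs from one to eight warehouses per instance, with four per instance as the optimal point.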
In all cases, Windows Server's "lock pages in memory" setting was enabled for the benchmark user. In the Xeon systems' BIOSes, we disabled the "hardware prefetch" and "adjacent cache line prefetch" options.
Our 2P Xeon X7560 system pretty much outclasses the lower-priced options in SPECjbb, with substantially higher throughput in a single box than anything else we've tested. The X7560's performance peaks at eight instances with four warehouses each, or 32 threads, as one might expect. Unlike the Westmere Xeons, though, the X7560 server's performance doesn't drop substantially after moving past that point.