SPECjbb2005
SPECjbb2005 simulates the role a server would play executing the "business logic" in the middle of a three-tier system, with clients at the front end and a database server at the back end. The logic executed by the test is written in Java and runs in a JVM. This benchmark tests scaling from one thread to many, although its main score is largely a measure of peak throughput.

As you may know, system vendors spend tremendous effort attempting to achieve peak scores in benchmarks like this one, which they then publish via SPEC. We did not intend to challenge the best published scores with our results, but we did hope to achieve reasonably optimal tuning for our test systems. We used a fast JVM, the 64-bit version of Oracle's JRockit JRE P28.0, and picked up some tuning tweaks from recently published results. We ran two JVM instances on all systems (one per socket), with the following command line options:

start /AFFINITY [FC0, 03F] java -Xms3900m -Xmx3900m -Xns3260m -XXaggressive -Xlargepages:exitOnFailure=true -Xgc:genpar -XXgcthreads:6 -XXcallprofiling -XXtlasize:min=4k,preferred=1024k

Those are the exact options used with the Istanbul Opteron system. They varied for the other two systems in a couple of ways. Notice that we used the Windows "start" command to pin each JVM's threads to a single socket; the two bracketed values are the affinity masks for the two JVM instances. For the Xeon X5550 system with 16 threads, we used masks [FF00, 00FF], and for the Shanghai Opterons, we used [F0, 0F]. We also adjusted the number of garbage collector threads (-XXgcthreads) for each JVM to match the number of hardware threads per socket. In keeping with the SPECjbb run rules, we tested at up to twice the optimal number of warehouses per system, with the optimal count being the total number of hardware threads.
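
The masks and warehouse counts follow a simple pattern, which can be sketched in a few lines of Python. This is only an illustration, not part of the benchmark setup: the function names are hypothetical, and the sketch assumes that each socket's logical CPUs are numbered contiguously, as the masks listed above imply.

```python
def socket_affinity_masks(sockets, threads_per_socket):
    """Return one hex affinity mask per socket, highest socket first.

    Assumes logical CPUs are numbered contiguously within each socket.
    """
    total = sockets * threads_per_socket
    width = (total + 3) // 4  # hex digits needed to cover every logical CPU
    masks = []
    for s in reversed(range(sockets)):
        # A run of threads_per_socket set bits, shifted to this socket's CPUs.
        mask = ((1 << threads_per_socket) - 1) << (s * threads_per_socket)
        masks.append(format(mask, "0{}X".format(width)))
    return masks


def warehouse_range(sockets, threads_per_socket):
    """Return (optimal, maximum tested) warehouse counts for a system.

    Optimal is the total hardware thread count; testing runs to twice that.
    """
    optimal = sockets * threads_per_socket
    return optimal, 2 * optimal


print(socket_affinity_masks(2, 6))  # Istanbul: ['FC0', '03F']
print(socket_affinity_masks(2, 8))  # Xeon X5550: ['FF00', '00FF']
print(socket_affinity_masks(2, 4))  # Shanghai: ['F0', '0F']
print(warehouse_range(2, 6))        # Istanbul: (12, 24)
```

The threads-per-socket figure is also the value passed to -XXgcthreads for each JVM: 6 for Istanbul, 8 for the X5550, and 4 for Shanghai.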

In all cases, Windows Server's "lock pages in memory" setting was enabled for the benchmark user. In the X5550 system's BIOS, we disabled the "hardware prefetch" and "adjacent cache line prefetch" options.

Since this is a new round of tests with an updated JVM, we've limited our scope to the three most relevant CPU types.

Even with six cores, the Opteron 2435 can't match the Xeon X5550 in SPECjbb2005. Istanbul does bring substantial progress over Shanghai, however, closing the gap quite a bit. Things become more interesting when we bring power use into the picture, as we're about to do.