Server benchmarking is an odd domain. There are lots of very well developed, non-proprietary benchmarks available, but there are also teams of engineers with huge resources at companies like Intel, AMD, HP and Sun working to make their products look as good as possible. Stuck in the middle are system engineers and IT managers who don’t really have any good sources for objective and practically usable performance metrics to base design and purchasing decisions on.
We try to cut through that nonsense wherever possible and provide transparent, useful comparisons of new hardware, but that’s especially difficult for server products. Real-world server workloads are complex, and synthetic benchmarks are often not indicative of practical performance. Server CPUs are especially hard to test. Modern processors are so fast that it takes hundreds of concurrent connections to find the performance ceiling of the CPU, and you’re likely to run into network, memory or other I/O bottlenecks before you do. Since one thing TR doesn’t have is a 200-client test lab, we have to make do with more synthetic means of comparison.
One aspect of server workloads, especially prevalent in web services, is dealing with XML documents. XML has become the data lingua franca spoken between programs running across different hardware platforms, operating systems and programming languages. Parsing, generating and transforming XML text, by themselves, are CPU tasks that rely only on communicating with main memory, so they’re good candidates for creating a synthetic benchmark.
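To make the "CPU and memory only" point concrete, here is a minimal sketch in Python (not the C# used for the actual benchmark). The sample document and the function names `parse_unit` and `generate_unit` are made up for illustration. Everything operates on strings already held in memory, so no disk or network I/O is involved.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document, built and held entirely in memory.
SAMPLE_XML = "<orders>" + "".join(
    f'<order id="{i}"><qty>{i % 5}</qty></order>' for i in range(1000)
) + "</orders>"


def parse_unit(text):
    """Parse an XML string and return the number of <order> elements."""
    root = ET.fromstring(text)
    return len(root.findall("order"))


def generate_unit(count):
    """Build a document tree in memory and serialize it to a string."""
    root = ET.Element("orders")
    for i in range(count):
        order = ET.SubElement(root, "order", id=str(i))
        ET.SubElement(order, "qty").text = str(i % 5)
    return ET.tostring(root, encoding="unicode")
```

Calling `parse_unit(SAMPLE_XML)` or `generate_unit(1000)` in a loop keeps the processor busy without touching anything slower than main memory, which is the property that makes this kind of work attractive for a synthetic CPU test.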
I had originally hoped to find a public XML benchmark that Scott could add to server reviews as-is, but my search didn’t turn up anything usable. The open source XML Benchmark came closest, but wasn’t anything we could readily reduce to a meaningful comparison. Resigned to writing one mostly from scratch, I decided to fill another gap in our lineup: the lack of any benchmark testing the performance of code running inside Microsoft’s .NET framework. Microsoft has been making huge strides with ASP.NET, and winning many converts from Java J2EE-based development (if you struggled through that Wikipedia link, maybe you can start to understand why).
I took four of the basic units of work in XML Benchmark and ported them to C#. After some back-and-forth with Scott we arrived at a framework that allows running a variable number of iterations of each work unit (or a mix of them) across a variable number of threads. The program reports how long it took for all threads to finish, as well as more detailed statistics about the CPU time spent across all threads, and the average runtime of the work units, both aggregated and broken out by type. We only reported the total start-to-finish time in the Shanghai review because we couldn’t figure out a good way to describe the more granular results.
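For readers who want a feel for the shape of such a framework, here is a rough sketch in Python. It is not the published C# source: the names `run_benchmark`, `WORK_UNITS`, `parse_unit` and `generate_unit`, the tiny test document, and the two stand-in work units are all assumptions for illustration. It runs a chosen mix of work units for a set number of iterations on each of several threads, then reports wall-clock time, process CPU time, and the average runtime per work unit type.

```python
import threading
import time
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical in-memory test document (the real program ships its own XML files).
DOC = "<items>" + "".join(f"<item n='{i}'/>" for i in range(200)) + "</items>"


def parse_unit():
    ET.fromstring(DOC)


def generate_unit():
    root = ET.Element("items")
    for i in range(200):
        ET.SubElement(root, "item", n=str(i))
    ET.tostring(root)


# Stand-ins for the benchmark's work units; the names are illustrative only.
WORK_UNITS = {"parse": parse_unit, "generate": generate_unit}


def run_benchmark(unit_names, iterations, thread_count):
    timings = defaultdict(list)  # unit name -> per-call runtimes in seconds
    lock = threading.Lock()

    def worker():
        local = defaultdict(list)
        for _ in range(iterations):
            for name in unit_names:
                start = time.perf_counter()
                WORK_UNITS[name]()
                local[name].append(time.perf_counter() - start)
        with lock:  # merge this thread's samples into the shared results
            for name, samples in local.items():
                timings[name].extend(samples)

    threads = [threading.Thread(target=worker) for _ in range(thread_count)]
    cpu_start = time.process_time()   # CPU time of the whole process
    wall_start = time.perf_counter()  # start-to-finish clock
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return {
        "wall_seconds": time.perf_counter() - wall_start,
        "cpu_seconds": time.process_time() - cpu_start,
        "avg_seconds": {n: sum(s) / len(s) for n, s in timings.items()},
        "counts": {n: len(s) for n, s in timings.items()},
    }
```

A call such as `run_benchmark(["parse", "generate"], iterations=100, thread_count=8)` returns the total start-to-finish time alongside the more granular numbers. One caveat about the sketch: CPython's global interpreter lock keeps these threads from running in parallel, so it shows the structure of the harness and its measurements, not the multi-core scaling a .NET program would see.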
Because we want our results to be independently reproducible and verifiable, I’m publishing the source code for the XML benchmark program. We’re planning on refining it over time, as well, so please discuss possible improvements in the comments, or email me directly with feedback or patches. I developed it in Visual Studio 2008, but it should also compile in the excellent, and free, Visual C# 2008 Express Edition. The program relies on some test XML files; to maintain comparable results I’ve included the ones we used for the article with the source code.
I haven’t included a license statement, because I’m not a lawyer and I honestly have no idea what license this should fall under. If you want to redistribute it, let me know and we can almost certainly put it under the GPL in an official way.