I told myself I'd try to keep pace with any developments across the web related to our frame-latency-based game benchmarking methods, but I've once again fallen behind. That's good, in a way, because there's lots going on. Let me try to catch you up on the latest with a series of hard-hitting bullet points, not necessarily in the order of importance.
Also, a word on words. Although I'm reading a Google translation, I can see that they used the word "microstuttering" to describe the frame latency issues on the Radeon. For what it's worth, I prefer to reserve the term "microstuttering" for the peculiar sort of problem often encountered in multi-GPU setups where frame times oscillate in a tight, alternating pattern. That, to me, is "jitter," too. Intermittent latency spikes are problematic, of course, but aren't necessarily microstuttering. I expect to fail in enforcing this preference anywhere beyond TR, of course.
The colored overlays that track frame delivery are nifty, but I'm pleased to see Ryan looking at frame contents rather than just frame delivery, because what matters to animation isn't just the regularity with which frames arrive at the display. The content of those frames is vital, too. As Andrew Lauritzen noted in his B3D post, disrupted timing in the game engine can interrupt animation fluidity even if buffering manages to keep frames arriving at the display at regular intervals.
Those folks who are still wary of using Fraps because it writes a timestamp at a single point in the process will want to chew on the implications of that statement for a while. Another implication: we'll perhaps always need to supplement any quantitative results with qualitative analysis in order to paint the whole picture. So... this changes nothing!
Although it may be confusing to some folks, we will probably keep talking about frame rendering in terms of latency, just as we do with input lag. That's because I continue to believe game performance is fundamentally a frame-latency-based problem. We just need to remember which type of latency is which—and that frame latency is just a subset of the overall input-response chain.
That's all for now, folks. More when it happens.

As the second turns: the web digests our game testing methods
A funny thing happened over the holidays. We went into the break right after our Radeon vs. GeForce rematch and follow-up articles had caused a bit of a stir. Also, our high-speed video had helped to illustrate the problems we'd identified with smooth animation, particularly on the Radeon HD 7950. All of this activity brought new attention to the frame latency-focused game benchmark methods we proposed in my "Inside the second" article over a year ago and have been refining since.
As we were busy engaging in the holiday rituals of overeating and profound regret, a number of folks across the web were spending their spare time thinking about latency-focused game testing, believe it or not. We're happy to see folks seriously considering this issue, and as you might expect, we're learning from their contributions. I'd like to highlight several of them here.
Perhaps the most notable of these contributions comes from Andrew Lauritzen, a Tech Lead at Intel. According to his home page, Andrew works "with game developers and researchers to improve the algorithms, APIs and hardware used for real-time rendering." He also occasionally chides me on Twitter. Andrew wrote up a post at Beyond3D titled "On TechReport's frame latency measurement and why gamers should care." The main thrust of his argument is to support our latency-focused testing methods and to explain the need for them in his own words. I think he makes that case well.
Uniquely, though, he also addresses one of the trickier aspects of latency-focused benchmarking: how the graphics pipeline works and how the tool that we've been using to measure latencies, Fraps, fits into it.
As we noted here, Fraps simply writes a timestamp at a certain point in the frame production pipeline, multiple stages before that frame is output to the display. Many things, both good and bad, can happen between the hand-off of the frame from the game engine and the final display of the image on the monitor. For this reason, we've been skittish about using Fraps-based frame-time measurements with multi-GPU solutions, especially those that claim to include frame metering, as we explained in our GTX 690 review. We've proceeded to use Fraps in our single-GPU testing because, although its measurements may not be a perfect reflection of what happens at the display output, we think they are a better, more precise indication of in-game animation smoothness than averaging FPS over time.
Andrew addresses this question in some depth. I won't reproduce his explanation here; it's worth reading in its entirety, and it covers the issues of pipelining, buffering, and CPU/driver-GPU interactions. Interestingly, Andrew believes that in the case of latency spikes, buffered solutions may produce smooth frame delivery to the display. However, even if that's the case, the timing of the underlying animation is disrupted, which is just as bad:
This sort of "jump ahead, then slow down" jitter is extremely visible to our eyes, and demonstrated well by Scott's follow-up video using a high speed camera. Note that what you are seeing are likely not changes in frame delivery to the display, but precisely the effect of the game adjusting how far it steps the simulation in time each frame. . . . A spike anywhere in the pipeline will cause the game to adjust the simulation time, which is pretty much guaranteed to produce jittery output. This is true even if frame delivery to the display (i.e. rendering pipeline output) remains buffered and consistent. i.e. it is never okay to see spikey output in frame latency graphs.
Disruptions in the timing of the game simulation, he argues, are precisely what we want to avoid in order to ensure smooth gameplay—and Fraps writes its timestamps at a critical point in the process:
Games measure the throughput of the pipeline via timing the back-pressure on the submission queue. The number they use to update their simulations is effectively what FRAPS measures as well.
In other words, if Fraps captures a latency spike, the game's simulation engine likely sees the same thing, with the result being disrupted timing and less-than-smooth animation.
There's more to Andrew's argument, but his insights about the way game engines interact with the DirectX API, right at the point where Fraps captures its data points, are very welcome. I hope they'll help persuade folks who might have been unsure about latency-focused testing methods to give them a try. Andrew concludes that "If what we ultimately care about is smooth gameplay, gamers should be demanding frame latency measurements instead of throughput from all benchmarking sites."
With impeccable timing, then, Mark at AlienBabelTech has just published an article that asks the question: "Is Fraps a good tool?" He attempts to answer the question by comparing the frame times recorded by Fraps to those recorded by the tools embedded in several game engines. You can see Mark's plots of the results for yourself, but the essence of his findings is that the game engine and Fraps output are "so similar as to convey approximately the same information." He also finds that the results capture a sense of the fluidity of the animation. The frame time plot "fits very well in with the experience of watching this benchmark – a small chug at the beginning, then it settles down until the scene changes and lighting comes into play – smoothness alternates with slight jitter until we reach the last scene that settles down nicely."
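If you'd like to run a similar sanity check yourself, here's a minimal sketch (not Mark's actual method, which I haven't seen) for quantifying how closely two frame-time traces agree, assuming you've already parsed each log into a list of per-frame times in milliseconds:

```python
from statistics import mean

def compare_traces(fraps_ms, engine_ms):
    """Compare two frame-time traces (in milliseconds) from the same run.

    Returns the mean absolute difference and the Pearson correlation,
    a rough gauge of whether the two tools tell the same story.
    """
    n = min(len(fraps_ms), len(engine_ms))  # traces may differ by a frame or two
    a, b = fraps_ms[:n], engine_ms[:n]
    mad = mean(abs(x - y) for x, y in zip(a, b))
    ma, mb = mean(a), mean(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    var_a = sum((x - ma) ** 2 for x in a)
    var_b = sum((y - mb) ** 2 for y in b)
    return mad, cov / (var_a * var_b) ** 0.5
```

A correlation near 1.0 and a mean difference of a millisecond or two would support Mark's "approximately the same information" conclusion.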
With the usefulness of Fraps and frame-time measurements established, Mark says his next step will be to test a GeForce GTX 680 and a Radeon HD 7970 against each other, complete with high-speed video comparisons. We look forward to his follow-up article.
Speaking of follow-up, I know many of you are wondering how AMD plans to address the frame latency issues we've identified in several newer games. We have been working with AMD, most recently running another quick set of tests right before Christmas with the latest Catalyst 12.11 beta and CAP update, just to ensure the problems we saw weren't already resolved in a newer driver build. We haven't heard much back yet, but we noticed in the B3D thread that AMD's David Baumann says the causes of latency spikes are many—and he offers word of an impending fix for Borderlands 2:
There is no one single thing for it; it's all over the place - the app, the driver, allocations of memory, CPU thread priorities, etc., etc. I believe some of the latency with BL2 was, in fact, simply due to the size of one of the buffers; a tweak to it has improved it significantly (a CAP is in the works).
This news bolsters our sense that the 7950's performance issues were due to software optimization shortfalls. We saw spiky frame time plots with BL2 both in our desktop testing and in Cyril's look at the Radeon HD 8790M, so we're pleased to see that a fix could be here soon via a simple CAP update.
Meanwhile, if you'd like to try your hand at latency-focused game testing, you may want to know about an open-source tool inspired by our work and created by Lindsay Bigelow. FRAFS Benchmark Viewer parses and graphs the frame time data output by Fraps. I have to admit, I haven't tried it myself yet since our own internal tools are comfortingly familiar, but this program may be helpful to those whose Excel-fu is a little weak.
Finally, we have a bit of a debate to share with you. James Prior from Rage3D was making some noises on Twitter about a "problem" with our latency-focused testing methods, and he eventually found the time to write me an email with his thoughts. I replied, he replied, and we had a nice discussion. James has kindly agreed to the publication of our exchange, so I thought I'd share it with you. It's a bit lengthy and incredibly nerdy, so do what you will with it.
Here is James's initial email:
Alrighty, had some time to play with it and get some thoughts together. First of all, not knocking what you're doing - I think it's a good thing. When I said 'there's a big flaw,' here's what I'm thinking.
When I look at inside the second, the data presentation doesn't lend itself to supporting some of the conclusions. This is not because you're wrong but because I'm not sure of the connection between the two. Having played around with looking at 99% time, I think that it's not a meaningful metric in gauging smoothness of itself, it shows uneven render time but not the impact of that on game experience, which was the whole point. It's another way of doing 'X number is better than Y number'.
I agree with you that a smoothness metric is needed. I concur with your thoughts about FPS rates not being the be-all end-all, and 60fps vsync isn't the holy grail. The problem is the perception of smoothness, and quantifying that. If you have a 25% framerate variation at 45fps you're going to notice it more than a 25% framerate variation at 90fps. 99% time shows when you have a long time away from the average frame rate but not that the workload changes, so is naturally very dependent on the benchmark data, time period and settings.
What I would (and am, but it took me 2 weeks to write this email, I'm so time limited) aim for is to find a way to identify a standard deviation and look for ways to show that. So when you get a line of 20-22ms frames interrupted by a 2x longer frame time and possibly a few half as long frame times (the 22, 22, 58, 12, 12, 22, 22ms pattern) you can identify it, and perhaps count the number of times it happens inside the dataset.
Next up would be 'why' and that can start with game settings - changing MSAA, AO, resolution, looking for game engine bottlenecks and then looking at drivers and CPU config. People have reported stuttering frame rates from different mice, having HT enabled, having the NV AO code running on the AMD card (or vice versa).
In summary - I think the presentation of the data doesn't show the problem at the extent it's an issue for gamers. I think it's too simplistic to say 'more 99% time on card a, it's no good'. But that's an editorial decision for you, not me.
The videos of Skyrim were interesting but of no value to me. It's a great way to show people how to identify the problem, but unless you frame-sync the camera to the display and can find a way to reduce the losses of encoding to show it, it's not scientific. Great idea though, to help people understand what you're describing.
Thanks for being willing to listen, and have a Merry Christmas :)
My response follows:
Hey, thanks for finally taking time to write. Glad to see you've considered these things somewhat.
I have several thoughts in response to what you've written, but the first and most important one is simply to note that you've agreed with the basic premise that FPS averages are problematic. Once we reach that point and are talking instead about data presentation and such, we have agreed fundamentally and are simply squabbling over details. And I'm happy to give a lot of ground on details in order to find the best means of analyzing and presenting the data to the reader in a useful format.
With that said, it seems to me you've concentrated on a single part of our data presentation, the 99th percentile frame time, and are arguing that the 99th percentile frame time doesn't adequately communicate the "smoothness" of in-game animation.
I'd say, if you look at our work over the last year in total, you'd find that we're not really asking the 99th percentile frame time to serve that role exclusively or even primarily.
Before we get to why, though, let's establish another fundamental. That fundamental reality is that animation involves flipping through a series of frames in sequence (with timing that's complicated somewhat by its presentation on a display with a fixed refresh rate.) The single biggest threat to smooth animation in that context is delays or high-latency frames. When you wait too long for the next flip, the illusion of motion is threatened.
I'm much more concerned with high-latency frames than I am with variance from a mean, especially if that variance is on the low side of the mean. Although a series of, say, 33 ms frames might be the essence of "smoothness," I don't consider variations that dip down to 8 ms from within that stream to be especially problematic. As long as the next update comes quickly, the illusion of motion will persist and be relatively unharmed. (There are complicated timing issues here involving the position of underlying geometry at render time and display refresh intervals that pull in different directions, but as long as the chunks of time involved are small enough, I don't think they get much chance to matter.) Variations *above* the mean, especially big ones, are the real problem.
At its root, then, real-time graphics performance is a latency-sensitive problem. Our attempts to quantify in-game smoothness take that belief as fundamental.
Given that, we've borrowed the 99th percentile latency metric from the server world, where things like database transaction latencies are measured in such terms. As we've constantly noted, the 99th percentile is just one point on a curve. As long as we've collected enough data, though, it can serve as a reliable point of comparison between systems that are serving latency-sensitive data. It's a single sample point from a large data set that offers a quick summary of relative performance.
With that in mind, we've proposed the 99th percentile frame time as a potential replacement for the (mostly pointless) traditional FPS average. The 99th percentile frame time has also functioned for us as a companion to the FPS average, a sort of canary in the coal mine. When the two metrics agree, generally that means that frame rates are both good *and* consistent. When they disagree, there's usually a problem with consistent frame delivery.
So the 99th percentile does some summary work for us that we find useful.
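(For anyone who wants to compute that summary point from their own Fraps logs, here's an illustrative sketch using the nearest-rank method; our internal tooling may differ in details:)

```python
import math

def percentile_frame_time(frame_times_ms, pct=99):
    """Return the frame time (ms) at the given percentile.

    Uses the nearest-rank method: the smallest recorded frame time
    that covers at least pct% of all frames. With enough samples,
    the 99th percentile summarizes how bad the slowest ~1% of
    frames are.
    """
    ordered = sorted(frame_times_ms)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]
```

Sweeping `pct` from 50 to 100 and plotting the results yields the full latency curve mentioned below.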
But it is a summary, and it rules out the last 1% of slow frames, so I agree that it's not terribly helpful as a presentation of animation smoothness. That's why our data presentation includes:
1) a raw plot of frame times from a single benchmark run,
2) the full latency curve from 50-100% of frames rendered,
3) the "time spent beyond 50 ms" metric, and
4) sometimes zoomed-in chunks of the raw frame time plots.
*Those* tools, not the 99th percentile summary, attempt to convey more useful info about smoothness.
My favorite among them as a pure metric of smoothness is "time spent beyond 50 ms."
50 milliseconds is our threshold because at a steady state it equates to 20 FPS, which is pretty slow animation, where the illusion of motion is starting to be compromised. (The slowest widespread visual systems we have, in traditional cinema, run at 24 FPS.) Also, if you wait more than 50 ms for the next frame on a 60Hz display with vsync, you're waiting through more than *three* display refresh cycles. Bottom line: frame times over 50 ms are problematic. (We could argue over the exact threshold, but it has to be somewhere in this neighborhood, I think.)
At first, to quantify interruptions in smooth animation, we tried just counting the number of frames that take over 50 ms to render. The trouble with that is that a 51 ms frame counts the same as a 108 ms frame, and faster solutions can sometimes end up producing *more* frames over 50 ms than slower ones.
To avoid those problems, we later decided to account for how far the frame times are over our threshold. So what we do is add up all of the time spent rendering beyond our threshold. For instance, a 51 ms frame adds 1 ms to the count, while an 80 ms frame adds 30 ms to our count. The more total time spent beyond the threshold, the more the smoothness of the animation has been compromised.
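In code, that accounting is simple. A minimal sketch, assuming each run has been parsed into a list of per-frame times in milliseconds:

```python
from statistics import median

def time_beyond(frame_times_ms, threshold_ms=50.0):
    """Total time (ms) spent rendering beyond the threshold.

    A 51 ms frame adds 1 ms to the count; an 80 ms frame adds 30 ms.
    Frames at or under the threshold add nothing.
    """
    return sum(t - threshold_ms for t in frame_times_ms if t > threshold_ms)

def median_time_beyond(runs, threshold_ms=50.0):
    """Median "time beyond X" across several runs of the same test."""
    return median(time_beyond(run, threshold_ms) for run in runs)
```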
It's not perfect, but I think that's a pretty darned good way to account for interruptions in smoothness. Of course, the results from these "outlier" high-latency frames can vary from run to run, so we take the "time beyond X" for each of the five test runs we do for each card and report the median result.
In short, I don't disagree entirely with your notion that the 99th percentile frame time doesn't tell you everything you might need to know. That's why our data presentation is much more robust than just a single number, and why we've devised a different metric that attempts to convey "smoothness"--or the lack of it.
I'd be happy to hear your thoughts on alternative means of analyzing and presenting frame time data. Once we agree that FPS averages hide important info about slowdowns, we're all in the same boat, trying to figure out what comes next. Presenting latency-sensitive metrics is a tough thing to do well for a broad audience that is accustomed to much simpler metrics, and we're open to trying new things that might better convey a sense of the realities involved.
And here is James's reply:
First up, yes I absolutely agree that FPS averages aren't the complete picture. Your cogent and comprehensive response details the thinking behind your methodology very nicely. You are correct, I did choose to highlight 99% time as my first point, and your clarification regarding the additional data you review and present is well taken.
I agree with you about the 50ms/20fps ‘line in the sand’, for watching animated pictures. My personal threshold for smoothness in movies is about 17-18, my wife’s is 23.8. For gaming however, I find around 35fps / 29ms per frame is where I get pissed off and call it an unplayable slideshow unless it is an RTS - I was prepared to hate C&C locked at 30fps but found it quite pleasant. This was based on not only animation smoothness but smoothness of response to input. Human perception is a funny thing, it changes with familiarity and temperament.
So on that basis I concur that dipping from 22ms to 50ms is perceptible in ‘palm of the hand’ and the 99% plus 50ms statistics identify that nicely. Where I disagree with you is on the idea that moving from 22ms to say 11ms isn't noticeable, especially if it lasts an experientially significant amount of time for the latency consumer - the player. Running along at 22ms and switching to 11ms probably won’t be perceived badly, but the regression back to 22ms might be, especially if it happens frequently. I experienced this first hand when I benched Crossfire 7870’s in Eyefinity, with VECC added SSAA. The fraps average was high, in the 60’s, the min was around 38. The problem was the feel: it looked smooth, but the response to input was terrible. The perceived average FPS was closer to the minimum and wasn’t smooth, and so despite being capable of stutter-free animation, the playability was ruined due to frame rate variation from 38fps to ~90fps. The problem ended up being memory bandwidth, as increasing clocks improved the feel and reduced the variation; this was reinforced by moving from SSAA to AAA and standard MSAA; the less intensive modes were silky smooth, AAA being in the same general performance range.
This can be observed on the raw frame rate graph - a sawtooth pattern will be seen if the plot resolution is right - but when examining a plot covering perhaps minutes of data, showing tens of frame render times per second, you need a systematic approach for consistency and for the time cost to the analyst.
The obvious answer is to restrict your input data, find a benchmark session that doesn’t do that but then you end up with a question of usefulness to your latency processor again - the player. Does the section of testing represent the game fairly? Is the provided data enough for someone to know that the card will cope with the worst case scenarios of the game, is there enough data for each category of consumer - casual player, IQ/feature enthusiast, game enthusiast, performance enthusiast, competitive gamer, system builder, family advisor, mom upgrading little Jonny’s gateway - to understand the experience?
Servers talk to servers, games talk to people. We can base analysis methodology on what comes from the server world, and then move on to finding a way to consistently quantify the experience so that the different experience levels show through.
I'll confess I still owe him a response to this message. We seem to have ended up agreeing on the most important matters at hand, though, and the issues he raises in his reply are a bit of a departure from our initial exchange. It seems to me James is thinking in the right terms, and I look forward to seeing how he implements some of these ideas in his own game testing in the future.
You can follow me on Twitter for more nerdy exchanges.

Freshening up a home network can yield big bandwidth benefits
One of the funny things about being a PC enthusiast, for me, is how there's a constant ebb and flow of little projects that I end up tackling. At one point, I may be busy updating and tuning my HTPC, and shortly after that's finished, I'm on to something else. One way or another, it seems I'm almost always trying to fix or improve something.
My project lately has been optimizing my home network. By nature, my hardware testing work requires me to move lots of data around, whether it's deploying images to test rigs, downloading new games from Steam, or uploading videos to YouTube. I've noticed that I spend quite a bit of time waiting on various data transfer operations. Within certain limits, that's probably an indicator that some money could be well spent on an upgrade.
The first step in the process was getting my cable modem service upgraded. I'm too far out in the 'burbs to partake of the goodness of Google Fiber happening in downtown Kansas City, so I'm stuck with Time Warner Cable.
For a while, I'd been paying about 60 bucks a month for Road Runner "Turbo" cable modem service with a 15Mbps downstream and a 1Mbps upstream. We use a host of 'net based services like Netflix and Vonage, along with the aforementioned work traffic and hosting a Minecraft server for my kids, so both the upstream and downstream were feeling sluggish at times.
Time Warner Cable's website told me I could get 20Mbps downstream and 2Mbps upstream for $49.99 a month here in my area. There's also an option for 30Mbps down and 5Mbps up for $59.99. I was vaguely aware that my old-ish cable modem would have to be replaced with a newer model to enable the higher speed service, so I disconnected the modem and headed to the local Time Warner store, hoping to exchange it and upgrade my service.
When I got there, the salesperson informed me I could upgrade, but insisted that I'd need to pay an additional $15 per month above my current rate in order to get 20Mbps/2Mbps service. I asked if she was sure about that and whether there were any better pricing options, but she insisted. As she typed away, beginning the service change, I pulled up the Time Warner website on my phone, attempting to get that pricing info—which was conveniently hidden on the mobile site. I fumbled for a while as she kept typing, because apparently service tier changes require a 25-page written report. Only after my third inquiry, some bluster from me, and a whole lot more typing did she decide that she could give me the $49.99 price for 20Mbps/2Mbps service.
I later talked another rep into switching me to the 30Mbps/5Mbps service for $59.99, instead. Heh.
Anyhow, I eventually came home with a rather gigantic new cable modem and, for about the same price I'd been paying before, started enjoying double the downstream bandwidth and 5X the upstream. The difference is very much noticeable in certain cases, such as Steam downloads and YouTube uploads.
I suppose the morals of this story are: 1) if you have an older cable modem, you may be able to get faster service by swapping it out for a newer one, thanks to newer DOCSIS tech, and 2) you may also be eligible for better pricing if you do some research and prod your service provider sufficiently. Don't just take what they're giving you now or even the newer options they're offering to existing customers. Look into the offers they're making to new customers, instead, and insist on the best price.
Only days after I'd posted my shiny new Speedtest.net results on Twitter, I turned my attention to our internal home network. Although I really like my Netgear WNDR3700 router, we've never used it to its full potential. The 5GHz band is practically empty, either due to lack of device support or range issues. Signals in that band just won't reach reliably into most of the bedrooms, so it's a no-go for anything mobile.
The range is great on our 2.4GHz network, but transfer rates are kind of pokey. There are many reasons for that. At the top of the list is a ridiculous number of devices connected at any given time. Between phones, tablets, PCs, and other devices, I can count 12 off the top of my head right now. There may be more.
You may be in the same boat. I didn't plan for this; it just happened.
Also, we have a silly number of other devices throwing off interference in the 2.4GHz range, including wireless mice, game controllers, Bluetooth headsets, the baby monitor, apparently our microwave oven, and probably a can opener or something, too.
One particular client system, my wife's kitchen PC, really needed some help. We store all of our family photos and videos on my PC, and my wife accesses them over the Wi-Fi network. As the megapixel counts for digital cameras have grown, so has her frustration. The process of pulling up thumbnails in a file viewer was excruciating.
Her system had a 2.4GHz 802.11g Wi-Fi adapter in it, which caused several problems. One was its own inherent limit of 54Mbps peak transfer rates. The other was the fact that, in order to best accommodate it and other older Wi-Fi clients, I had switched my router's 2.4GHz Wi-Fi mode from its "Up to 130Mbps" default mode to "Up to 54Mbps"—that setting seemed to help the Kitchen PC, but at the cost of lower peak network speeds for wireless-n clients.
This problem should have been solved ages ago, but it had momentum on its side. The Kitchen PC's motherboard had a built-in Wi-Fi adapter with a nice integrated antenna poking out of the port cluster, and I was reluctant to change it. However, a quick audit of the devices on our network revealed something important: the Kitchen PC's 802.11g adapter was the only 802.11b/g client left on our network. Replacing its Wi-Fi adapter wouldn't just speed up its connection; it would also allow me to experiment with the higher-bandwidth 2.4GHz modes on my router.
Once I resolved to make a change, it was like the girls from Jersey Shore: stupidly cheap and easy. I decided to measure the impact of various options by noting the speed of Windows file copy to the Kitchen PC. With its built-in 802.11g adapter, which has a stubby antenna attached, file copies averaged 2MB/s.
I then disabled the internal adapter and switched to an insanely tiny USB-based 802.11n adapter that I happened to have on hand. These things cost ten bucks and have zero room for an antenna, but they seem to work. I also switched the router to "Up to 130Mbps" mode on the 2.4GHz band, since the last legacy device was gone. The changes didn't help much; copies averaged 1.88MB/s, practically the same. However, when I flipped the router into its 20/40MHz channel mode ("Up to 300Mbps"), transfer rates more than doubled, to 5MB/s.
Better, but not great.
To really improve, I needed to make use of that practically empty 5GHz bandwidth. As a stationary system not far from the router, the Kitchen PC was a perfect candidate. I ordered up a Netgear dual-band USB Wi-Fi adapter—20 bucks for a refurb—to make it happen. This adapter is large enough to have a decent-sized internal antenna, in addition to the dual-band capability. Once it was installed, Windows file copy speeds on the 5GHz band (in 20/40MHz mode) were a steady 14MB/s—fully seven times what they were initially. And that's with just four out of five bars of signal strength.
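If you'd like to run a similar before-and-after comparison, a few lines of Python can time a file copy and report average throughput. This is just a rough sketch with hypothetical paths; watching the Windows copy dialog works nearly as well:

```python
import os
import time

def copy_throughput(src, dst, chunk=1 << 20):
    """Copy src to dst in 1MB chunks and return average throughput in MB/s."""
    start = time.perf_counter()
    with open(src, "rb") as fin, open(dst, "wb") as fout:
        while True:
            block = fin.read(chunk)
            if not block:
                break
            fout.write(block)
    elapsed = time.perf_counter() - start
    return os.path.getsize(src) / (1024 * 1024) / elapsed

# Hypothetical example: copy a large test file to a network share
# rate = copy_throughput(r"C:\test\big_file.bin", r"\\KITCHEN-PC\share\big_file.bin")
```

Use a file of at least a few hundred megabytes so that caching and startup overhead don't dominate the measurement.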
There are a couple of lessons here, too, I think. First, wireless-b and -g devices are really stinkin' old, and moving to better adapter hardware is worth the modest cost involved. Getting rid of those old clients may even help speed up your whole network. Second, if you have a dual-band router with lots of clients, make use of that 5GHz bandwidth where possible, especially on stationary systems that are in range of the base station.
Of course, the big takeaway for this entire episode was this: devoting some attention to your home network can yield some nice benefits, especially if you've neglected it a bit. And heck, I haven't even started down the path to 802.11ac. Yet.

AMD attempts to shape review content with staged release of info
Review sites like TR are a tricky business, let me say up front. We work constantly with the largest makers of PC hardware in order to bring you timely reviews of the latest products. Making that happen, and keeping our evaluations fair and thorough, isn't easy in the context of large companies engaging in cutthroat competition over increasingly complex technologies.
I know for a fact that many folks who happen across TR's reviews are deeply skeptical about the whole enterprise, and given everything that goes on in the shadier corners of the web, they have a right to be. That said, we have worked very hard over the years to maintain our independence and to keep our readers' interests first among our priorities, and I think our regular audience will attest to that fact.
At its heart, the basic arrangement that we have with the largest PC chip companies is simple. In exchange for early access to product samples and information, we agree to one constraint: timing. That is, we agree not to post the product information or our test results until the product's official release.
That's it, really.
There are a few other nuances, such as the fact that we're released from that obligation if the information becomes public otherwise, but they only serve to limit the extent of the agreement.
In other words, we don't consent to any other constraint that would compromise our editorial independence. We don't guarantee a positive review; we don't agree to mention certain product features; and we certainly don't offer any power over the words we write or the results we choose to publish. In fact, by policy, these companies only get to see our reviews of their products when you do, not before.
If you're familiar with us, we may be covering well-trodden ground here, but bear with me. Our status as an independent entity is key to what we do. Most of the PR types we work with tend to understand that fact, so we usually get along pretty well. There's ample room for dialog and persuasion about the merits of a particular product, but ultimately, we offer our own opinions. In fact, the basic arrangement we have with these firms has been the same for most of the 13 years of our existence, even during the darkest days of Intel's Pentium 4 fiasco.
You can imagine my shock, then, upon receiving an e-mail message last week that attempted to re-write those rules in a way that grants a measure of editorial control to a company whose product we're reviewing. What AMD is doing, in quasi-clever fashion, is attempting to shape the content of reviews by dictating a two-stage plan for the release of information. In doing so, they grant themselves a measure of editorial control over any publication that agrees to the plan.
In this case, the product in question is the desktop version of AMD's Trinity APUs. We received review samples of these products last week, with a product launch date set for early October. However, late last week, the following e-mail from Peter Amos, who works in AMD's New Product Review Program, hit our inbox:
We are allowing limited previews of the embargoed information to generate additional traffic for your site, and give you an opportunity to put additional emphasis on topics of interest to your readers. If you wish to post a preview article as a teaser for your main review, you may do so on September 27th, 2012 at 12:01AM EDT.
The topics which you are free to discuss in your preview articles starting September 27th, 2012 at 12:01AM EDT are any combination of:
- Gaming benchmarks (A10, A8)
- Speeds, feeds, cores, SIMDs and branding
- Experiential testing of applications vs Intel (A10 Virgo will be priced in the range of the i3 2120 or i3 3220)
- Power testing
We believe there are an infinite number of interesting angles available for these preview articles within this framework.
We are also aware that your readers expect performance numbers in your articles. In order to allow you to have something for the preview, while maintaining enough content for your review, we are allowing the inclusion of gaming benchmarks.
By allowing the publication of speeds, feeds, cores, SIMDs and branding during the preview period, you have the opportunity to discuss the innovations that AMD is making with AMD A-Series APUs and how these are relevant to today’s compute environment and workloads.
In previewing x86 applications, without providing hard numbers until October [redacted], we are hoping that you will be able to convey what is most important to the end-user which is what the experience of using the system is like. As one of the foremost evaluators of technology, you are in a unique position to draw educated comparisons and conclusions based on real-world experience with the platform.
The idea here is for AMD to allow a "preview" of the product that contains a vast swath of the total information that one might expect to see in a full review, with a few notable exceptions. Although "experiential testing" is allowed, sites may not publish the results of non-gaming CPU benchmarks.
The email goes on to highlight a few other features of the Socket FM2 platform before explaining what information may be published in early October:
The topics which you must be held for the October [redacted] embargo lift are:
- Non game benchmarks
The email then highlights each of these topic areas briefly. Here's what it says about the temporarily verboten non-gaming benchmarks:
Non game benchmarks
- Traditional benchmarks are designed to highlight differences in different architectures and how they perform. We understand that this is a useful tool for you and that your readers expect to see this data. The importance of these results is in your evaluation, as the leading experts, of what these performance numbers mean. We encourage you to use your analysis if you choose to publish a preview article and if you find that to be appropriate to your approach to that article. The numbers themselves must be held until the October [redacted] embargo lift. This is in an effort to allow consumers to fully comprehend your analysis without prejudging based on graphs which do not necessarily represent the experiential difference and to help ensure you have sufficient content for the creation of a launch day article.
Now, we appreciate that AMD is introducing this product in an incredibly difficult competitive environment. We're even sympathetic to the idea that the mix of resources included in its new APU may be more optimal for some usage patterns, as our largely positive review of the mobile version of Trinity will attest. We understand why they might wish to see "experiential testing" results and IGP-focused gaming benchmarks in the initial review that grabs the headlines, while burying the CPU-focused benchmarks on a later date. By doing so, they'd be leading with the product's strengths and playing down its biggest weakness.
And it's likely to work, I can tell you from long experience, since the first article about a subject tends to capture the buzz and draw the largest audience. A second article a week later? Not so much. Heck, even if we hold back and publish our full review later (which indeed is our plan), it's not likely to attract as broad a readership as it would have on day one, given the presence of extensive "previews" elsewhere.
Yes, AMD and other firms have done limited "preview" releases in the past, where select publications are allowed to publish a few pictures and perhaps a handful of benchmark numbers ahead of time. There is some slight precedent there.
But none of that changes the fact that this plan is absolutely, bat-guano crazy. It crosses a line that should not be crossed.
Companies like AMD don't get to decide what gets highlighted in reviews and what doesn't. Using the review press's general willingness to agree on one thing—timing—to get additional control may seem clever, but we've thought it over, and no. We'll keep our independence, thanks.
The email goes on to conclude by, apparently, anticipating such a reaction and offering a chance for feedback:
We are aware that this is a unique approach to product launches. We are always looking at ways that we can work with you to help drive additional traffic to your articles and effectively convey the AMD message. We strive to provide the best products in their price points, bringing a great product for a great price. Please feel free to provide feedback on what you find, both with the product and with your experience in the AMD New Product Review Program. We try to ensure that we are providing you what you need and appreciate any feedback you have to offer on how we can do better.
I picked up the phone almost immediately after reading this paragraph and attempted to persuade both Mr. Amos and, later, his boss that this plan was not a good one. I was told that this decision was made not just in PR but at higher levels in the company and that my objections had been widely noted in internal emails. Unfortunately, although fully aware of my objections and of the very important basic principle at stake, AMD decided to go through with its plan.
Shame on them for that.
It's possible you may see desktop Trinity "previews" at other websites today that conform precisely to AMD's dictates. I'm not sure. I hope most folks have decided to refrain from participation in this farce, but I really don't know what will happen. I also hope that any who did participate will reconsider their positions after reading this post and thinking about what they're giving up.
And I hope, most of all, that the broader public understands what's at stake here and insists on a change in policy from AMD.
If this level of control from companies over the content of reviews becomes the norm, we will be forced to change the way we work with the firms whose products we review. We will not compromise our independence. We believe you demand and deserve nothing less.
Update: AMD has issued a statement on this matter.

A look at TR's new GPU test rigs
As I mentioned on the podcast this week, I have been working to re-fit Damage Labs with new hardware all around. Since I test desktop GPUs, desktop CPUs, and workstation/server CPUs, I have a number of test rigs dedicated to each area. Our desktop CPU and GPU systems have been the same for quite some time now. Heck, my massive stable of 30+ CPU results dates back to the Sandy Bridge launch. However, as time passes, new hardware and software replaces the old, and we must revamp our test systems in order to stay current. Oddly enough, we've just hit such an inflection point in all of the types of hardware I test pretty much at the same time. Normally, these things are staggered out a bit, which makes the change easier to manage.
Fortunately, though, I've been making solid progress on all fronts.
The first of my test rigs to get the treatment were my two graphics systems—identical, except that one is dedicated to Nvidia cards and the other to AMD Radeons, so we can keep video drivers for one type of GPU from causing problems for the other. Also, I can test two different configurations in parallel, which really helps with productivity when you're running scripted benchmarks and the like.
The old GPU rigs were very nice X58 systems that lasted for years, upgraded along the way from four cores to six and from hard drives to SSDs. They're still fast systems, but it was time for a change. Let me give you a quick tour of our new systems, and we'll talk about the reasons for the upgrade.
Behold, the new Damage Labs GPU test rig. Innit pretty? In the past, our open-air test rigs have sat on a motherboard box, with the PSU sitting on one side and the drives out front. This system, however, is mounted in a nifty open-air case that the folks at MSI happened to throw into a box with some other hardware they were shipping to us. I was intrigued and put the thing together, and it looks to be almost ideal for our purposes. I'm now begging MSI for more. If we can swing it, we may even give away one of these puppies to a lucky reader. That may be the only way to get one, since this rack apparently isn't a commercial product.
Here are a few more shots from different angles.
Nifty and pretty tidy, all things considered. Even takes up less room on the test bench.
Now, let's talk specs. I had several goals for this upgrade, including the transition to PCI Express 3.0, a lower noise floor for measuring video card cooler acoustics, and lower base system power draw. I think the components I've chosen have allowed me to achieve all three.
CPU and mobo: Intel Core i7-3820 and Gigabyte X79-UD3 - The X79 platform is currently the only option if you want PCIe 3.0 support. Of course, even after Ivy Bridge arrives with PCIe Gen3 for lower-end systems, the X79 will be the only platform with enough PCIe lanes to support dual-x16 or quad-x8 connectivity for multi-GPU rigs.
Obviously, the conversion to PCIe 3.0 essentially doubles the communications bandwidth available, but that's not all. The integration of PCIe connectivity directly into the CPU silicon eliminates a chip-to-chip "hop" in the I/O network and should cut latency substantially, even for graphics cards that only support PCIe Gen2.
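To put rough numbers on that bandwidth claim, here's a sketch using the published per-lane signaling rates and encoding schemes—nothing specific to this board, just the spec-sheet math:

```python
# Why "doubles the bandwidth" holds even though the signaling rate
# only climbs from 5 GT/s to 8 GT/s: PCIe 3.0 also swaps 8b/10b
# encoding (20% overhead) for 128b/130b (~1.5% overhead).
def lane_bandwidth_mb(gt_per_s, payload_bits, total_bits):
    """Effective MB/s per lane, per direction."""
    return gt_per_s * 1000 * payload_bits / total_bits / 8

gen2 = lane_bandwidth_mb(5, 8, 10)     # 500 MB/s per lane
gen3 = lane_bandwidth_mb(8, 128, 130)  # ~985 MB/s per lane

print(gen2 * 16, gen3 * 16)  # x16 slot: 8000 vs. ~15754 MB/s per direction
```

So a Gen3 x16 slot offers nearly 16GB/s each way, just about twice what Gen2 could manage.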
The Core i7-3820 is the least expensive processor for the X79 platform, making it an easy choice. Yes, we've dropped down a couple of cores compared to our prior-gen GPU rigs. That's partly because I didn't want to get too far into exotic territory with these new systems. With four cores and a Turbo peak of 3.8GHz, the Core i7-3820 should perform quite similarly to a Core i7-2600K in cases where the X79 platform's additional bandwidth is no help.
We did want to be able to accommodate the most extreme configurations when the situation calls for it, though. That's one reason I selected Gigabyte's X79-UD3 mobo for this build. Even some of the more expensive X79 boards don't have four physical PCIe x16 slots onboard like the UD3 does. Those slots are positioned to allow four double-width cards at once, making the UD3 nearly ideal for this mission.
Cramming in all of those slots and the X79's quad memory channels is no minor achievement, and it did require some compromises. The UD3 lacks an on-board power button, a common feature that's only important for, well, open-air test rigs like this one. Also, the spacing around the CPU socket is incredibly tight. With that big tower cooler installed, reaching the tab to release the retention mechanism on the primary PCIe x16 slot is nearly impossible. I had to jam part of a zip tie into the retention mechanism, semi-permanently defeating it, in order to make card swaps easier.
Still, I'm so far pleased with Gigabyte's new EFI menu and with the relatively decent power consumption of the system, which looks to be about 66W at idle with a Radeon HD 7970 installed. That's roughly 40W lower than our prior test rigs, a considerable decrease.
Memory: Corsair Vengeance 1600MHz quad-channel kit, 16GB - If you're going X79, you'll need four fast DIMMs to keep up, and Corsair was kind enough to send out some Vengeance kits for us to use. Setup is dead simple with the built-in memory profile, supported by the UD3.
PSU: Corsair AX850 - Our old PC Power & Cooling Silencer 750W power supplies served us well for years, but they eventually developed some electronics whine and chatter under load that interfered with our acoustic measurements. It was time for a replacement, and the wonderfully modular Corsair AX850 fit the bill. Although 850W may seem like overkill, we had some tense moments in the past when we pushed our old 750W Silencers to the brink. I wanted some additional headroom. It didn't hurt that the AX850 is 80 Plus Gold certified, and I think the nice reduction we've seen in system-wide idle power draw speaks well of this PSU's efficiency at lower loads. (In fact, when the 7970 goes into its ZeroCore power mode, system power draw drops to 54W.) Even better, when load is 20% or less of peak, the AX850 completely shuts down its cooling fan. That means our idle acoustic measurements should be entirely devoid of PSU fan noise.
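As a quick sanity check on that fanless claim—assuming the 20% threshold applies to the 850W rated output, and keeping in mind that the wall-socket numbers above include efficiency losses, so the actual DC load is lower still:

```python
# The AX850's fan stays off at or below 20% of rated output.
RATED_WATTS = 850
FANLESS_FRACTION = 0.20

fanless_limit = RATED_WATTS * FANLESS_FRACTION  # 170W of DC load

# Measured wall draw at idle: 66W normally, 54W with the Radeon
# HD 7970 in ZeroCore mode. Both sit far below the 170W threshold,
# so the PSU fan should never spin during idle acoustic testing.
for idle_draw in (66, 54):
    print(idle_draw, idle_draw < fanless_limit)
```

There's a comfortable margin there, which is exactly what you want when you're trying to measure a video card cooler and nothing else.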
CPU cooler: Thermaltake Frio - The original plan was to use Thermaltake's massive new Frio OCK coolers on these test rigs, but the OCK literally would not fit, because the fans wouldn't allow clearance for our relatively tall Vengeance DIMMs. That discovery prompted a quick exchange with Thermaltake, who sent out LGA2011 adapter kits for the older original Frio coolers we had on hand. Although the original Frio isn't that much smaller than the OCK version, we were able to shoehorn a Frio in a single-fan config into this system. The fan enclosure does push up against one DIMM slightly, but that hasn't caused any problems. With a cooler this large, we can keep the fan speed cranked way down, so the Frio is blessedly quiet, without the occasional pump noise you get from the water coolers often used in this class of system.
Storage: Corsair F240 SSD and some old DVD drive - The F240 SSD was a fairly recent upgrade to our old test rigs, and it's one of the two components carried over from those systems, along with the ancient-but-still-necessary DVD drive for installing the handful of games we haven't obtained digitally. The biggest drawback to the SSD? Not enough time to read the loading screens between levels sometimes.
That's about it for the specs. I'm very pleased with the power and noise levels of these new systems. The noise floor at idle on our old test rigs, with the meter perched on a tripod about 14" away, was roughly 34 dB. I'm hoping we'll be able to take that lower with these systems, although honestly, dropping much below that may be difficult without a change of environments. Our basement lab is nothing special in terms of acoustic dampening and such. We'll have to see; I haven't managed to squeeze in a late-night acoustic measurement just yet.
For what it's worth, we have considered using a system in a proper PC case for acoustic and thermal measurements, but that hasn't worked out for various reasons, including the sheer convenience for us, typically rushing on some borderline-abusive deadline, of being able to swap components freely. We also have concerns about whether a case will serve to dampen the noise coming from the various coolers, effectively muting differences on our meter readings that the human ear could still perceive. We may still investigate building a dedicated, enclosed acoustic/thermal test rig in the future, though. We'll see.
Now that the new Damage Labs GPU test rigs are complete, I'm sadly not going to be able to put them to use immediately. I have to move on to testing another type of chip first. I'll get back here eventually, though. I still need to test Radeon HD 7900-series CrossFire, and I understand there are some other new GPUs coming before too long, as well.
One of the funny things about going to CES is that you're expected to be plugged into the overall vibe of the show, so you can return and tell your friends and family about "what's hot" in technology. As a journalist, that's especially true, because we have access to press events, show previews, and the like. The trouble is, as I've explained, CES for us is an endless parade of meetings, cab rides, rushed walks, and foot pain. The time we spend on the show floor itself is minimal and mostly involves rushing to that next meeting. Beyond that, we simply don't cover the entire span of consumer electronics and don't get much insight into what's happening in the broader market there—not that, given the scope of CES, any one person or small team really could.
One can catch the vibe of CES in various ways, though. I've already offered my take on the state of the PC industry at CES 2012, which was more about following Apple's template than bold innovations, somewhat unfortunately. In other areas, a few highlights were evident as we rushed through the week.
One new creation that stood out easily at the press-only Digital Experience event was Samsung's amazing demo unit: a 55" OLED television.
This puppy was big and bright, even in the harsh lighting of the MGM Grand ballroom. The most striking thing about it to me, on first glance, was how impossibly thin the bezels were around its edges. To my eye, which has been frequently exposed to various Eyefinity demo rigs and display walls, the sheer thinness of the frame around the screen was jarring—in a good way. After that, one noticed other nice things about this OLED monster versus the average display: near-perfection at difficult viewing angles, amazing brightness and contrast, and much truer blacks than you'd see on an LCD. Unfortunately, this display is still far from being a true consumer product. We didn't get a price tag from the Samsung rep on hand, but the number $50,000 was thrown around only semi-jokingly. If you wanted to see something wondrous from the future at CES 2012, though, the display itself certainly qualified.
Another way you can catch the tech vibe at CES is simply observing the attendees. That's been a reliable method on many fronts, from the number of folks there to the gear they're carrying. In years past, CES has been all about iPhones and an utterly, laughably jammed AT&T network, unable to service 'em all. iDevices were again everywhere at CES 2012—I'd put the iPhone ownership among attendees at somewhere around 50%, easily—but what impressed me this year was the apparent consolidation of non-Apple phones. That contingent didn't consist of a host of smartphones of various types or even a varied selection of Android-based phones. Instead, it seemed like virtually all of the cool kids were toting one of two devices: a Samsung Galaxy S II or a Galaxy Note. Those big, bright screens and thin enclosures were everywhere, and one had to do a double-take at times: does that dude have a really small head, or is he using a Galaxy Note as a phone? Or, you know, perhaps both? In a world where two-year contracts tend to define when a smart phone upgrade makes sense, it's amazing how many CES attendees had upgraded to one of Samsung's new offerings in recent months. Also impressive was how much those big screens and thin cases looked like the future, and how much the tiny little iPhone 4/4S display looked like the past.
CES attendance is also considered something of a bellwether for the tech economy or even the economy as a whole. In 2009, as the wheels were coming off of the banking system, attendance at the show dropped dramatically. I was there, and although things felt a little lonely in the convention center, the upside was most evident: no cab lines, no pressing crowds, few waits at restaurants. Recovery was slow and incremental. The show felt like it was back in force last year, and this year, the crush of people was as inconvenient as anything since 2008, probably up a bit from 2011.
One thing that hasn't changed much is the state of Las Vegas itself. For a number of years, we had the fun task of scoping out the latest massive new casino hotels as they opened up, from Paris and the Venetian to the Wynn and Aria and so on. In 2009, though, commercial building loans dried up, construction stopped, and half-completed structures sat idle, some partially built with cranes atop them. Some still sit that way. One of the more memorable examples was the frame of a new tower for the Venetian, left sitting exposed to the elements for years, obviously rusting. That always stuck out at me, an odd contrast to the bustling activity of the Venetian below.
This year, while approaching the Venetian for the second or third time, I realized I hadn't noticed the half-completed tower yet. That's when I looked up and saw this:
Yep, they've wrapped the rusting frame of the tower in a plastic shroud, colored to look like the buildings around it. That, my friends, is more like what I'd expect from Las Vegas. Let that structure rust in obscurity while giving us the approximation of something better.

Some thoughts on Rage
Well, I finished Rage last night. I have to say that I enjoyed it quite a bit, mostly because, at heart, it's a very solid shooter. A number of missions in the very middle of the game, especially those starting from Wellspring, offer an excellent mix of varied environments, interesting enemies, and near-ideal shooter mechanics. Those really pulled me in. I was less thrilled by the game's beginning and ending, especially since the very ending felt rushed, like too many games, where they'd run out of time and budget to make the final battle as epic as those in the middle of the game. Since I had a lot of fun with Rage, and since it's nearly a genre convention these days for a game to end weakly, I can forgive that sin, although the sense of wasted potential is a little saddening.
As I told a friend the other night, I have two thoughts on wingsticks. First, they're a barrel of fun, with a very tangible sense of the ostensible physics involved and excellent animations to go along with them. Winging one of these puppies at a bad guy and watching him take damage is ridiculously satisfying—an instant FPS classic. Second, wingsticks are an obvious concession to the lack of precise control on console gamepads. They're too easy to aim and too powerful in terms of damage dealt, especially since they're so easily interspersed with weapons fire. Wingsticks thus drain much of the drama and challenge out of Rage. Never would have happened if this were a PC-first title. Took me a while to come around to that second line of thinking, but once I did, I couldn't shake that impression.
I said on the last podcast that I had to get over the fact that Rage is not Borderlands, and I mostly was able to do so. Rage is smaller, more linear, and more of a pure shooter than Borderlands. Although it has a more limited number of weapons, there are actually far more varied options for creative killin' in Rage, thanks to different ammo types and devious devices like the RC bomb cars and sentry bots. I've never gotten into the crazy alt-weapons options in games like BioShock because they just didn't suit me—seemed too contrived, slow, and clumsy compared to, you know, a gun. In Rage, I took special glee in dispatching bad guys with dynamite bolts and other such contrivances, even when they weren't the fastest, because the hilarious carnage was reward enough.
Still, the player limitations built into this game are sometimes frustrating just because they don't seem necessary. You can rarely go off of the intended path, even if doing so would only require stepping over a small brick. You might as well be trying to jump over a skyscraper, for all the good trying will do you. Then, in the final level, I came to what seemed like the obvious and only place to move forward in a small hallway, and it was blocked by a fairly large metal crate (imagine that). Immediately, I backtracked and searched every prior inch of the level looking for another way through. When I found nothing, I went back and considered blowing up the crate or finding some other option. Eventually, to my utter and bewildered shock, I was able to jump over this crate, something the entire rest of the game leading up to that point had meticulously taught me was impossible. Really strange.
Also, the wasteland is too prickly, dangerous, and barren to make freelance exploration rewarding. This game should have been a true open-world affair that encourages improvisation and discovery. The bones are there, but the flesh is not. I know it's not Borderlands, and I swear I'm OK with that, but I still wish Rage was the game it promised to be, the game it cries out to be. Here's hoping for more freedom in Rage 2.
Most of this talk sounds negative, but really, I'm mostly just wishing for more quality time spent in the world of Rage—and, heh, perhaps less spent on silly minigames like the knife thing. The silky-smooth id Software shooter mechanics are as good as ever, and there is a potent mix at work here. The game engine allows for unique textures to be painted on every object in the world, and the game's creative types have taken that ball and run with it, creating levels that are more detailed, varied, interesting, and realistic than anything we've seen before, in a way. Meanwhile, the visuals and action unfold at a constant march of 16 milliseconds per frame (60 FPS if you average it out, but the smoothness is immediate and consistent). Other games may run at high frame rates, but few look this good and move this smoothly all of the time. The fact that this is one of the handful of contemporary games to get multisampled edge antialiasing right, on almost every edge in every scene, also helps immensely. Taken together, these things add up to a full-motion animated experience that's more immersive than most games—and that is perfect for an action game like this one.
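That distinction between a steady 16ms cadence and a merely high average is the crux of our frame-latency argument, and it's easy to illustrate with a toy example (the frame-time numbers here are hypothetical, chosen only to make the point):

```python
# Two runs that both average 60 FPS, but only one of them is smooth.
frame_times_smooth = [16.7] * 60               # steady ~16.7 ms frames
frame_times_spiky = [10.0] * 50 + [50.0] * 10  # fast frames plus big hitches

def avg_fps(times_ms):
    """Frames rendered divided by total seconds elapsed."""
    return len(times_ms) / (sum(times_ms) / 1000.0)

# Both round to 60 FPS, yet the second run spends half its time in
# 50ms (20 FPS-equivalent) frames that the average never reveals.
print(round(avg_fps(frame_times_smooth)), round(avg_fps(frame_times_spiky)))
```

Rage's appeal is that it delivers the first case, not the second: the same average a lesser game might post, but without the hitches buried inside it.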
Screenshots don't capture the experience, and what they do capture is Rage's one great visual weakness: a frequent lack of texture detail once you get too close to most objects. That weakness is unfortunate, and I understand a patch is coming with a "detail texture" addition that should at least partially alleviate the problem. Even without a fix, though, the in-motion visuals this game slings out can rival or surpass anything else on the PC, with the likely exception of BF3's single-player campaign. Some of the scenes in the game are incredible. Even though they're static, I did grab a few screenshots in places as I played through Rage, and I've put them into the gallery below. Be sure to click the "View full size" button if you'd like to see a scene in its full 2560x1600 8X MSAA glory.