Nvidia’s GeForce GTX 750 Ti ‘Maxwell’ graphics processor

So this is different. I don’t recall the last time a new GPU architecture made its worldwide debut in a lower-end graphics card—or if I do, I’m not about to admit I’ve been around that long. In my book, then, Nvidia’s “Maxwell” architecture is breaking new ground by hitting the market first in a relatively affordable graphics card, the GeForce GTX 750 Ti, and its slightly gimpy twin, the GeForce GTX 750.

Don’t let the “750” in those names confuse you. Maxwell is the honest-to-goodness successor to the Kepler architecture that’s been the basis of other GeForce GTX 600 and 700 series graphics cards, and it’s a noteworthy evolutionary step. Nvidia claims Maxwell achieves twice the performance per watt of Kepler, without the help of a new chip fabrication process. Given how efficient Kepler-based GPUs have been so far, that’s a bold claim.

I was intrigued enough by Maxwell technology that I’ve hogged the spotlight from Cyril, who usually reviews video cards of this class for us. Having spent some time with them, I’ve gotta say something: regardless of the geeky architectural details, these products are interesting in their own right. If your display resolution is 1920×1080 or less (in other words, if you’re like the vast majority of PC gamers), then dropping $150 or less on a graphics card will get you a very capable GPU. Most of the cards we’ve tested here are at least the equal of an Xbone or PlayStation 4, and they’ll run the majority of PC games quite smoothly without compromising image quality much at all.

Initially, I figured I’d try testing these GPUs with some popular games that aren’t quite as demanding as our usual fare. However, I quickly learned these cards are fast enough that Brothers: A Tale of Two Sons and Lego Lord of the Rings don’t present any sort of challenge, even with all of the image quality options cranked. Discerning any differences between the GPUs running these games would be difficult at best, so I was soon back to testing Battlefield 4 and Crysis 3.

Why should that rambling anecdote matter to you? Because if you’re an average dude looking for a graphics card for his average computer so it can run the latest games, this price range is probably where you ought to be looking. I’m about to unleash a whole torrent of technical gobbledygook about GPU architectures and the like, but if you can slog through it, we’ll have some practical recommendations to make at the end of this little exercise, too.

The first Maxwell: GM107

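| GPU | ROP pixels/clock | Texels filtered/clock (int/fp16) | Shader processors | Rasterized triangles/clock | Memory interface width (bits) | Estimated transistor count (millions) | Die size (mm²) | Fab process |
|---|---|---|---|---|---|---|---|---|
| Cape Verde | 16 | 40/20 | 640 | 1 | 128 | 1500 | 123 | 28 nm |
| Bonaire | 16 | 56/28 | 896 | 2 | 128 | 2080 | 160 | 28 nm |
| Pitcairn | 32 | 80/40 | 1280 | 2 | 256 | 2800 | 212 | 28 nm |
| GK107 | 16 | 32/32 | 384 | 1 | 128 | 1300 | 118 | 28 nm |
| GK106 | 24 | 80/80 | 960 | 3 | 192 | 2540 | 214 | 28 nm |
| GM107 | 16 | 40/40 | 640 | 1 | 128 | 1870 | 148 | 28 nm |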

The first chip based on the Maxwell architecture is code-named GM107. As you can see from the picture and table above, it’s a modestly sized piece of silicon roughly halfway between the GK107 and GK106. Like its predecessors and competition, the GM107 is manufactured at TSMC on a 28-nm process.

Purely on a chip level, the closest competition for the GM107 is the Bonaire chip from AMD. Bonaire powers the Radeon R7 260X and, just like the big Hawaii chip aboard the Radeon R9 290X, packs the latest revision of AMD’s GCN architecture. The GM107 and Bonaire are roughly the same size, and they both have a 128-bit memory interface. Notice that Bonaire has more stream processors and texture filtering units than the GM107. We’ll address this question properly once we’ve established clock speeds for the actual products, but the GM107 will have to make more efficient use of its resources in order to outperform the R7 260X. Something to keep in mind.

The Maxwell GPU architecture

A functional block diagram of the GM107. Source: Nvidia.

Above is a not-terribly-fine-grained representation of the GM107’s basic graphics units. From this altitude, Maxwell doesn’t look terribly different from Kepler, with the same division of the chip into graphics processing clusters (GPCs, of which the GM107 has only one) and, below that, into SMs or streaming multiprocessors. If you’re familiar with these diagrams, maybe you can map the other units on the diagram to the unit counts in the table above. The two ROP partitions are just above the L2 cache, for instance, and each one is associated with a slice of the L2 cache and a 64-bit memory controller. Although these things seem familiar from its prior GPUs, Nvidia says “all the units and crossbar structures have been redesigned, data flows optimized, power management significantly improved, and so on.” So Maxwell isn’t just the result of copy-paste in the chip design tools, even if the block diagram looks familiar. Maxwell’s engineering team didn’t achieve a claimed doubling of power-efficiency without substantial changes throughout the GPU.

In fact, Nvidia has been especially guarded about what exactly has gone into Maxwell, more so than in the past. These are especially interesting times for GPU development, since the competitive landscape is changing. Nvidia introduced the first mobile SoC with a cutting-edge GPU, the Tegra K1, early this year, and it faces competition not just from AMD but also from formidable mobile SoC firms like Qualcomm. The company has had to adapt its GPU design philosophy to focus on power efficiency in order to play in the mobile space. Kepler was the first product of that shift, and Maxwell continues that trajectory, evidently with some success. Nvidia seems to be a little skittish about divulging too much of the Maxwell recipe, for fear that it could inspire competitors to take a similar path.

With that said, we still know about the basics that distinguish Maxwell from Kepler. The most important ones are in the shader multiprocessor block, or SM. Let’s put on our extra-powerful glasses and zoom in on a single SM to see what’s inside.

A functional block diagram of the Maxwell SM. Source: Nvidia.

You may recall that the Kepler SMX is a big and complex beast. The SMX has four warp schedulers, eight instruction dispatch units, four 32-wide vector arithmetic logic units (ALUs), and another four 16-wide ALUs. (“Warps” is an Nvidia term that refers to a group of 32 threads that execute together. These groupings are common in streaming architectures like this one. AMD calls its thread groups “wavefronts.”) That gives the SMX a total of 192, uhh, math units—thanks to four vec32 ALUs and four vec16 ALUs. Nvidia says the Kepler SM has 192 “CUDA cores,” but that’s a marketing term intended to incite serious nerd rage. We’ll call them stream processors, which is somewhat less horrible.

Anyhow, Maxwell divvies things up inside of the SM a little differently. One might even say this so-called SMM is a quad-core design, if one were determined to use the word “core” more properly. The Maxwell SM is divided into quads, anyhow. Each quad has a warp scheduler, two dispatch units, a dedicated register file, and a single vec32 ALU. The quads have their own banks of load/store units, and they also have their own special-function units that handle tricky things like interpolation and transcendentals.

Nvidia’s architects have rejiggered the SM’s memory subsystem, too. For instance, the texture cache has been merged with the L1 compute cache. (Formerly, a partitioned chunk of the SM’s 64KB shared memory block served as the L1 compute cache.) Naturally, each L1/texture cache is attached to a texture management unit. Each pair of quads shares one of these texture cache/filtering complexes. Separately, the 64KB block of shared memory remains, and as before, it services the entire SM.

Maxwell’s control logic and execution resources are more directly associated with one another than in Kepler, and the scale of the SM itself is somewhat smaller. One Maxwell SM has 128 stream processors and eight texels per clock of texture filtering, down by one third and one half, respectively, from Kepler. The number of load/store and special-function units apparently remains the same. Nvidia says the Maxwell SM achieves about 90% of the performance of the Kepler SM in substantially less area. To give you some sense of the scale, the GM107 occupies about 24% more area than the GK107, yet the Maxwell-based chip has 66% more stream processors. Due to more efficient execution, the firm claims the GM107 manages about 2.3X the shader performance of the GK107.

How does Maxwell manage those gains? Well, the higher ratio of compute to texturing doesn’t hurt—the SM has shifted from a rate of 12 flops for every texel filtered to 16. Meanwhile, Nvidia contends that much of the improvement comes from smarter, simpler scheduling that keeps the execution resources more fully occupied. Kepler moved some of the scheduling burden from the GPU into the compiler, and Maxwell reputedly continues down that path. Thanks to its mix of vec16 and vec32 units, the Kepler SM is surely somewhat complicated to manage, with higher execution latencies for thread groups that run on those half-width ALUs. A Maxwell quad outputs one warp per clock consistently, with lower latency. That fact should simplify scheduling and reduce the amount of overhead required to track thread states. I think. The methods GPUs use to keep themselves as busy—and efficient—as possible are still very much secret sauce.
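For the arithmetic-minded, the unit counts quoted in the last few paragraphs can be checked in a few lines of Python. This is only a sketch built from numbers in this article: the dictionary layout and function names are made up for illustration, the Kepler texel rate of 16 per clock is inferred from the "down by one half" remark rather than taken from a spec sheet, and the 12 and 16 figures fall out only if each ALU lane is counted once per clock.

```python
# Per-SM unit counts as described in the article. The Kepler texel rate is
# inferred from "down by one half"; all names here are illustrative.
KEPLER_SMX = {"vec32_alus": 4, "vec16_alus": 4, "texels_per_clock": 16}
MAXWELL_SMM = {"vec32_alus": 4, "vec16_alus": 0, "texels_per_clock": 8}

# Chip-level figures from the GK107 and GM107 rows of the chip table.
GK107 = {"stream_processors": 384, "die_mm2": 118}
GM107 = {"stream_processors": 640, "die_mm2": 148}


def stream_processors(sm):
    """Count ALU lanes ("stream processors") in one SM."""
    return 32 * sm["vec32_alus"] + 16 * sm["vec16_alus"]


def lanes_per_texel(sm):
    """ALU lanes available for each texel filtered per clock."""
    return stream_processors(sm) / sm["texels_per_clock"]


def growth(new, old):
    """Fractional increase going from old to new."""
    return new / old - 1


if __name__ == "__main__":
    for name, sm in (("Kepler SMX", KEPLER_SMX), ("Maxwell SMM", MAXWELL_SMM)):
        print(name, stream_processors(sm), "SPs,", lanes_per_texel(sm), "lanes per texel")
    area = growth(GM107["die_mm2"], GK107["die_mm2"])
    sps = growth(GM107["stream_processors"], GK107["stream_processors"])
    print(f"GM107 vs GK107: {area:.0%} more area, {sps:.0%} more stream processors")
```

The chip-level ratios land within a point or so of the rounded figures in the text, which is about as close as a back-of-the-envelope check like this needs to be.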

One change in the new SM will be especially consequential for certain customers—and possibly for the entire GPU market. Maxwell restores a key execution resource that was left out of Kepler: the barrel shifter. The absence of this hardware doesn’t seem to have negative consequences for graphics, but it means Kepler isn’t well-suited to the make-work algorithms used by Litecoin and other digital currencies. AMD’s GCN architecture handles this work quite well, and Radeons are currently quite scarce in North America since coin miners have bought up all of the graphics cards. The barrel shifter returns in Maxwell, and Nvidia claims the GM107 can mine digital currencies quite nicely, especially given its focus on power efficiency.

Beyond the SM, the other big architectural change in Maxwell is the growth of the L2 cache. The GM107’s L2 cache is 2MB, up from just 256KB in the GK107. This larger cache should provide two related benefits: bandwidth amplification for the GPU’s external memory and a reduction in the power consumed by doing expensive off-chip I/O. Caches keep growing in importance (and size) for graphics hardware for exactly these reasons. I’m curious to see whether the upcoming larger chips based on Maxwell follow the GM107’s lead by including L2 caches eight times the size of their predecessors. That may not happen. Nvidia GPU architect Jonah Alben tells us the L2 cache size in Maxwell is independent of the number of SMs or flops on tap.

Along with everything else, the dedicated video processing hardware in Maxwell has received some upgrades. The video encoder can compress video (presumably 1080p) to H.264 at six to eight times the speed of real-time. That’s up from 4X real-time in Kepler. Meanwhile, video decoding is 8-10X faster than Kepler due in part to the addition of a local cache for the decoder hardware. This big performance boost probably isn’t needed by itself, but again, the goal here is to save power. Along those lines, Nvidia’s engineers have added a low-power sleep state, called GC5, to the chip for video playback and other light workloads.

The GeForce GTX 750 and 750 Ti

The GeForce GTX 750 Ti reference card.

Nvidia has produced a couple of graphics card models based on the GM107, the GeForce GTX 750 Ti and the GeForce GTX 750. Here are their base specifications.

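| Card | GPU base clock (MHz) | GPU boost clock (MHz) | ROP pixels/clock | Texels filtered/clock | Shader processors | Memory transfer rate (Gbps) | Memory interface width (bits) | TDP | Price |
|---|---|---|---|---|---|---|---|---|---|
| GTX 750 | 1020 | 1085 | 16 | 32 | 512 | 5.0 | 128 | 55W | $119 |
| GTX 750 Ti | 1020 | 1085 | 16 | 40 | 640 | 5.4 | 128 | 60W | $149 |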

Notice the low TDP ratings, which make possible things like the stubby little cooler on the GTX 750 Ti reference card pictured above. Neither card requires an auxiliary PCIe power connector, so they should be able to go into a whole host of systems where other cards can’t quite fit. The recommended PSU capacity is just 300W.

Nvidia pitches the GTX 750 series as ideal for a home-theater PC or a Steam box, many of which are squeezing into ultra-compact cases with smaller power supplies. And dude, if you’ve gotten a Dell that doesn’t have game, one of these cards should be able to slide into a PCIe expansion slot and provide a substantial upgrade over integrated graphics of any sort.

There’s not a tremendous amount of difference between these two products, as you can see in the table above. Essentially, the 750 uses a GM107 with one SM disabled, while the 750 Ti keeps all five SMs intact. Beyond that, GTX 750 cards will generally ship with 1GB of GDDR5 memory, while the 750 Ti will come with 2GB. Both of these cards should be available at online retailers right now. A third variant, a GTX 750 Ti with only 1GB of memory, is slated for release later this month for $139.99.

The GTX 750 series replaces several existing products, including the GeForce GTX 650 Ti and 650 Ti Boost. The GTX 750 Ti is stepping into some big shoes at $149, since the GTX 650 Ti Boost is based on the larger GK106 chip, has a 192-bit memory interface, and is a 110W part. That’s an awful lot to overcome. Even with Maxwell’s higher efficiency, the GTX 750 Ti may not be able to match the Boost’s performance. Then again, the 650 Ti Boost has hit end-of-life and is already hard to find in stores. Only one GTX 600-series card, the GeForce GTX 650, will stick around to serve the under-$100 market.

Zotac has provided us with a couple of GTX 750-series cards to review, one each of the 750 and the 750 Ti. The two cards are nearly identical; in our case, the Ti card is the one with the clear fan. Zotac has set the base and boost clocks for both of these cards at 1033 and 1111MHz, respectively, which is a smidgen higher than stock. Even with the handsome coolers and faster speeds, though, Zotac doesn’t ask anything more than Nvidia’s suggested retail price.

One glance will tell you Asus has taken a more upscale approach with its GTX 750 Ti OC. This card’s GPU base and boost clocks are a little higher than the Zotac’s, at 1072 and 1150MHz, but its 2GB of GDDR5 memory remains at a stock 5.4Gbps. What you’re getting here is an unnecessarily long circuit board that looks to be a custom design, backed up by a needlessly monstrous dual-fan cooler and an unnecessary six-pin aux power input. This card is clearly built for exceeding the GPU’s intended specifications via some egregious overclocking. Asus has also upgraded the HDMI port to full-sized and has added—get this—ye olde VGA port, for that one dude who can’t seem to let go of his Trinitron.

The extra goodness in this version of the GTX 750 Ti will set you back $10 more than Nvidia’s list, or $159.99. One caveat here is the placement of that auxiliary power connector, which is weirdly on the “wrong” end of the card. Having the connector there may be a help or a hindrance, I suppose, depending on the layout of the PC case in question.

AMD’s answer: the Radeon R7 265 and friends
If you’re wondering how AMD is going to respond to Nvidia’s brand-new architecture, well, we already kind of know. From their lofty perch just outside of Toronto, AMD’s graphics honchos saw Maxwell coming ahead of time and decided to form a greeting committee.

Last week, the red team unveiled the Radeon R7 250X, a rebadged version of the Radeon HD 7770 for $99. That was just the beginning, though. A few days later came the proper response to the GM107. First, AMD dropped the price of the Bonaire-based R7 260X to $119, placing it directly opposite the GeForce GTX 750. Then it made a classic move, pulling a larger chip with a 256-bit memory interface down into this price range. The Radeon R7 265 is a hotter-clocked version of the Radeon HD 7850, and AMD expects it to sell for $149.99 when it hits stores at the end of this month.

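| Card | GPU clock (MHz) | ROP pixels/clock | Texels filtered/clock (int/fp16) | Shader processors | Memory transfer rate (Gbps) | Memory interface width (bits) | TDP | Price |
|---|---|---|---|---|---|---|---|---|
| R7 250X | 1000 | 16 | 40/20 | 640 | 4.5 | 128 | 95W | $99 |
| R7 260X | 1100 | 16 | 56/28 | 896 | 6.0 | 128 | 115W | $119 |
| R7 265 | 925 | 32 | 64/32 | 1024 | 5.6 | 256 | 150W | $149 |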

I’m pretty sure the R7 265 review sample AMD sent us is just a Sapphire 7850 card with a new BIOS flashed to it. Have a look.

The R7 265 is a 150W card that requires an aux power input, so it’s almost an entirely different class of solution than the GTX 750 Ti. As long as it’s selling for the same price, though, the R7 265 should present some stiff competition for Nvidia’s much lower-spec offering.

How they match up
When you take all of the specifications and do the math, here’s how the various contenders in this space match up in key graphics rates.

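| Card | Peak pixel fill rate (Gpixels/s) | Peak bilinear filtering int8/fp16 (Gtexels/s) | Peak shader arithmetic rate (tflops) | Peak rasterization rate (Gtris/s) | Memory bandwidth (GB/s) |
|---|---|---|---|---|---|
| GTX 650 | 17 | 34/34 | 0.8 | 1.1 | 80 |
| GTX 650 Ti | 15 | 59/59 | 1.4 | 1.9 | 86 |
| GTX 650 Ti Boost | 25 | 66/66 | 1.6 | 2.1 | 144 |
| GTX 750 | 17 | 35/35 | 1.1 | 1.1 | 80 |
| GTX 750 Ti | 17 | 43/43 | 1.4 | 1.1 | 86 |
| R7 250X | 16 | 40/20 | 1.3 | 1.0 | 72 |
| R7 260X | 18 | 62/31 | 2.0 | 2.2 | 96 |
| R7 265 | 30 | 59/30 | 1.9 | 1.9 | 179 |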
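The math behind a table like this is simple enough to sketch. The Python below is a rough reconstruction, not the script used for this review: the `Card` class and its field names are invented for illustration, and it assumes that peak rates are computed at the boost clock (where a card has one) and that each stream processor retires one fused multiply-add, or two flops, per clock. Rasterization rates are left out.

```python
from dataclasses import dataclass


@dataclass
class Card:
    name: str
    clock_mhz: float           # assumed: boost clock where one exists
    rop_pixels_per_clock: int
    texels_per_clock: int      # integer (int8) filtering rate
    shader_processors: int
    mem_gbps: float            # per-pin memory transfer rate
    mem_bus_bits: int


def peak_rates(card):
    """Return theoretical peak rates derived from unit counts and clocks."""
    ghz = card.clock_mhz / 1000
    return {
        "pixel_fill_gpixels_s": card.rop_pixels_per_clock * ghz,
        "texel_filter_gtexels_s": card.texels_per_clock * ghz,
        # Two flops per stream processor per clock (one multiply-add).
        "shader_tflops": card.shader_processors * 2 * ghz / 1000,
        # Bits per transfer across the bus, converted to bytes.
        "mem_bandwidth_gb_s": card.mem_gbps * card.mem_bus_bits / 8,
    }


# Inputs taken from the spec tables earlier in the article.
GTX_750_TI = Card("GTX 750 Ti", 1085, 16, 40, 640, 5.4, 128)
R7_265 = Card("R7 265", 925, 32, 64, 1024, 5.6, 256)

if __name__ == "__main__":
    for card in (GTX_750_TI, R7_265):
        print(card.name)
        for metric, value in peak_rates(card).items():
            print(f"  {metric}: {value:.1f}")
```

Rounded to the table's precision, both cards come out matching their rows above, which suggests the assumptions are at least in the right neighborhood.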

From a chip-nerd perspective, the most intriguing comparison here is the Radeon R7 260X—the full-fledged version of Bonaire—versus the GeForce GTX 750 Ti—the full-fledged implementation of the GM107. On paper, the 260X has an advantage in nearly every category even though it’s roughly the same size of chip, likely because it has almost double the power envelope of the 750 Ti. The GM107 will have to make much more effective use of resources and its power budget in order to match the R7 260X.

From a product perspective, the mismatches are even more striking. The R7 260X faces off against the GTX 750 at $119. The GTX 750 has just over half the peak texture filtering rate, half the flops, and half the triangle rasterization rate of the 260X. The R7 265’s edge on the 750 Ti is similarly unfair.

So… this ought to be interesting.

Test notes
I’ve gotten hooked on the speed and silence of SSDs, and I’ve converted nearly every computer I own to one. That includes most of the test systems in Damage Labs, but the first-gen 240GB drives on our GPU test rigs became limiting as game storage requirements increased.

Happily, the folks at Kingston stepped in and provided a pair of 480GB HyperX SSDs to solve that problem. We now use HyperX drives extensively in both our CPU and GPU rigs, and we’ve had very good luck with them. Also, the storage subsystem is almost never a meaningful bottleneck in any of our tests.

To generate the performance results you’re about to see, we captured and analyzed the rendering times of every single frame of animation during each test run. For an intro to our frame-time-based testing methods and an explanation of why they’re helpful, you can start here. Please note that, for this review, we’re only reporting results from the FCAT tools developed by Nvidia. We sometimes also report results from Fraps, since both tools are needed to capture a full picture of animation smoothness. However, testing with both tools can be time-consuming, and our window for work on this review was fairly small. We think sharing just the data from FCAT should suffice for now.

Our testing methods
As ever, we did our best to deliver clean benchmark numbers. Our test systems were configured like so:

| Component | Configuration |
|---|---|
| Processor | Core i7-3820 |
| Motherboard | Gigabyte |
| Chipset | Intel X79 |
| Memory size | 16GB (4 DIMMs) |
| Memory type | Corsair Vengeance CMZ16GX3M4X1600C9 DDR3 SDRAM at 1600MHz |
| Memory timings | 9-9-9-24 |
| Chipset drivers | INF update, Rapid Storage Technology Enterprise |
| Audio | Integrated with Realtek drivers |
| Hard drive | Kingston HyperX 480GB SATA |
| Power supply | Corsair |
| OS | Windows 8.1 Pro |

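| Card | Driver revision | GPU core clock, base/boost (MHz) | Memory clock (MHz) | Memory size (MB) |
|---|---|---|---|---|
| GeForce GTX 650 Ti | GeForce 334.69 beta | 928 | 1350 | 1024 |
| GTX 650 Ti Boost | GeForce 334.69 beta | 1020/1085 | 1502 | 2048 |
| GTX 750 | GeForce 334.69 beta | 1033/1111 | 1253 | 1024 |
| GTX 750 Ti | GeForce 334.69 beta | 1072/1150 | 1350 | 2048 |
| R7 260X | Catalyst 14.1 beta | 1100 | 1625 | 2048 |
| R7 265 | Catalyst 14.1 beta | 925 | 1400 | 2048 |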

Thanks to Intel, Corsair, Kingston, Gigabyte, and OCZ for helping to outfit our test rigs with some of the finest hardware available. AMD, Nvidia, and the makers of the various products supplied the graphics cards for testing, as well.

Also, our FCAT video capture and analysis rig has some pretty demanding storage requirements. For it, Corsair has provided four 256GB Neutron SSDs, which we’ve assembled into a RAID 0 array for our primary capture storage device. When that array fills up, we copy the captured videos to our RAID 1 array, comprised of a pair of 4TB Black hard drives provided by WD.

Unless otherwise specified, image quality settings for the graphics cards were left at the control panel defaults. Vertical refresh sync (vsync) was disabled for all tests.

In addition to the games, we used the following test applications:

The tests and methods we employ are generally publicly available and reproducible. If you have questions about our methods, hit our forums to talk with us about them.

Crysis 3

You can see from the plots that the frame rendering times on each card are generally pretty smooth, with a few exceptional spikes that seem to happen on all of the GPUs. That’s where I’m blowing dudes up with an explosive-tipped arrow. Takes some time to compute the awesomeness of that.

As you can see, the FPS average and our latency-sensitive metric, the 99th percentile frame time, are pretty much the inverse of one another. That’s usually an indication of health, or at least of relatively smooth frame delivery overall.

If you were expecting the GeForce GTX 750 Ti with its 128-bit memory interface to match the performance of the Radeon R7 265 and its 256-bit memory, well, you’re not witnessing that particular miracle here. Instead, the miracle you’re witnessing is the GeForce GTX 750, a gimpy version of the GM107 with a 55W power budget, nearly matching the performance of the Radeon R7 260X, which has a fully enabled Bonaire GPU and a 115W TDP. Fairly impressive when you think about it.

Hmm. This “time spent beyond 50 ms” result may be the most interesting one here. We use this metric to quantify “badness,” or the amount of time working on frames that take a particularly long time to produce. Those are the frames that tend to make the game feel less-than-smooth. What’s unusual here is how the Maxwell-based GPUs manage to minimize those hiccups. We might usually chalk up the difference to Nvidia’s graphics drivers, but we’re using the same driver revision on the GTX 600-series cards, and they’re no better off than the Radeons. I’m not sure what to make of that fact. Could we be seeing some nice property of this new GPU architecture in action?
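For readers who want to play with these metrics, here is a minimal sketch of how they could be computed from a list of per-frame render times. The function names and the sample frame times are hypothetical, the percentile uses a simple nearest-rank rule, and "time beyond" is assumed to count only the portion of each slow frame that falls past the threshold. The actual analysis tools may differ in the details.

```python
import math


def average_fps(frame_times_ms):
    """Frames rendered divided by total elapsed time, in frames per second."""
    return 1000 * len(frame_times_ms) / sum(frame_times_ms)


def percentile_frame_time(frame_times_ms, pct=99):
    """Nearest-rank percentile: pct% of frames finish at or under this time."""
    ordered = sorted(frame_times_ms)
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[rank - 1]


def time_beyond(frame_times_ms, threshold_ms=50):
    """Total milliseconds spent past the threshold on slow frames."""
    return sum(t - threshold_ms for t in frame_times_ms if t > threshold_ms)


# Hypothetical run: mostly 20-ms frames with a few slow ones mixed in.
SAMPLE = [20] * 97 + [30, 60, 80]

if __name__ == "__main__":
    print(f"Average FPS: {average_fps(SAMPLE):.1f}")
    print(f"99th percentile frame time: {percentile_frame_time(SAMPLE)} ms")
    print(f"Time beyond 50 ms: {time_beyond(SAMPLE)} ms")
```

In this made-up run, the average works out to roughly 47 FPS, which looks healthy, while the 99th percentile frame time of 60 ms and the 40 ms spent beyond the threshold expose the hiccups that the average hides.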

Borderlands 2

All of these cards handle my favorite game quite well at 1080p with the quality options cranked. The “time beyond” 33 and 50 ms graphs are wastelands with tumbleweeds blowing by. The GTX 650 Ti Boost does run into a few frames that take about 30 ms to render, but that’s hardly real trouble.

The pecking order here is more or less what we established in Crysis 3. The R7 265 and GTX 650 Ti Boost are easily the fastest overall, with the GTX 750 Ti trailing them. The R7 260X and GTX 750 are then neck and neck, with the GTX 650 at the back of the pack.

Tomb Raider

Wow. There isn’t a stray frame time spike in any of the plots, and the latency curves for all of the cards are just pristine. Our test scenario doesn’t really push the envelope with lots of explosions and combat, but this is the sort of competency we’d like to see out of all games and graphics cards.

There’s not much drama otherwise. The GTX 750 Ti trails the R7 265 and the 650 Ti Boost by a little less than usual, but the GTX 750 trails the R7 260X by a little more.

Battlefield 4
Since the FCAT overlay isn’t compatible with AMD’s Mantle API, we used the frame time capture tools built into the latest versions of BF4 for these tests.

AMD told us when the Mantle-enabled version of BF4 came out that performance wouldn’t scale as well with its lower-end graphics cards just yet. They weren’t wrong. Mantle is barely a net win for the R7 265, and it’s a net loss for the R7 260X.

What interests me most about these results is how the GTX 750 Ti performs relatively well, matching the beefier GTX 650 Ti Boost, while the GTX 750 suffers in comparison. Something about the loss of the shader and texturing power of that one disabled SM on the GTX 750, or perhaps the somewhat lower memory clock, has a big impact on frame rendering times.

Arkham Origins

Welp, the average FPS and 99th percentile frame time results do not mirror one another. Check the R7 cards’ frame time plots, and you’ll instantly see why. There are lots of little spikes in frame production times throughout. The latency curves also tell the story, and it’s not great. Even though the R7 265 averages 79 FPS, it’s effectively slower here than any of the GeForce cards. You can feel the difference while playing, believe me.

Arkham Origins is one of those games that Nvidia has sponsored, so it includes a number of enhanced visual effects based on Nvidia’s GameWorks suite. We’ve seen this kind of thing in the past, where a game runs best on one brand of graphics card when the developer has worked closely with the GPU maker. Several things about these results frustrate me, though. For one, you’ll notice that nearly all of the “DX11 enhanced” effects that Nvidia helped build into the game are disabled for this test. I was trying to maintain a fair playing field while keeping the image quality settings reasonable for this class of card. Doesn’t seem to have helped. Also, Arkham Origins is a very big-name title, and it’s been out for months. I’d have expected AMD to have gotten it running smoothly by now. Obviously, that’s not the case.

I had hoped to include another game in our test suite, Guild Wars 2. However, due to an apparent bug in the Catalyst 14.1 beta drivers, I wasn’t able to disable vsync in that game, so I had to abandon it. Grumble.

Power consumption
Please note that our “under load” tests aren’t conducted in an absolute peak scenario. Instead, we have the cards running a real game, Crysis 3, in order to show us power draw with a more typical workload.

Although the Radeon R7 260X and GeForce GTX 750 offer comparable performance, our test system draws 40 fewer watts at the wall socket in Crysis 3 with a GTX 750 installed. The GTX 750 Ti is similarly efficient, drawing only four to eight watts more than the GTX 750. Maxwell’s power efficiency improvements are no joke.

Noise levels and GPU temperatures

I suppose I could have added an extra digit to the acoustic results, so you could see the fine-grained differences between the cards. Truth is, though, that nearly all of them are alarmingly close to the ~32 dBA noise floor for our test system (whose only other source of noise is a big, quiet Thermaltake CPU cooler) and for Damage Labs itself. I don’t want to overstate the precision of our measurements.

These are, of course, very good results. You won’t perceive much difference between the noises produced by most of these cards in normal operation, with the obvious exception of the R7 260X. It’s a little unsettling that the Maxwell-based cards aren’t any louder when running a game than they are at idle. Even the rinky-dink reference cooler keeps the GM107 at a reasonable temperature without making any more noise.

Architectural efficiency
Now that we have some performance and power results, we can illustrate the efficiency of these GPU architectures by generating some of our beloved scatter plots.

In each of the plots below, the most efficient solutions will gravitate toward the top left corner of the plot, while the least efficient ones will tend toward the lower right corner.

We’re using measured performance and power from Crysis 3 here, since we have both sets of results. This is system-level power use, as shown on the previous page. Our key comparison from an architectural standpoint is the R7 260X, based on Bonaire and the very latest GCN technology, versus the GTX 750 Ti and GM107. As you’ve no doubt gathered by now, Maxwell appears to be much more power-efficient than the latest iteration of GCN, and not by a little bit.

Although area concerns have taken a back seat to power efficiency in Nvidia’s last couple of GPU architectures, the GTX 750 Ti wrings measurably higher performance across our suite of games out of a slightly smaller die area than the R7 260X.

Want to know the impact of Maxwell’s larger L2 cache? Compare the GTX 750 Ti to the GTX 650 Ti. They both have a 128-bit, 5.4Gbps memory interface, with 86GB/s of bandwidth on tap, yet the 750 Ti performs much better. Of course, things other than the cache size have changed, but memory bandwidth generally remains a major constraint. Also compare the 750 Ti to the R7 260X, which has 10GB/s higher memory bandwidth yet is slower overall.

As one might expect, the GTX 750 Ti is among the most efficient GPUs in each of these categories, which means it makes very effective use of its on-chip graphics resources. The most striking contrast here is in shader flops; in theory, the R7 260X nearly doubles the peak rate of the GTX 750, yet the two cards perform about the same in our suite of current games.

We’ve answered the architectural efficiency question. Now let’s look at raw value, in terms of price and performance.

Even though we’re using a geometric mean to average performance across the five games we tested, the Radeons’ unusually poor performance in Arkham Origins has an outsized impact on their overall 99th percentile FPS. I think an adjustment may be in order, so I’ve provided a separate 99th percentile scatter plot with the Arkham Origins results removed from the calculation.
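As a toy illustration of why one bad game can still drag down a geometric mean, consider the sketch below. The per-game numbers are invented, not measured results from this review, and the helper is just a plain log-average written for this example.

```python
import math


def geometric_mean(values):
    """Geometric mean computed as the exponential of the mean logarithm."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


# Hypothetical 99th-percentile FPS for one card across five games,
# with the last game acting as the outlier.
FIVE_GAMES = [40, 40, 40, 40, 10]
FOUR_GAMES = FIVE_GAMES[:4]

if __name__ == "__main__":
    print(f"With the outlier:    {geometric_mean(FIVE_GAMES):.1f} FPS")
    print(f"Without the outlier: {geometric_mean(FOUR_GAMES):.1f} FPS")
```

With these made-up inputs, a single poor result pulls the overall score from 40 down to about 30, which is the sort of effect that motivates showing a second plot with the outlier removed.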

In my view, the outcome here is pretty straightforward. Nvidia is apparently rather proud of Maxwell, and it has priced the GeForce GTX 750 Ti accordingly. The 750 Ti offers lower overall performance than the card it replaces at $149, the GeForce GTX 650 Ti Boost, and it’s also slower than the Radeon R7 265. Since Nvidia has killed off the 650 Ti Boost, the Radeon R7 265 captures the value title at $149. If you base your buying decision solely on price and performance, then the Radeon is the card to choose.

But the GTX 750 Ti can go places the R7 265 can’t. The reference GTX 750 Ti is 5.75″ long, doesn’t need an auxiliary power input, and adds no more than 60W to a system’s cooling load. The R7 265 is over eight inches long, requires a six-pin power input, and draws up to 150W of juice, which it then converts into heat. For many folks choosing between these two products, those factors may matter more than the difference in performance, especially since the GTX 750 Ti can provide a very nice gaming experience at 1080p. You’ve gotta think the GTX 750 Ti will be the animating force behind a legion of Steam boxes.

The Radeon R7 260X and the GeForce GTX 750 offer nearly identical performance at $119, but many of the same dynamics apply otherwise. The GTX 750 requires less power and doesn’t need a six-pin aux input, for example. Also, in this case, the difference in noise levels is huge. The reference R7 260X registers 47 dBA on our sound level meter, while every GM107-based card we’ve tested flirts with the ~32 dBA noise floor in Damage Labs. I’d call this an unqualified win for Nvidia, except that the R7 260X has 2GB of memory, twice what the GTX 750 does. That fact didn’t seem to harm the GTX 750 in our testing, but the Radeon is arguably more future-proof. Let’s call this one a qualified win for Nvidia and add an asterisk about the memory size.

In the larger picture, Maxwell’s arrival signals a big change in the GPU space for the coming year. The only way AMD managed to maintain a good position on our value scatter plot with the R7 265 is by offering a much larger chip, with double the memory interface width and more than twice the power budget, for the same price as some fairly lightweight hardware from the competition. That’s what happens when you lose the technology lead, as AMD has learned rather painfully in the CPU market in recent years. My expectation is that Nvidia will roll out a whole family of Maxwell-based products in the coming months. Those are likely to be much faster and more efficient than current Kepler-based cards. I’m not sure what AMD can do to answer other than drop prices. Heck, I don’t think we know much of anything about the future Radeon roadmap. AMD seemingly just finished a refresh with the R7 and R9 series. Looks like they’re going to need something more than another rehash of GCN in order to stay competitive in 2014.

Thank goodness I’m forced to be concise on Twitter.

Responses to “Nvidia’s GeForce GTX 750 Ti ‘Maxwell’ graphics processor”

  1. Don’t bother with this low-to-mid range GPU.

    If you can wait until the 20 nm GPUs appear at the end of this year, that’s when we should see some significant performance improvements.

  2. Should the owner of an OC’ed 560 Ti upgrade to the 750 Ti?

    Long version: I’m running a 4-year-old i5-750 rig. I beefed it up over the years with more RAM, an SSD and a 560 Ti. But the mobo has been acting up lately and I think it’s time to get a complete replacement. My Core 2 Duo HTPC still works but the lady at home prefers to hook up her MBP to the TV. So I can reuse the still-shiny Silverstone SG-02 mATX case and transplant the SSD and 530W PSU over.

    My 22″ monitor is not even 1080p, and when it gives up its ghost, it will be replaced by a 27″ 1080p unit. My games are pretty outdated because I haven’t had time to clear my old games. The last games I finished were TES4: Oblivion and Dragon Age. Dragon Age 2 still sits on my shelf and I haven’t even bought Diablo 3 and Skyrim.

    Should I keep my slightly hot and noisy 560 Ti for a couple more years (probably until it dies), or grab the new 750 Ti?

  3. With just a $3 price difference between the GeForce GTX750Ti and a much-better-performing Radeon R9-270, the new Maxwell chip isn’t very attractive.

  4. Yep, pure paper launch.

    Newegg had a grand total of ONE vendor (Sapphire) and the R7 265 sold out immediately. As of today 3/8/2014 no further vendors are offering the R7 265 and no restock of the Sapphire R7 265 has happened.

    AMD’s “widespread availability at the end of February” quote really meant “please don’t buy that shiny new Nvidia GTX 750 Ti.”

    AMD has turned into a Paper launch machine.

  5. I see that the price has now jumped $20 since launch so we got in at the right time.


  6. The R7 265 is not even available as of early March. Can we say: Paper, paper, paper launch.

    “For example, the 265 still hasn’t launched although it was targeted for a late February release.”

  7. “Honestly as of right now I'd get an R7-265”

    But as of early March you still can't get an R7-265.

    “For example, the 265 still hasn’t launched although it was targeted for a late February release. (An AMD spokesperson tells me to expect its availability later this week, but it’s worth noting that the majority of tech press hasn’t yet received the card.)”

    So honestly I went and bought the card that was readily available on launch day for $149.99 after a $5 rebate: the EVGA 02G-P4-3753-KR GeForce GTX 750 Ti Superclocked.

  8. “Looking forward to poking around in their improved video decoder/encoder.”

    Have you gotten any results to share?

  9. You can.

    KFA2 GTX 750 OC previewed

  10. Where, oh where, is that R7 265 that reviewers used to make the GTX 750 Ti look not as good performance-wise? It is early March and it is still MIA.

    “For example, the 265 still hasn’t launched although it was targeted for a late February release. (An AMD spokesperson tells me to expect its availability later this week, but it’s worth noting that the majority of tech press hasn’t yet received the card.)”

    I believe that reviewers should not use paper-launched products, which may not even sell at the wishful paper-launch price, for comparison to new, readily available cards.

  11. If this chip is such a marvel of low power consumption, why can’t we buy low-profile cards?

  12. I agree that there should be at least two previous gens’ “popular cards” represented, as those are more than likely the cards that people would be upgrading from. It would give them some data to look at to see if an upgrade is warranted yet.

  13. And if people stop reading because time isn’t being taken to add these?

    I don’t think he’s asking for every card in existence, just the notable ones. I’ve also noticed this trend, but have not said anything recently as when I have it seems like I’m making TR out as this horrible place and get a lot of hate for it.

  14. But who’s to say what is and is not sustainable?

    Personally I think Mantle will eventually go the way of Glide and *coin will be around long after it’s gone. At best it’ll be relegated to something like PhysX is now. Nice feature to tick off a box with, but ultimately irrelevant.

  15. More cards means longer testing, more time, less content outside of these reviews and so less revenue generating content. So it costs more to do it, and just unfortunately might not be worth it due to the constraints.

    That being said you can usually check a variety of reviews to “connect the dots” and trace the performance back relative to the previous gen one you are looking for. Just a little bit more work for the reader.

    I’m sure they want to do it, and in a perfect world they would test every new card against every old one in existence. Sadly it’s not a perfect world and it’s full of compromise :/

  16. Techreport,

    I began reading your site because I loved the great cross-generational comparisons you’d do in your benchmarks. That being said, the 7950 was one of your most highly recommended cards last generation, yet you never cover it in your current benchmarks. This is a single example of a very pervasive issue. Your cross-generational benchmarking is spotty at best these days, as the number of cards you fit into a single article has fallen to 1/4 of what it used to be.

    Please consider this when putting together future articles.

  17. Lol they’re already going crazy with pricing like Intel.

    I wouldn’t expect them to need 400w, remember Maxwell consumes much less power than Kepler. I think they’ll hold the line at 250w for GM200. Heck, I WANT them to, and just throw in tons of transistors. 😀

    Imagine the glory of a full 8-core Haswell-E, with 32GB of DDR4 with a crazy uncompromising GM200 based card!

    I just hope Nvidia drops this whole “Titan” crap.

  18. Miners don’t use Mantle or TrueAudio.

    When game developers look at Steam Hardware survey, they’re not going to see the mining GPUs represented, and thus AMD’s GPUs will be underrepresented.

    As a consequence, the developers will have less incentive to use those features.

    There’s a reason why not much software uses AVX2 compared to SSE2. One of them already has mass adoption; the other one doesn’t, yet.

  19. You are missing the point. Maxwell has some good architectural changes. In the semiconductor world, if you can reduce power by 20-30% but maintain or increase performance, it’s a win. Nvidia will ride Maxwell into their high end in the next 6-8 months and we don’t really know what AMD’s answer will be; that’s the point of the last page. It is possible AMD will even things out with a faster shrink to 20nm, but this is still an unknown. AMD is seriously resource constrained by trying to take on Intel and Nvidia on their core business; this much is becoming clear.

  20. Newegg (Canada) was selling an ECS GTX 660 for $150 and a GTX 660 Ti for $225 for a short while. I had an EVGA 460 and decided to pick the 660 up. It’s a reference design with 2 GB memory and stock clocks. My only experience with ECS was a K7S5A motherboard that I bought years ago and is still running for a neighbour.
    So far – a great video card. It idles at 30 degrees, and Furmark and Unigine Heaven 4 will get it up to 74 degrees without making a lot of noise. It looks like a great bargain.

  21. Generally a company’s aim is sustained profit growth. Mining is potentially offering them a boost right now assuming they aren’t supply constrained anyway. But it may well be a fad which is a dangerous thing to plan around when estimating future demand and therefore production.
    If Mantle and HSA are successful that should offer a more stable platform to grow on.

  22. Doesn’t matter, IMO.

    AMD’s ultimate goal is to sell cards, and their initiatives toward HSA, Mantle or whatever else are just means toward that end. If *coin gets their cards 100% sold out then it’s actually better for AMD than any number of developers adopting Mantle.

  23. Nvidia is going for performance per watt, and that helps them in server, pro, laptop, mobile and cars (this is a big deal if they do the compute on self-driving cars: it’s serious volume, high margins, and each design win generates sales for a few years), and a lot less in desktop. So low power helps them in almost all segments, including compute. Nvidia makes most of its profits from pro and server, so there’s no way they are giving that up.
    AMD changed leadership, fired a lot of people, and most likely we are yet to see what their strategy actually is. It takes a few years to implement a new roadmap, and for now they are just trying to stay alive.

  24. Depends a lot on how power scales on 20nm; they can’t really go 400W or something. And it’s not only AMD, it’s Intel now too vs Tesla, since if they do make a huge chip, Tesla is the primary target.
    Do remember, though, that AMD could have a good chip too, and that wouldn’t be bad at all. We wouldn’t want Nvidia to have room to go crazy with pricing, like Intel.

  25. Here’s the main link to the full TechPowerUp review so people can come to their own conclusions:
    Jihadjoe accurately pointed out that theLionSpig was playing rather fast & loose with his results, and maybe you should look at the full review before jumping to conclusions.

  26. Ford can easily ramp up production, and it has never experienced a situation where demand spiked to the point that car dealerships could get away with doubling the original launch price and still sell out of the trucks.

    AMD can’t readily increase production, because any excess fab capacity at TSMC or GF would’ve already been used. And building more silicon fab plants is a lot harder than adding another assembly line. By the time they do increase production, the mining bubble might’ve already burst.

    Also, AMD needs high market share among potential users for HSA and Mantle to gain adoption.

    But miners couldn’t give a damn about those two features, and with them going “omnomnom” on all of the 290 cards, game/software developers don’t see a need to support either feature due to “low market share”.

  27. TPU (and others) disagree.

    The 750 Ti is far and away the most performant GPU on a per-watt basis.


  28. Haswell had some nice efficiency. Though I heard it was terrible at OCing and thus wasn’t worth upgrading from Sandy Bridge (for desktops).

  29. The 7790 competes against the 750 Ti in terms of performance per watt pretty nicely:

    - Hitman Absolution
    - BF4
    - Metro: Last Light
    - Crysis
    - Crysis 3

    And this is while consuming a few watts less than the 750 Ti:

    - Power consumption: 7790 vs 750 Ti

  30. Those efficiency gains will eventually be useful for speed when the bigger chips come along.

    Remember that GPUs as they are today are pretty much bound by a 250W upper limit where they’ve been for the last couple of generations, so it takes efficiency in order for any speed to be made at the top end.

    Small Maxwell being 2x as efficient as small Kepler means that later on, GM110 can be twice as fast as GK110.

  31. See, even Nvidia wants their cards to be used for mining.

    This “mining is bad for AMD” speculation stuff just has to end. It’s “bad” for them in the same way that selling every F150 to dude bros is bad for Ford. Sure it takes away from the market who actually want to use the truck to do work, but at the end of the day Ford makes its money and doesn’t really care either way.

  32. I’ve got a correction for your post:
    Mantle – letting developers fix our drivers for us, and thus shifting most of the costs from us to somewhere else (while allowing us to blame DX as if it were the problem and not the drivers).

  33. Just note:
    ZeroCore would be of use in systems like mine, which run 24/7, where interaction occurs during the day while at night the monitors are turned off. (Half workstation, half server.)

  34. Yeah, you’re probably right. In fact, I noticed an interesting thing: Core 2 was essentially a mobile design derivative (but upscaled for the desktop), and Intel reorganised that structure and overhauled it into Nehalem. Then it further optimised for both efficiency and performance with Sandy Bridge.

    Fermi, with which Nvidia was chasing compute, was their Nehalem. I feel Maxwell is more of Nvidia’s Sandy Bridge with some Haswell paint (mobile-first approach).

    Kepler gave up too much compute, I think they’ve improved that considerably with Maxwell (like how GM107 was trading blows with GK104 in some benchmarks), thanks to its increased L2, among other things.

    I think the pattern here is very evident. Intel and Nvidia realised that they have very high performance architectures, more so than their ARM counterparts. However, ARM folk get the work done on an efficiency basis.

    Our PC giants realised that if they took their architectures and refocused on optimisation at various levels, they could run over ARM eventually. Obviously, Intel seems to be leaning more on its fabs, but Nvidia is obtaining crazy levels of efficiency and raw performance every generation. Actually, Intel could have as well, if they hadn't shifted focus to APUs. Nvidia doesn't seem too concerned about smaller die sizes, which is how they're getting both performance and efficiency. Though of course, they have embarrassingly parallel workloads to deal with, as opposed to Intel. AMD refuses to make a mobile-first architecture so far, and no, Jaguar doesn't count. Or maybe it does? Maybe they have to simply make a bigger version of Jaguar? I don't know.

  35. Took the plunge and ordered myself the EVGA 750 Ti SC (had a bunch of NCIX gift cards to use up). Looking forward to poking around in their improved video decoder/encoder.

  36. You can tell how good a new architecture is by how desperate fanboys of the other camp are. By that criterion, so far Maxwell looks excellent; at least at 60W and below.
    These made my eyebrows raise in full on Roger Moore style albeit not tessellated. 🙂

    “Overall efficiency surely looks good but component choices can also play a part there. There was a group test of motherboards somewhere and with them all running the same CPU, RAM etc. there can still be a 50W difference from the choices for VRMs etc.”

    “But so far most of the ‘reviews’ have been fairly shallow and a lot of them seem to following Nvidia’s PR ‘review guidelines’. Did I mention that I don’t trust PR and Nvidia’s PR department is one of the shadier ones at that?”

    “But while the Bulldozer ‘bull’ was just the usual PR nonsense (‘of course our product will be great’), what really irks me with Nvidia PR is that they not only make outlandish claims all the time, especially the CEO, with their constant ‘Tegra x will be super’ etc., but rather the way the press is so sycophant that they never question anything in that Nvidia send with their ‘press packs/review guidelines’.”

  37. Was it 50% more efficient at load in terms of performance per watt than anything else?

    Zero Core is great for systems that are left on permanently with little user interaction.
    But if you have a system used in that way maybe consider not using a dGPU?

    What happens beyond GTX 750 Ti is speculation on both sides.

  38. 1. HSA – Promising but let’s see how it plays out.
    2. Mantle – Promising but let’s see how it plays out.
    3. GTX 750 Ti – The actual GPU is 50% more power efficient than ANY other GPU close to its class right NOW.

  39. The amount of attention this new release from Nvidia has gotten is blown out of proportion in every way.
    The following is TR’s take on each of the new technologies that have come out in the past few months:

    HSA: A hardware implementation that allows GPUs to share memory and execution of processes with x64 CPUs without kernel intervention.
    TR’s impression: meh

    The Mantle API:
    -Both high and low level API
    -Reduces CPU overhead by +/-60%
    -Allows developers to write games once that can run on both the new consoles and half the PCs, which makes it more cross platform than Direct3D
    TR’s impression: meh.

    The 750TI: Consumes 30 Watts less than its main competitor, the R265, but is still 10%-15% slower depending on the game while being more expensive.
    TR’s impression: The greatest thing that’s happened this year

    Am I missing something, or is the site starting to feel more like a pregnant widow?

  40. I dunno. Look at how much of an improvement Kepler was over Fermi in the same way. GK104 is smaller than GF114 and yet massively more efficient and faster. They have been focusing on power and efficiency forever, with the biggest blip being G80 (it essentially lacked power management!)

    AMD has some interesting things going on like zero core. Excellent idle power characteristics. I think they will make further strides with whatever they will put out on 20nm and not be far off of Maxwell. Whether AMD can execute on drivers properly for a change is another matter though.

  41. I don’t think that’s the case.
    The closest parallel to Maxwell may be when Intel moved from Netburst to Core 2 Duo.
    Conroe (1st generation C2D) was a mobile architecture repositioned and tweaked for the desktop.
    This time nVidia have deliberately designed an architecture for mobile which scales upwards.
    The different variants of Maxwell will have a different emphasis depending on positioning but overall the key metric was power efficiency.

    It will be interesting to see how well it scales above and below the 750 Ti level for desktop and mobile.
    Easily the most interesting architecture I’ve seen for ages.

  42. I’m with you in the 460 club. Trying to figure out the last GPU I will ever buy for my i7 920 full-size tower.

    But the 760 is crippled too, we have to step up to the 770 to witness the power of this fully enabled chip! (read as Emperor Palpatine, you know the line). It’s pricey, Nvidia is sure getting their benjamins with these – but you know it would be worth it. 🙂

  43. AMD talks a lot about compute because right now they have an advantage there. Will Maxwell change that? Eh, I don’t know. It’ll be interesting to see.

    But the difference between AMD and nVidia in intentions is the same as the difference between AMD and Intel, respectively. AMD is trying to make do with what they have. What they have is not ideal for this newly mobile, newly performance per watt (with an emphasis on the per watt) world, but they’re going to trudge on through because they don’t have the money or manpower to make new architectures to replace either their CPU or GPU designs.

    Thus, they’ll keep kludging what they got into one product, APU’s, and keep screaming how it’s the future. Unfortunately, they were close to what the future would be (CPU’s + GPU’s), but way back when they made the choices that got them to Bulldozer and GCN, they were wrong about what products these APU’s would be going into. They thought it would be small computers, but instead it’s mostly into tablets and smartphones. Back then, they had a vision of people buying smaller desktops and not having to buy separate CPU’s and GPU’s. For PC’s, people are fine buying discrete GPU’s. Even for laptops, discrete GPU’s work fine. It’s tablets and smartphones that really sell the concept. Or NUC-sized PC’s, maybe, but that’s still very niche.

    As a result, they made decisions about performance per watt that require a certain amount of power that they now won’t have. As if that wasn’t bad enough, AMD then found their designs provided far less performance than expected at lower power levels, so they had to “juice” them up with more power than they’d have liked just to get them up to the levels where they’d match (or in the CPU space even remotely come close to matching) their far more agile, focused competitors. Then they had to drop prices just to stay in the discussion because of the ballooning power requirements and subsequent heat/cooling consequences that went along with that.

    So AMD’s focus on computing is a result of their not being suitable in any other class where AMD had hoped they could be. That’s a temporary reprieve while Intel and nVidia adjust their strategy. Think about way back. AMD had the mid-range. Intel eventually swarmed over it to take it when their sales started to drop. Then AMD was secure with the mid-low end. Eventually, Intel–like Galactus–screamed, “WE ARE HUNGRY!” and consumed that space, too. As Intel’s stranglehold on all computing begins to wane, they’re taking more and more of AMD’s pieces of the pie, as small as they are.

    nVidia was once the power hungry, huge-die sporting kid that showed up running hotter than people liked. Then with Kepler they completely switched gears and became the performance per watt king. It didn’t hurt they showed up sporting superior performance than the GCN products of the time with the drivers at the time. (AMD did improve the 7xxx series a few months later, but a lot of damage had been done.) Even so, nVidia still has the highest performing part, too. They got the low end, the performance per watt end, AND the high performance end, too.

    So again, like with Bulldozer, AMD responds by juicing a GPU to the extreme, pricing it lower because of the consequences of a high voltage, high temperature part with a cooler built originally for the 5xxx series, and using that as their response.

    TL;DR: AMD isn’t compute-focused because that’s their vision of the future. They’re focused there because that’s all they have left after Intel and nVidia have gobbled up just about every other part of their respective marketplaces. Expect nVidia to storm the compute beachhead and Intel to gobble up the last of the APU holdouts with more pervasive use of Iris Pro.

  44. I don’t think it’s TR emphasizing efficiency as much as tying in a preview of things to come with the new Maxwell architecture. These articles take a long time to write, so maybe it was instead of, or a primer to, a longer article that talked about the architecture but didn’t have any particular cards it was talking about.

    I understand it seems a little out of place in a “review” article of a graphics card, but it is relevant information that I found very interesting and useful. Now I know to wait and see what comes out next, if I have the patience.

  45. And they could keep the die area the same and go completely crazy with transistors. Heck, they could cram 30 SMMs into the 550mm² that GK110 takes up at 28nm itself, and probably get in another GPC or two at 20nm.

    I did some math: we’re looking at a number over 3824 for the ALU count of GM200. That’s at least 8 billion transistors.

    Of course, I’m assuming Nvidia decides to go full crazy and runs after AMD with all guns blazing.

  46. fhohj mentioned that maybe this will be Nvidia’s “Haswell”, in that there are radical advances in power efficiency and/or performance in the low end, but the high end won’t change much.

    On one hand, I hope this is the case. AMD really needs a break. Maybe the efficiency of the 750 Ti will scare them into creating great things with the R9 390x, while the performance efficiency of the 750 Ti never ends up being realized on GTX 880. AMD could really use a win, and I want them to have one for the sake of competition and so that consumers win, not uncompetitive corporations.

    On the other hand, I want “big” Maxwell to be a home run, partially because 4k monitors need the muscle, and partly because I know which proprietary feature I value more: G-Sync over Mantle, definitely.

    Short term, I want a $200-250 GPU in 6-12 months that will smoke my HD 5870 and that I can use with variable refresh rate. Long term, I want competition in the high end CPU and GPU spaces so Intel and Nvidia don’t stagnate. Because I want a brighter future.

  47. I see it more as this is a new architecture and every new design has new power management techniques and efficiency gains. Lots of precedent here.

    AMD’s power management / efficiency concerns would be in APUs, discrete mobile and in getting the most from the practical power limits at the high-end. I have no doubt that their new architecture will have spiffy new ideas too.

  48. That’s not really the issue. Maxwell was DESIGNED for low power and to scale upwards whereas previous nVidia and all AMD designs are seemingly designed more for performance and to scale downwards. So the mobile parts such as Radeon 8870M and previous generation nVidia designs are compromised by that.
    For desktop Maxwell is a slam dunk in power efficiency at 750 Ti performance levels so it will be interesting to see how it performs for mobile chips and for performance parts especially at 20nm.

  49. What a stupid comparison in that link.
    It’s about 13% faster @ 1080P according to this:
    Considering the much higher wattage and noise figures, that leaves AMD once again fighting on value only, which is not the only metric that matters, at least to me.

  50. Performance per Watt @ 1080P:

    GTX 750 Ti 100%
    GTX 650 Ti 65%
    GTX 650 62%
    GTX 660 57%
    GTX 650 Ti Boost 55%

    So the GTX 750 Ti is over 50% more efficient than all the nVidia cards relatively close to it performance-wise, and likewise for AMD cards.
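The "over 50%" figure follows directly from the relative percentages listed above. A minimal sketch of that arithmetic, assuming the listed values are performance per watt normalized to the GTX 750 Ti (the dictionary and the function name `advantage_over` are illustrative only, not anything from the linked review):

```python
# Relative performance per watt at 1080p, copied from the list in the comment
# above (GTX 750 Ti normalized to 100%).
perf_per_watt = {
    "GTX 750 Ti": 100,
    "GTX 650 Ti": 65,
    "GTX 650": 62,
    "GTX 660": 57,
    "GTX 650 Ti Boost": 55,
}


def advantage_over(baseline: str, reference: str = "GTX 750 Ti") -> float:
    """Return how much more efficient `reference` is than `baseline`, in percent."""
    return (perf_per_watt[reference] / perf_per_watt[baseline] - 1) * 100


# Print the GTX 750 Ti's efficiency advantage over each of the other cards.
for card in perf_per_watt:
    if card != "GTX 750 Ti":
        print(f"GTX 750 Ti vs {card}: +{advantage_over(card):.0f}%")
```

Even against the closest card in the list (the GTX 650 Ti at 65%), the advantage works out to a bit under 54%, which is where "over 50%" comes from.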

  51. I don’t know about deadly, but I suspect it’s a very efficient combination.

    (And probably the combination of my choice, as much as I want to support the underdog.)

  52. I read that as “...picked up the GTX 780 Ti as an upgrade for my HTPC...” and thought, you sir/ma'am have a lot of money to throw around!

  53. I’ll link this page again because it’s relevant to your question – look for the table about 1/3 down the page titled GPU Power Consumption. They compare the 750 ti to the 650 ti Boost.

    From the article: "The GTX 650 Ti Boost hits 65 FPS on 161W of power. The R9 70 pushes 72 FPS on 172W. The GTX 750 Ti’s still-impressive 64 FPS come courtesy of just 121W of power consumption. That’s 98% of the GTX 650 Ti Boost’s performance for 72% of its power consumption."
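For anyone who wants to check the quoted ratios, here is a rough sketch using only the FPS and wattage figures in the quote (the variable and function names are made up for illustration). Working from these two pairs of numbers, the performance ratio does come out at about 98%, but the power ratio lands nearer 75% than the quoted 72%, so the article may have been working from slightly different measurements.

```python
# Figures as quoted above: (frames per second, system power in watts).
gtx_650_ti_boost = (65, 161)
gtx_750_ti = (64, 121)


def percent_of(value: float, baseline: float) -> float:
    """Express `value` as a percentage of `baseline`."""
    return value / baseline * 100


perf_ratio = percent_of(gtx_750_ti[0], gtx_650_ti_boost[0])   # relative performance
power_ratio = percent_of(gtx_750_ti[1], gtx_650_ti_boost[1])  # relative power draw

# Frames per second delivered per watt, for each card.
fps_per_watt_750 = gtx_750_ti[0] / gtx_750_ti[1]
fps_per_watt_650 = gtx_650_ti_boost[0] / gtx_650_ti_boost[1]

print(f"Performance: {perf_ratio:.0f}% of the GTX 650 Ti Boost")
print(f"Power:       {power_ratio:.0f}% of the GTX 650 Ti Boost")
print(f"FPS per watt advantage: +{(fps_per_watt_750 / fps_per_watt_650 - 1) * 100:.0f}%")
```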

  54. I am really glad that Nvidia is developing the entry-level segment. However, there are a lot of good graphics cards around $150, and some of them, like the Radeon R7 265, are significantly faster and are better for 3D rendering. I agree that a price cut would make these products more attractive.

  55. Going to be updating an old Alienware X51 with a 240W PSU; really digging that it doesn’t need a 6-pin plug.

  56. “That would happen with any low-mid tier GPU.”

    Please name the ones that only use PCIe power (no 6-pin power), can work in systems with only a 250 watt power supply and perform in any way decent with recent games.

  57. I’m hoping the power efficiency allows for Big Maxwell to have some base clocks higher than 1GHz.

    Edit: I realize now that some of the mid-range parts have >1GHz clocks, but the high-end pieces (780 Ti, 290X) seem to stay around 800-900 MHz. So, here's hoping for 1.25GHz base clock speeds 🙂

  58. Meadows didn’t know what a barrel shifter does, so I explained what it does (shifts bits left or right), and gave an example of how that capability can be used (multiplying/dividing by powers of 2).
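To make the multiply/divide example concrete, here is a small sketch of what the shift does to an unsigned value. It only models the arithmetic effect in software; a hardware barrel shifter performs a shift by any amount in a single operation. The function names and the 32-bit register width are assumptions chosen for illustration.

```python
def shift_left(value: int, amount: int, width: int = 32) -> int:
    """Logical left shift in a fixed-width register; bits shifted out are dropped."""
    mask = (1 << width) - 1
    return (value << amount) & mask


def shift_right(value: int, amount: int, width: int = 32) -> int:
    """Logical right shift in a fixed-width register; zeros are shifted in."""
    mask = (1 << width) - 1
    return (value & mask) >> amount


# Shifting left by n multiplies by 2**n (as long as nothing overflows the register).
assert shift_left(5, 3) == 5 * 2**3

# Shifting right by n is integer division by 2**n for unsigned values.
assert shift_right(40, 3) == 40 // 2**3
```

The left-shift equivalence only holds while the result still fits in the register; once high bits fall off the end, the product wraps.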

  59. It really would be useful to see the power consumption/efficiency compared with a GTX 650 as well as a 650 Ti. The GM107 is a lot more comparable a class of chip to the GK107 powering the 650, as opposed to the GK106 powering the 650 Ti (the same chip used in the 660, capable of double the performance and power consumption).

    I know the GM107 takes a page out of the GK106’s book with a huge increase in shader cores, but the rest of its capabilities are far closer to just an iterative update of the GK107.

    I suspect the power efficiency improvements aren’t quite so pronounced compared to that chip, as it was already pretty lean. Just would be good to see a direct comparison.

  60. I’m not sure I’d go 265. I’m curious about how the Asus 750ti will OC with the 6 pin and wonder if it has the same TDP limit imposed by Nvidia. For someone who’s been waiting for the perfect APU to come along I think Nvidia may have just made the effort moot. I’d love to see a single slot passively cooled version of these for near silent HTPCs.

    I wholeheartedly agree that if I were AMD, I’d be scared about Maxwell’s potential given how these things are performing at this power level. With 250W to use, as long as it scales well, the high end of these Maxwell cards is going to really be something.

  61. That would happen with any low-mid tier GPU. The impressive part is that a gamer at that level is budget minded enough that replacing the power supply is a serious concern. Not having to do so makes the 750ti a steal over the 250 or 260X + PSU depending on the sales deals you can find.

    I think this is why I’m seeing a small but noticeable amount of complaints about the 750ti not supporting SLI.

  62. AMD has their own mobile parts i.e. 8870M rivals 7770 in performance and it consumes less power.

    7870M consumes about 32 watts and has been tweaked/renamed as 8870M.
    2X 8870M roughly similar to 7950M which is about 50 watts.

  63. “Honestly as of right now I'd get an R7-265 if 1. You aren't obsessed with needing low power; 2. If the R7-265 is actually available at the stated price and not a coin-inflated price; and 3. You are using Windows.”

    Why not get one of the factory overclocked GTX 750 Ti's like the EVGA 02G-P4-3753-KR GeForce GTX 750 Ti Superclocked 2GB and call it a day?

  64. or the EVGA 02G-P4-3753-KR GeForce GTX 750 Ti Superclocked 2GB

    Core Clock: 1176MHz vs 1020MHz (+15.3%)
    Boost Clock: 1255MHz vs 1085MHz (+15.7%)
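Those percentages are just the ratio of the Superclocked card's clocks to the reference clocks listed above. A quick sketch of the calculation (the helper name `percent_increase` is hypothetical):

```python
def percent_increase(new_mhz: float, old_mhz: float) -> float:
    """Return the relative increase from `old_mhz` to `new_mhz`, in percent."""
    return (new_mhz / old_mhz - 1) * 100


# Clock speeds in MHz as listed in the comment: Superclocked card vs reference.
core_gain = percent_increase(1176, 1020)
boost_gain = percent_increase(1255, 1085)

print(f"Core clock:  +{core_gain:.1f}%")
print(f"Boost clock: +{boost_gain:.1f}%")
```

A clock bump of this size is an upper bound on the expected speedup; actual frame rates depend on memory bandwidth and the card's power limit as well.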

  65. I would wait for the 20nm Maxwell versions as those will be both more powerful and be even better in performance per watt.

    I only picked up the GTX 750 Ti as an upgrade for my HTPC that had an aging GTS 450 in it.

  66. What do you mean “artificially high”?

    I just picked up a EVGA 02G-P4-3753-KR GeForce GTX 750 Ti Superclocked 2GB from Newegg for $149.99 after $5.00 rebate card.


  67. Maxwell made OEM desktop upgrades as simple as possible. PC Perspective looked into it and an i5 and A10 system (heck even a Pentium) had a lot to gain gaming-wise just with the addition of a 750 Ti, without doing anything beyond putting the card on the PCI-E slot (that is empty anyway because those systems were using integrated graphics). And those systems typically have 300-350W PSUs as well…

  68. Fry’s has a 750Ti card for $179 after a $10 MIR.

    Edit: And one for 159 after a $10 MIR at the egg. Nice dual fan EVGA card, too.

  69. I get the feeling that AMD and Nvidia have completely different goals.

    – AMD: develop a single, scalable architecture for gaming and computing so that it can be used for the desktop, APU/HSA and the pro market
    – Nvidia: develop a single, scalable architecture for gaming and *mobile* so that it can be used for the desktop, tablet, mobile etc

    Nvidia has clearly said they want to merge the “mobile” and “gaming” lines at the development level, something which makes financial sense if you believe they can capture any part of the mobile market. Tegra has failed to do this until now, but Nvidia still has their eyes fixed on the target…

    AMD is still going after compute, not after mobile.

  70. Scott mentions* that GF114 uses vec32 units (suggesting 1 vec16 + 1 vec32 to give 48 ALUs), while an old AnandTech article (680 review and Kepler intro) suggests 3 vec16 units, out of which one is FP64 capable.

    So... which one is correct? If GF114 did have 1 vec16 and 1 vec32 unit, it would have done 3 warps per GPU clock tick, compared to the original Fermi's 1. If AT is correct, it could do 1.5, which seems more plausible.

    Also, consider this:

    “The SM schedules threads in groups of 32 parallel threads called warps. Each SM features two warp schedulers and two instruction dispatch units, allowing two warps to be issued and executed concurrently. Fermi’s dual warp scheduler selects two warps, and issues one instruction from each warp to a group of sixteen cores, sixteen load/store units, or four SFUs. Because warps execute independently, Fermi’s scheduler does not need to check for dependencies from within the instruction stream. Using this elegant model of dual-issue, Fermi achieves near peak hardware performance. Most instructions can be dual issued; two integer instructions, two floating instructions, or a mix of integer, floating point, load, store, and SFU instructions can be issued concurrently. Double precision instructions do not support dual dispatch with any other operation.”

    GF114 doubles the SFUs, maintains 16 load/store units and has more than one vec16 unit assigned per warp scheduler per clock, so I'm not sure at all how this was done. Possibly one warp got scheduled to a vec32 unit (with 4 SFUs) and the other to the FP64-capable vec16 (with 4 SFUs)?

    *In the GTX 680 review: “(Incidentally, the partial use of vec32 units is apparently how the GF114 got to have 48 ALUs in its SM, a detail Alben let slip that we hadn't realized before.)”

  71. I see. Appreciate the discussion! Thanks.

    p.s. I think the larger L2 may be targeted more at GPGPU stuff than graphics. From the Fermi whitepaper:
    [quote<] One of the key architectural innovations that greatly improved both the programmability and performance of GPU applications is on-chip shared memory. Shared memory enables threads within the same thread block to cooperate, facilitates extensive reuse of on-chip data, and greatly reduces off-chip traffic. Shared memory is a key enabler for many high-performance CUDA applications. ... For existing applications that make extensive use of Shared memory, tripling the amount of Shared memory yields significant performance improvements, especially for problems that are bandwidth constrained. For existing applications that use Shared memory as software managed cache, code can be streamlined to take advantage of the hardware caching system, while still having access to at least 16 KB of shared memory for explicit thread cooperation. Best of all, applications that do not use Shared memory automatically benefit from the L1 cache, allowing high performance CUDA programs to be built with minimum time and effort. [/quote<]

    Yup:

    [quote<] Fermi features a 768 KB unified L2 cache that services all load, store, and texture requests. The L2 provides efficient, high speed data sharing across the GPU. Algorithms for which data addresses are not known beforehand, such as physics solvers, raytracing, and sparse matrix multiplication especially benefit from the cache hierarchy. Filter and convolution kernels that require multiple SMs to read the same data also benefit.[/quote<]

  72. But the largest performance improvements come from the eDRAM cache, which is much bigger than the L2 in Maxwell (compare Iris with Iris Pro)…

    BTW wasn’t Maxwell supposed to have a stacked DRAM cache or something? Or was that for Volta?

    Unified virtual memory for Maxwell, stacked DRAM for Volta.

  73. I double checked, you’re right, you haven’t written SM anywhere in the relevant text… I’m not sure, but it’s possible I was taking one “quad” to imply a group of four units, or one SM, instead of one CPU-core-analogue.

    Sorry about that. :/

  74. Too much explanation??? Are you for realz? Not everything can be covered by an article…

    As for second: Bad idea.

  75. That’s some impressive efficiency, and I love efficiency, but it concerns me that the 750Ti is notably weaker than the card it’s supposed to replace–especially since there’s a big price jump to the next tier. I want a Maxwell that compares well against the 760, and they will no doubt deliver, but at this rate AMD is going to own the $100-$200 GPU market.

  76. That’s plain paranoid. And to think that because a small user base thinks differently they have to be clones just says something about some TR users.

  77. Definitely shoot for a GTX 760 – the 660 and 650Ti Boost can only access 1.5GB of the VRAM. Any time I go over, I get major stutters or fps drops.

  78. True.

    However, a die is typically roughly square in shape, and it is generally understood that its size is an area, since length and width were not explicitly stated and l*w=A.

    Also 148mm would be a little long for a GPU die.

  79. I guess the point to be made is that nVidia is getting the most out of Maxwell right out of the gate, which is why I’m interested in bigger Maxwell parts.

  80. Hey I’m totally serious. You know what, this is one of the better tech buys I’ve ever made. The longest a video card has ever lasted me, at least. I bought it in early 2011 on sale for $169 and it’s been a great card for me. Three years is fantastic, IMO.

  81. Re: overclocking. Interesting. True, you could also overclock the 650Ti-B (or 660), but judging from the power usage, at its max OC the 750Ti only bumps the needle up by about 10W over its stock-clocked wattage. You’re still way, way, way under 660 power levels. So yeah, OC’ing the 750 seems minimally painful in terms of heat/power/noise.

  82. Took some looking, but I finally found it. On Galaxy’s website they only show a low-profile version of the 750 (not the 750 Ti, despite what the pictures on other sites show). At least so far.

    It looks like the Zotac 650 low-profile card in size, though — 1.5x slot width, maybe 2x. If you’re in a tight space, you may not have room for that much width. (So the 7750 may still be the highest powered card you can get in a true single-slot.)

    Edit: forgot to say thanks for pointing it out! Will keep an eye out for it.

  83. Well, I don’t think there’s much argument that the value proposition of a crippled chip is often quite good, because it often results in a bigger drop in price than the drop in performance. (Or the converse; they charge a premium for that small boost.) I have to agree, though, that there’s a difference between crippling it because you have to (yields) and doing it just because you can (segmentation). It gets me right in my sense of efficiency. That’s my inner idealist/perfectionist talking, though.

  84. Of course the reviews can be compared to one another. Heh, the fact that they are different is what makes them interesting to compare.

    My original comment garnered a lot more attention than I expected it to. To rephrase that original comment, Nvidia has caused a distinct shift in the conversation with the GTX 750 cards, and I find the nature of that shift to be interesting to consider. It probably seems unfair that I’m pitting two TechReport reviews against each other in order to examine that shift, but, heh, that’s just the material I have to work with. It’s the best material out there, too; I wouldn’t bother doing the same with any other website.

  85. Nvidia seems to have leapfrogged AMD on the efficiency front with this one. Don’t get me wrong; GCN was awesome when it came out, but when was that… two years ago? AMD has pretty much rested on their laurels ever since (remember when they announced that there wouldn’t be any major refresh to the architecture about a year after GCN came out?), and now Nvidia has shot past. The 290X is obviously making a killing with cryptocurrency mining, but it pretty much holds the line on efficiency rather than moving it forward.

    I’m not gonna defend AMD here, but I guess it’s hard to compete on two fronts with an 800-pound gorilla with gazillions to spend on CPU R&D and a wealthy GPU company who primarily focuses on graphics, all while having a huge debt and not much money to spend.


  87. I think the closest analogue to the first-generation Maxwell in recent memory is the HD-7790. Both products are lower-midrange parts that act as testbeds for future technology. In AMD’s case it was trying out revisions to GCN and in Nvidia’s case it’s a slow-walk into the Maxwell architecture.

    In both cases, I think the architectural changes and design decisions were more interesting than the actual cards themselves.

  88. Galaxy has a low profile version. It’s still actively cooled though, but I doubt the fan makes any discernible noise.

  89. A barrel shifter is used to shift bits to the left or right. A neat trick is to use a barrel shifter to multiply or divide integers by powers of two. For example, let’s say I have the number 6 in binary: to multiply it by 2 you just shift left once, and to divide it by 2 you just shift right once.

    00000110 <–6
    Shift left once
    00001100 <–12

    00000110 <–6
    Shift right once
    00000011 <–3
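
    The same shift trick can be sketched in a few lines of Python. This is only an illustration: the 8-bit register width and the function names are assumptions made for the example, not anything specific to a GPU's barrel shifter. Note that shifting an odd number right rounds the result down.

```python
def shift_left(value: int, places: int = 1, width: int = 8) -> int:
    """Shift left inside a fixed-width register (width is an assumed 8 bits).

    Bits pushed past the top of the register are dropped, as in hardware.
    """
    mask = (1 << width) - 1
    return (value << places) & mask


def shift_right(value: int, places: int = 1) -> int:
    """Shift right; the low bits fall off, so odd values round down."""
    return value >> places


six = 0b00000110
print(f"{six:08b} <- {six}")
print(f"{shift_left(six):08b} <- {shift_left(six)}")    # multiply by 2
print(f"{shift_right(six):08b} <- {shift_right(six)}")  # divide by 2
```

    Shifting by n places multiplies or divides by 2 to the power n, which is why a shifter that handles any shift amount in one step is handy for integer-heavy workloads.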


  90. I was going to point out page 11 is pretty much there just to make Nvidia look good, but I thought it wasn’t worthwhile.

    TR puts a lot of emphasis on efficiency, which is something I don’t think matters most of the time for desktops, but it’s a stat they choose to hold in very high regard. So you do get articles like this where they get squealy over a 30W reduction in power while playing games. Mobile devices are one thing, desktops are something completely different.

    I personally would be much happier with that gain in efficiency being spent on speed, for desktops, but perhaps that’s just me. With the rise of 120Hz/144Hz panels and ultra-refined gaming (those that play competitively, aspire to, or look for ever increasing fluidity), more frames will never be enough.

  91. A “minor” clockspeed bump on the GPU and RAM could easily account for most of that increase in power consumption…

  92. I do wonder about coin mining with these things. The numbers are very nice and everything, but I have no idea what a “barrel shifter” even [i<]does[/i<].

  93. Yeah, if I could get put on a list for when a half-height version comes out I would be so happy. My kid’s compact PC has a GT 430 or whatever (I think a 7750, which was $100 back in the day, would be taking the power supply to the limit, but price was a factor back then and isn’t now.) It does OK with Source games, but one of these cards in half-height would allow for the settings to be cranked.

  94. I don’t like crippled either, but I still bought a 460 two years ago. I also bought a crippled 760 this month and the jump was huge – more than double the performance. The full 770 was too expensive for not enough of a performance boost in my opinion.

  95. [quote<]Plus if you don't like what I say,ignore it. Simple as that.[/quote<] Fascinating: Posts egregiously long-winded rants complaining about what TR posted in a rather normal story about a rather pedestrian video card. The rants might be longer than the actual story at this point. Then... when someone calls out his absurdity.. his only advice is to ignore what he says. Apparently he is immune to his own advice since he clearly does not ignore what TR posted. Yup... smells like a completely inconsistent and potentially unhinged spigzone alias to me.

  96. (TL;DR all comments) These launch prices are likely artificially high (as they usually are from both manufacturers). Given the die-size efficiency of Maxwell, I bet Nvidia is making some healthy profit margins from the list price that they can tap into for future price cuts.

  97. To be fair, it was the 1GB SE model (256 bit memory, but fewer SPs), but if I’m going to be accurate, I should mention it was $110 after a $40 MIR. (And a darn near silent EVGA one no less.)

    Yeah, I occasionally saw some deals after rebates on the Boost, but none that good!

  98. I like how in 2001 that dinky heatsink/fan is described as “It looks tough.” In 2014 it looks the exact opposite.

  99. Thanks for taking the time to reply to my post.

    It just worried me that your figures indicated one thing, and what you said might be suggesting something else, but thx for the clarification.

    The GTX750TI is an efficient card, but considering Pitcairn Pro is nearly two years old, it’s not doing that badly. It is after all a bit faster overall and consumes a bit more power at the wall, which means it is not a slam dunk for the GTX750TI. Here in the UK, the R9 270 and GTX660 (as low as £120 to £125) are the same price as the non-reference GTX750TI cards, for example. The reference ones are slotting right into HD7850/R7 265 level pricing here (around £110 to £115).

    The problem is AMD has essentially just rebadged most of their old GPUs, probably to save money.

    However, since they are at a process node disadvantage for CPUs, they are probably working on improving the efficiency of GCN, which you could argue is AMD’s Fermi step. Yet from Fermi came Kepler and Maxwell, which are both derivatives.

    Plus I get the impression AMD is very tight-lipped about their GPUs; I can remember the HD4000 series launch years ago, for example.

    I expect the same from AMD in light of what we see with the leaks of the Kaveri successor, Carrizo, which is still 28NM but appears to have a reduced max TDP of 65W:

    [url<][/url<] That is from an AMD presentation. Kaveri maxed out at 95W. Since they are going with wider cores, it means there are only two areas where power consumption can be reduced - chipset and GPU. We saw where the improved power consumption of Kaveri came from in many reviews - the IGP.

    I expect, looking at the HD6970 to HD7970 transition, the R9 290X will be short-lived. Both the HD6970 and R9 290X were released due to process node stagnation.

    Even then, look at the R7 260X and the HD7790. Both are the same GPU, with the former having only a minor 100MHZ GPU and RAM clockspeed bump, with Trueaudio enabled. But interestingly the HD7790 consumes far less power, being close to an HD7770 power consumption level, but being faster. The R7 260X jacks up power consumption massively, it seems. It makes me wonder whether those DSP cores are not only adding die area but bumping up power consumption when active.

  100. [quote<]Plus if you don't like what I say,ignore it. Simple as that.[/quote<] [i<]That's[/i<] not how the interwebs work, you know thaaat.

  101. I am not whatever random member you are talking about BTW, and I am sure TR can find out which country I am from. If the author wants to verify who I am, then I can PM them.

    Moreover, it seems that just because someone does not agree with your view, you want to stifle their ability to comment.

    Maybe instead we need to see what aliases you are under then??

    Plus if you don’t like what I say,ignore it. Simple as that.

  102. Either they didn’t drop far enough or if it did, the deal sold out before I could act. The best deal I heard about was a $135 2GB 650 Ti Boost (after MIR of course). My friend got it, but I didn’t manage to snag it.

    $150 for the 460 is impressive… Mine was right after launch at launch MSRP $230. Of course, it’s holding up quite well at 1080p which makes it less urgent.

  103. My point with the link/graph is to help explain the question posed in the OP, why one would focus on efficiency for these reviews … Showing that the 750 ti is in a different race entirely.

    Yes the performance per dollar is disappointing, but is easier to fix than performance per watt.
    a) Price drop. It would need to be $120 to match the 650Ti-B ratio. $130 would be fair (IMO).
    b) Overclock. Results from other reviews are quite promising, getting to at least 650Ti-B performance with a negligible power consumption increase.
    [url<][/url<] [url<][/url<] (Of course you could argue you can OC the 650Ti-B...)
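
    The $120 figure in (a) can be checked with a little arithmetic. This sketch assumes the numbers quoted elsewhere in the thread (a $150 launch price, and the 650 Ti Boost delivering about 25% better performance per dollar in TPU's tests); the function name is made up for the example.

```python
def price_to_match(current_price: float, rival_advantage: float) -> float:
    """Price at which a card's performance per dollar equals a rival's.

    rival_advantage is the rival's perf-per-dollar lead as a fraction,
    e.g. 0.25 means the rival is 25% better per dollar at current_price.
    """
    return current_price / (1.0 + rival_advantage)


# Assumed inputs taken from the thread: $150 price, 25% perf/dollar deficit.
target = price_to_match(150.0, 0.25)
print(f"Price needed to match: ${target:.0f}")
```

    With those inputs the break-even price works out to $120, which lines up with the estimate above.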

  104. Damage… I’m not sure that “new sign-ups” is an accurate description of Spigzone’s newest aliases.

    The real question should be: Is Spiggy actually aware of the fact that alientorni and GAMER4000 are both his accounts, or does he have split personalities that don’t know of each other’s existence?

  105. First of all, welcome to The Tech Report. It’s always nice to see new sign-ups contributing to the conversation.

    Second, I am the author of the article in question. I’m sorry it confused you. Let me explain some things.

    The 150W figure is from AMD. That’s the rated TDP of the R7 265. TDP is not a measure of typical power use while gaming, but a maximum that determines how PSUs, chassis, and cooling solutions must be built. TDP is therefore a significant factor in determining what solution to buy and install in a system.

    In the second statement, I am talking purely about AMD’s response to upcoming variants of the Maxwell architecture. As I said, we don’t know much about AMD’s roadmap going forward. Based on what we’ve seen from Maxwell, just at 28 nm, AMD is going to need an architectural refresh to stay competitive with Nvidia’s future Maxwell-based chips.

    That’s what I meant. It’s informed speculation. I don’t see why that should be perceived as “overexcited” or somehow offensive. Seems well-grounded, given everything. Here’s hoping AMD has something new and promising in the pipeline.

    Again, thanks for signing up and taking the time to post. I’m happy to clarify further if needed. Sorry if any language barrier or anything kept my meaning from being clear. One never knows, in my position, where exactly the audience is coming from.

  106. Has the author of the article even looked at his own figures in the summary??

    I am seriously confused by some of his statements, in light of looking at the review figures!

    “The reference GTX 750 Ti is 5.75″ long, doesn’t need an auxiliary power input, and adds no more than 60W to a system’s cooling load. The R7 265 is over eight inches long, requires a six-pin power input, and draws up to 150W of juice, which it then converts into heat.”

    However, let’s look at the power consumption in Crysis 3:

    [url<][/url<] The difference between the GTX750TI 2GB and R7 265 is 35W at the wall. The worst thing is the HD7850 can run off a low-power PSU anyway. All the systems are consuming under 250W at the wall with a socket 2011 based system. Anyone with a socket 1155 IB or 1150 Haswell system will probably be drawing under 200W at the wall. So where is all this 150W rubbish coming from??

    The author seems a tad overexcited:

    "I'm not sure what AMD can do to answer other than drop prices. Heck, I don't think we know much of anything about the future Radeon roadmap. AMD seemingly just finished a refresh with the R7 and R9 series. Looks like they're going to need something more than another rehash of GCN in order to stay competitive in 2014."

    What refresh?? They are all rebadged older cards - many of the R7 265 cards are old HD7850 ones with a new BIOS and sticker. So, what about all the other Nvidia cards then?? The GTX660, GTX760 and GTX770 - what, are they now suddenly not worth buying? They are all old GPUs tarted up like what AMD did. Only the R9 290 series is new.

    Look at the time period between the HD6970 and HD7970: one year, and people should look at the dates of the TR reviews. The HD6970 was produced because 32NM was cancelled. The R9 290X was produced because 20NM is late. It's most likely the R9 290X will be replaced late next year, unless 20NM is uber expensive. Even then, that would affect Nvidia too. The entire Nvidia product stack is still Kepler, and if the author looked at the process roadmaps, most of the TSMC early 20NM production is going to Apple.

    PS: People can downrate this all they want, but it just shows they have not even read what I said or even the figures in the review.

    Just to repeat:

    [url<][/url<] Where is this 150W figure??

    Can TR clarify why their review shows a 35W difference, and yet the reviewer pulls some random 150W number? Is it because the MAXIMUM power you can get from a single PCI-E power connector and the slot is 150W, i.e., 75W each?? Did the author mix up board power and TDP? They are not the same. It only means the R7 265 can use up to 150W if required, which it NEVER does under any gaming condition. I assume people realise that it has nothing to do with the actual power consumption?? After all, if the HD7850 2GB/R7 265 was drawing 150W, then how come there was an OEM SLOT powered R7 265:

    [url<][/url<] Wow! That must be a magical PCB they use there, cutting power consumption in half!! It isn't, that is why. 😉
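
    The "75W each" arithmetic can be written out as a small sketch. It assumes the commonly cited limits of 75W from the PCIe slot and 75W from a six-pin connector, as stated in the comment; the constant and function names are invented for the example. The result is a ceiling on what a board may draw, not a measurement of what it does draw in a game.

```python
SLOT_LIMIT_W = 75      # assumed power available from the PCIe slot
SIX_PIN_LIMIT_W = 75   # assumed power available per six-pin connector


def max_board_power(six_pin_connectors: int) -> int:
    """Upper bound on board power for a card with the given connectors."""
    return SLOT_LIMIT_W + six_pin_connectors * SIX_PIN_LIMIT_W


print("Slot-powered card ceiling:", max_board_power(0), "W")
print("One six-pin card ceiling: ", max_board_power(1), "W")
```

    A slot-only card therefore tops out at 75W and a single six-pin card at 150W, while the measured system-level gap discussed in this thread is far smaller than the difference between those ceilings.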

  107. The next page, performance per dollar, shows what’s disappointing. The card it replaced at that price of $150 gives 25% better performance-per-dollar, based on TPU’s tests. Yeesh.

  108. Wow, you two, at least I feel like I’m in good company now. I’m in the same boat — built my current system with a 460, and I’ve had my eye on a 2GB 650 Ti Boost or 660 for a while, but they never quite dropped far enough. And now they’re gone. I’m just not impressed (!) with the price/performance improvement in that time. Yeah, I can spend $150 for a 750 Ti, but it’s far too small of a bump versus the 460 I got for that price three years ago.

    Next best option was probably a 760, I suppose, but those haven’t dropped one cent from $250 since they launched last June, and I’m not that desperate yet. I guess I’ll just sit on my 460 until the situation improves, whether through some eventual price drops or more fleshing out of the Maxwell range…..

  109. Don’t you find yourselves giving too many explanations? You should listen to your readers more than just replying.

  110. NVIDIA biased review/report #n
    It’s a really nice GPU, don’t get me wrong, but here are some text lines that make me sick.

  111. I’ve stuck with AMD graphics for the past ten years. Especially with the HD4000 series onwards, they seem to have offered a nice balance of price/performance/watt. And one of the reasons, apart from cost, why I’ve stuck with HDx6xx or HDx7xx cards is because they only require one or no 6-pin aux power. I have to say, however, that with Maxwell, I just might switch over to the green team if I find myself looking for a new graphics card any time soon.

  112. Very awesome – this was how I got into PC gaming. Had a $300 Dell and popped in an HD 5670 – went from playing Halo PC at 30fps to Crysis on High at 30fps.

  113. No, I just meant I knew eventually it would get to that level. I didn’t expect it to get there right away, but I also didn’t expect it to still not make it.

    I’ve wanted to upgrade my card for well over a year, and I was just waiting until I could get the performance I want at the price I want. The 460 is OK for now, I guess, but I’ve been ready for a big jump.

  114. Yeah, I’m just not a fan of that, other than to deal with GPU dies that can’t pass validation. Academic at this point anyway; Ti Boost cards are largely unavailable.

  115. [quote<]Edit: Just realized I don't know what you mean when you say we have "a slower card getting a recommendation because it uses less power." Did that happen?[/quote<]Actually, no, that part didn't happen; heh.

  116. The linked article from 2001 is worth a look! Power measurements are MIA, and it even references the OMM crate article. Blast from the past, indeed.

  117. I wrote the R7 260X review, so I kinda feel obligated to respond.

    I don’t think these two reviews are comparable. Yes, the R7 260X was more power-hungry than the 7790 under load by about 16W, but both cards are the same size, both cards require 6-pin PCIe power connectors, and the difference in noise levels can be chalked up to the nice, dual-fan Asus cooler of my 7790 sample. Fundamentally, the R7 260X and 7790 belong to the same class of product. They’re aimed at budget or mid-range enthusiast desktops with a single 6-pin PCIe power connector coming from the PSU.

    The GTX 750 Ti, as Scott explained in his review, doesn’t belong to that same class. It has a shorter circuit board, doesn’t require an auxiliary power connector, and draws at least 36W less under load (54W if you compare it to the GTX 650 Ti Boost) than its peers. As Scott said, the GTX 750 Ti can go in places that those other cards can’t, such as very small-form-factor systems or low-power Steam boxes. Considering that the card’s GM107 GPU was specifically architected for low power consumption, I think highlighting those facts in the review makes perfect sense—irrespective of whether they were played up in Nvidia’s presentation.

  118. There’s a reason why I immediately knew which 600-series card the 460 mapped to… I’m in the same boat with a 460 and have watched the prices of the 650 ti boost and 660 closely and was targeting the 2GB version of the former for $140.

    The 650 ti boost [u<]1 GB[/u<] rarely went below $150 (a couple $130 sales were sold out immediately and I missed out) and the 2 GB hovered around $170, so it's simply wishful thinking to expect a 660 for $150 without these new cards "undercutting them" (OC'd that is). Now that the 750 ti is out, with OC'd versions being like $10 more and close to 660 performance, those 660s might drop.

    Anyways, your timeline is unrealistic... If the 660 is 18 months old and had a $229 price tag and you've "been waiting for GTX 660 (vanilla) performance to drop down into the $150 range for [b<]well over a year now[/b<]" (emphasis mine), that means you expected the 660 to go from $229 to $150 in less than 6 months...

  119. A couple of points:

    First, on power, you’re right that the differences in measured system power between the R7 260X and 750 Ti aren’t that large and that they’d likely be smaller with a 7790, if we tested it the same way. Don’t forget, though, that we’re measuring typical power in a game, not peak power like you’d use for TDP–and we’re measuring full system power, not just board power.

    Our measurements provide easily the most relevant numbers for end users, but TDP matters because it dictates the engineering requirements for cooling, board power supply, system PSU, and cooler/enclosure volume. So you have to account for both. We tried to do that here.

    In the future, we may try measuring GPU power alone by probing each card, but that’s a lotta work. We may also use something like FurMark to test peak power draw, as well.

    Second, on the review format, this was the first truly new GPU architecture we’ve reviewed in two years, so the format and focus were a little different than your typical review of a new chip derivative. We also included efficiency scatter plots in our last review of a new GPU architecture:

    [url<][/url<] We didn't include them in anything earlier because we hadn't thought of them yet. 🙂

    And yes, unabashedly, in architecture reviews in particular, if a chip architect tells me his team focused on one key metric in building a new architecture, we're absolutely going to look into it. In this case, it doesn't hurt that power is absolutely the #1 constraint in GPUs these days. See this, for instance:

    [url<][/url<] We are learning how much power consumption bubbles to the top over time. It's taken longer in GPUs than in CPUs, but here we are.

    Edit: Just realized I don't know what you mean when you say we have "a slower card getting a recommendation because it uses less power." Did that happen?

  120. I’ve seen quite a bit of chatter about large caches being beneficial in bandwidth limited scenarios. Intel IGPs have a 512KB L3, a connection to the main CPU L3 and on top of all that the optional eDRAM.

    I’m sure this Maxwell core is bandwidth limited in many situations.

    The latency improvements could also be a sign of cache performance improvement beyond a size increase, and/or memory controller changes.

  121. I see where you’re coming from, my gut reaction was the same. Once I saw that they added it to try and predict the performance of future Maxwell based cards it made sense.

    Here’s an analogy: BMW launches a new engine technology in their 3-series cars that makes them extremely fuel efficient for their power output. Extrapolating from that, you could predict what kind of performance you might see in their bigger 7-series or M series, where fuel efficiency matters less and you can send a lot of fuel into that extremely efficient motor.

  122. I think it’s due to Nvidia’s claims that it was “2x more efficient per watt” (or “just as fast at half the power”), which needed to be investigated. When either AMD or Nvidia play up a new feature, that will be focused on during the review (and, if measurably important, subsequent hardware reviews).

    Not to mention, they qualified the recommendation by stating it was a good choice for space/power constrained systems.

  123. ‘Crippled’? Why, just because some functional units are disabled to meet a price or performance point?

  124. Yeah, seems like it would. I’d just like to not buy a crippled GPU. Also, they seem to be drying up and 750Ti is not really fast enough to replace it.

  125. See, I think the reason for the different focus from one review to the next is because of how the cards are presented to TechReport by both Nvidia and AMD. This is nothing sinister on TechReport’s part, but it does show that TechReport’s evaluations are malleable in some ways. What I think is that TechReport should develop its own criteria for these evaluations and those criteria should not be subject to influence from outside forces.

    For instance, if power consumption is so important, it shouldn’t be important just because Nvidia decided to design a very power efficient card. Instead, what we have here is a slower card getting a recommendation because it uses less power. That’s pretty astonishing! In that light, with the launch of the R7 260X, TechReport should have said something like, “Do not buy this card! It is faster, yes, but it uses more power”.

  126. Did I say it’s one warp per SM instead of one per quad? Where is that in the text?

    The Maxwell SM should be able to output 1 warp/clk per quad, or 4 warps/clk per SM.
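
    The per-quad and per-SM figures multiply out as follows. This is a rough sketch: the warp size of 32 comes from the Fermi whitepaper quoted earlier in the thread, the four quads per SM follow from the numbers in this comment, and the function name and defaults are illustrative only.

```python
WARP_SIZE = 32  # threads per warp, per the whitepaper quote in this thread


def sm_throughput(quads_per_sm: int = 4, warps_per_clk_per_quad: int = 1):
    """Return (warps per clock, thread lanes per clock) for one SM."""
    warps_per_clk = quads_per_sm * warps_per_clk_per_quad
    return warps_per_clk, warps_per_clk * WARP_SIZE


warps, lanes = sm_throughput()
print(f"{warps} warps/clk per SM, {lanes} thread lanes per clock")
```

    With four quads each retiring one warp per clock, that comes to 4 warps and 128 thread lanes per clock for the SM.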

  127. The best thing about these Maxwell cards is that they overclock well. The stock version of the 750 Ti is close to, if not the same as, a 7850 in performance. This right here should blow our minds. A 128 bit 60W card with the same performance as a 256 bit 130W card. If you get factory overclocked cards like the PNY or Palit, you get about the same performance as the 265.

    Can’t wait to see what Nvidia does at 20 nm.

  128. True, there’s a bit of a bump, but my 460 vs a GTX 660 suddenly becomes something like a 50-80% boost in average frame rates depending on the game. That card launched 18 months ago at $229, I was hoping it’d get closer to $150 by now. Cheapest on Newegg and Amazon seems to be $190.

  129. Kepler was a complete reworking of Fermi, but they are still very similar. Nvidia themselves call Kepler “an enhanced Fermi.” The biggest difference between the two architectures is that Fermi actually has better compute capabilities and Kepler uses more, smaller cores that use less power. I imagine Maxwell is really just an enhanced Kepler with other tweaks here and there. Still, it’s obviously working for them in the gaming and mobile markets. I wonder if AMD will take the same route and remove the compute capabilities on their consumer cards in order to avoid another Fermi fiasco. Their Hawaii cards seem to be on that path.

  130. This is probably the best illustration of efficiency (going by perf/watt) out of all the reviews…
    [url<][/url<] The 260x efficiency is much in line with its peers...

  131. Only a small part of encoding/decoding depends on the bit rate, and that portion is very easy to do in fixed function hardware that can be made to speed it up a great deal. The real heavy lifting comes further down the pipeline and has little to do with the bitrate of the video.

  132. Usually the bottleneck is encode, not decode. Their encoding engine is fairly fixed function and has few knobs that will change its performance or quality output, so they can pretty confidently say that at x resolution they are y times faster than real time.

  133. I’m more curious about what this product means for the near future than how it performs in this review right now, to be honest. The prices may be high, but this kind of efficiency could be interesting when it comes time to roll out the enthusiast-class derivatives a few months later.

    Also curious what AMD has planned in response when that time comes; it should make for exciting reading if they also have something up their sleeves.

  134. I agree, the price-performance metric is disappointing unless the use case is power constrained (HTPC or upgrading a PC with a poor PSU). Especially considering the hefty reduction in die size.

    Prices seem to have stagnated these last couple years. I hope we see some movement with the move to 20nm.

  135. Re: GTX 460 performance…

    A (non-boost) 650 Ti has more or less the same performance.
    Edit: Definitely on the more side, except for compute.
    [url<][/url<] Seeing as the normal 750 beats the 650 Ti, the 750 Ti is definitely giving you a jump over the 460.

  136. Truly disappointing in some ways. Price basically stagnant and actually less performance relative to GTX 650Ti Boost, and there appears to be a huge gap in performance between the 750Ti and the 760. I’ve been waiting for GTX 660 (vanilla) performance to drop down into the $150 range for well over a year now and I’m not really any closer now than I was when the card launched in 2012.

    Yeah, the power consumption is great, but if I’m not gaining performance on my now-ancient Fermi GTX 460, then I’m not really getting anywhere. Excited for a larger Maxwell part. Not very excited about the prices that nVidia’s going to charge, if this price/performance is any indicator.

    This can only mean good things for mobile graphics, though. Thinner, lighter notebooks with better graphics relative to GK107 parts.

  137. Those new efficiency graphs are really interesting. They really demonstrate how irrelevant specifications are compared to real world performance. At the same time you’ve used them masterfully to predict the performance of future Maxwell cards. It will be interesting to see how those predictions hold up and if this efficiency scales well across the product range.

  138. I think the biggest reason for the focus is that it draws power only from the PCIe slot while actually being performant enough even in the heaviest games (beating the 260X with nearly half the TDP).

    The strongest card before this was the HD 7750 at half the frames.

    And no node change. If this was 20nm, it would be expected.

  139. All that and the fact that “Architecture Efficiency” was introduced with the release of “NVIDIA’s Geforce GTX 750 and 750Ti review”. There was no such chart in the Radeon HD 4870 review IIRC, or in the 5870 review.

  140. Funny thing: I went back and looked at the R7 260X review. The R7 260X is an HD 7790 with nothing but a modest bump in core and memory clocks with an attending hike in power consumption (16 watts measured by TR, 30 watts by TDP). The HD 7790 would be closer to overlapping the GTX 750 on the power and performance plots. In light of that, it’s hard to know how to assess this review’s reception of the GTX 750 series cards vis-a-vis the reception of the R7 260X when it was first reviewed. The reviews pretty clearly prioritize power consumption differently from one review to the next – it hardly garners any attention at all in the R7 260X review. That’s not to say which is better, just noting a difference in scoring where there ideally would be none.

    With that out of the way, and looking at the GTX 750 series on its own, what Nvidia has done with power consumption is impressive.

  141. Because they have a CEO with eggs, and he knows what they produce. Can you imagine AMD’s CEO talking about the technical details of Radeons or graphics?

  142. Yes. My guess is that it helps with storing certain relatively small and critical pieces of data so that they don’t have to get transferred back & forth between the GPU and video RAM as often. Examples could include compiled shader programs that are relatively small and that often get used over & over again between frames. The cache would be acting more as an instruction cache than a data cache at that point.

    Maybe (big maybe) you could cache compressed vertex data for some models that get re-used in multiple frames of video output. Relatively speaking, the vertex data is small in comparison to giant textures.

  143. Ah, I see, very interesting. But then… how is it helping them, bandwidth- or latency-wise? 2MiB would be too small for a frame buffer as well, I suppose?
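
    The frame buffer question can be checked with quick arithmetic. A minimal sketch, assuming a single 1920×1080 color buffer at 4 bytes per pixel (32-bit color is an assumption here, not something stated in the thread); the function name is made up for illustration:

```python
MIB = 1024 * 1024  # bytes in one mebibyte


def framebuffer_bytes(width: int, height: int, bytes_per_pixel: int = 4) -> int:
    """Size of one uncompressed color buffer (assumes 32-bit color by default)."""
    return width * height * bytes_per_pixel


fb = framebuffer_bytes(1920, 1080)  # 8,294,400 bytes
cache = 2 * MIB                     # the 2MiB cache size mentioned above

print(f"frame buffer: {fb / MIB:.2f} MiB")
print(f"fits in a 2 MiB cache: {fb <= cache}")
```

    Under those assumptions one 1080p color buffer comes to roughly 7.9 MiB, so it would not fit in 2MiB, and that is before counting a depth buffer or any multisampling.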

  144. There are lots of repetitive steps [b<]but[/b<] each individual chunk of data tends to pass through the GPU pipeline once and then get discarded until the pipeline starts over again. Think about rendering a 3D model: You process the vertex data, apply shader effects, rasterize, do post-rasterization processing and textures and then push the finished product out to a video display buffer. Then you start the whole process over again for the next frame. The sheer size of data sets that are used in modern graphics mean that even a very large GPU cache can't hold too much data between frames of video. GPUs definitely try to keep big memory-consuming objects that are used repeatedly in the video RAM instead of trying to feed them over the PCIe bus over & over again. So there definitely is memory management and caring about locality, but it tends to be more in video RAM than in an on-die cache that would never be big enough to hold large sets of data in modern graphics setups.

  145. Interesting, really. I thought that graphics processing involved a lot of similar, repetitive steps at the low level… I mean, is that not what one would want, lots of cache hits?

  146. The power efficiency of Maxwell is extremely impressive and it’s not even on a new process node. The 20nm Maxwell GPUs designed for higher power draw are going to be incredible.

  147. That’s what I had thought, but wouldn’t transcoding time depend on the size of the file (which would ultimately depend on the bitrate, as bits/sec * sec = total bits)?

    I would think a low bitrate video would transcode faster, or is this decoding/encoding process independent of input file size?
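
    The bits/sec * sec relation can be sketched as below. The two bitrates are hypothetical values picked only to show how file size scales for the same running time; they are not figures from the review:

```python
def file_size_bytes(bitrate_bps: int, duration_s: int) -> int:
    """Total stream size: bits/sec * sec = total bits, then 8 bits per byte."""
    return bitrate_bps * duration_s // 8


two_hours = 2 * 60 * 60  # 7200 seconds

# Hypothetical bitrates, chosen for illustration only.
low = file_size_bytes(2_000_000, two_hours)   # 2 Mbit/s
high = file_size_bytes(8_000_000, two_hours)  # 8 Mbit/s

print(low / 1e9, "GB at 2 Mbit/s")
print(high / 1e9, "GB at 8 Mbit/s")
```

    The running time is identical in both cases; only the byte count changes, which is why the question of whether encode time tracks file size or playback length matters.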

  148. [quote<]2. Nvidia says x-times faster than "real time". What do they mean by "real-time"?[/quote<] They mean video playback time, so 6x faster means that every minute of video takes 10 seconds to encode, a 2 hour movie would take 20 minutes, etc.
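
    The same arithmetic as a small sketch. The 6x figure is just the example multiplier used above, and the function name is made up for illustration:

```python
def encode_seconds(video_seconds: float, speedup: float) -> float:
    """Wall-clock encode time for a clip encoded at `speedup` times real time."""
    if speedup <= 0:
        raise ValueError("speedup must be positive")
    return video_seconds / speedup


one_minute = encode_seconds(60, 6)          # seconds to encode one minute of video
two_hours = encode_seconds(2 * 60 * 60, 6)  # seconds to encode a two-hour movie

print(one_minute, "seconds for one minute of video")
print(two_hours / 60, "minutes for a two-hour movie")
```

    Note that the input is playback length, not file size, which matches the point that "real time" refers to video playback time.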

  149. There is quite a bit of technical information.

    I think the biggest standout in the layout of Maxwell comes down to the very large (for a GPU) cache. GPUs have traditionally not done a whole lot of caching since they stream data from the memory in high-latency high-throughput workloads. Good examples are graphics (of course) and lots of number crunching numerical applications that don’t care about latency.

    The cache definitely helps with reducing latency [much as it does in CPUs], but from the writeup it also appears to be there as a power saving feature to reduce the requirements for heavy GDDR5 memory I/O. That’s a very interesting architectural change.

  150. Good to see the power efficiency numbers! That’s what makes these cards interesting, since obviously these ain’t high-end parts.

    Honestly as of right now I’d get an R7-265 if 1. You aren’t obsessed with needing low power; 2. If the R7-265 is actually available at the stated price and not a coin-inflated price; and 3. You are using Windows.

    However, as a demonstrator of Maxwell’s potential future capabilities, I think these cards are very interesting and big Maxwell should be quite potent when it finally comes out [TSMC!]

    [Edit: I’m going to take it as a badge of honor that AMD fanboys are so stupid that they downthumbed this post even though I had favorable things to say about the R7-265]

  151. [quote<]The barrel shifter returns in Maxwell, and Nvidia claims the GM107 can mine digital currencies quite nicely, especially given its focus on power efficiency.[/quote<] Basically, that's easily the most negative thing I've heard about this card. I do *not* want price gouging on Maxwell like we are seeing with Radeons!

  152. I have never spent so long on the first page of any GPU/CPU review. Now I have a dozen tabs open with block diagrams, whitepapers and older articles on Kepler and Fermi from TR, AnandTech and Real World Tech.

    Thanks, Scott! 😀

    edit: 2 questions:

    1. Why only one warp per SMM, and not one warp per vec32 unit (four per SMM)?

    2. Nvidia says x-times faster than “real time”. What do they mean by “real-time”?

  153. The MSRP of the 290X hasn’t changed, it’s not AMD charging $1000, it’s Newegg, Amazon etc that are making a killing.

  154. not a plug, PCper has benchmarks with low cost off the shelf PCs.
    the type that a relative would buy you, because you like computers. :/
    this card improved FPS from 8 to 9.5X iirc. older pc + 750Ti = 1080p
    gaming for people budget constrained.

    what a card.

  155. Given the 750 naming scheme I suspect that we won’t see another Maxwell GPU until the end of the year. Kind of like the Bonaire-Hawaii launch last year.

  156. Big Maxwell on 20 nm would be nice.
    Also, according to toms, Maxwell GPUs are a major improvement in crypto-mining area. I’m guessing that could alleviate some of AMD’s issues?

  157. I’m interested to see what AMD has to offer in their rumored “Pirate Islands” GPUs that are the successor to the Hawaii (or was it Volcanic?) GPUs.

  158. [url<][/url<] cudaminer benchmarks for ya - very interesting that the 750ti can just about match the R9 270 in KHash per watt. Thoroughly enjoyed this review... very interesting times ahead!

  159. I’ll be honest, this is the innovation in GPUs that I have been wanting to see for ages. Ever since my first build with an X1900XTX had massive overheating problems, I’ve always found power efficient cards to be super sexy.

    Granted efficiency is always getting better with each generation, but it has been a long time since I’ve seen something like this debut. Maybe that has more to do with the fact that they released lower end cards first.

    Either way, I’m sold. Easily found a replacement for my 5850. For $150 bucks. Wow!

  160. I think the GTX 750 Ti would be a very appealing card in combination with a cut down AMD 35W Kaveri with the GPU disabled. If AMD sold it for $50, I would expect it would be much faster than the A10-7850K for a similar price and similar power usage.

    The AMD options are less interesting because they have higher power consumption.

  161. it is interesting to see the new driver used on the bench for the geforce gtx 650 ti boost. It performs more strongly with it. In the other reviews that are around the 750 ti is able to best it to a fair degree, so that much performance is attributable to driver improvements.

    It is also quite interesting to note that disabling the chip ever so slightly in the 750 seems to trip it up a bit with regards to bf4. I wonder if that is indicative that the new arch is highly integrated and brooks no alteration without a tradeoff to match it. wonder if that jumpy performance difference will persist in higher versions.

    this architecture is a big step forward in the efficiency direction no joke. but big gains don’t just magically show up. if you design to move forward in one direction then you stand to move away from another. that’s just common sense. I wonder then, if nvidia has led with this card because it puts their new architecture in a better light? perhaps maxwell is mobile centric. nvidia’s haswell perhaps? maybe then, nvidia is pulling the same thing AMD did with the a8 7600. maybe the performance increases don’t scale well up to the higher end. perhaps the higher level cards won’t be as impressive against their preceding competition as these things. they do have that higher memory bandwidth though. can’t see that not being impressive with a wider bus. and some of these maxwell cards are clocked really high while maintaining relatively low power usage. I’m interested in the higher maxwells now, though and which way that goes to lean. perhaps GCN, going forward, will remain the more desktop focused arch. doesn’t really matter though if maxwell is just better. a lot of interesting things here.

    as for Origins, I don’t think it’s indicative of a problem with AMD that they haven’t gotten it to work just yet. it’s been designed against them. there are a handful of games like that. don’t you just love it when they interfere with the games themselves?

    good stuff good review.

  162. I would say laptop. They might want to have the notebook version of this chip in laptops very soon, while in desktop it is replacing products based on bigger dies. In other segments Maxwell doesn’t offer much of an upside on 28nm, but in laptops power matters, so chances are the GeForce GTX 860M and w/e else they release based on GM107 should do well until 20nm arrives.

  163. Since what really matters is what will arrive on 20nm, it is rather exciting.
    Seems that 15 SMX on 20nm might be close to 200mm2 and match the vanilla Titan in perf. Next year 3 SMX in Tegra might be reasonable enough (and maybe mobile Volta can match the Xbox One a bit further down the road). Guess we’ll have to wait and see how good the new process actually is.
    Low RAM needs might also be a rather good thing; 2-3 GB less for higher end cards would enable a lower retail price.
    The low power also seems to enable nice clocks so all in all plenty of things to like here.

    Would be cool if AMD can compete, at least in perf per die size, and I do hope they don’t delay the transition to 20nm too much, for lower costs.

  164. “Nvidia claims the GM107 can mine digital currencies quite nicely, especially given its focus on power efficiency.”

    I think that might actually help out AMD by helping them lower the price of the 290X down from nearly $1000… Unless Nvidia anticipated the crypto-mining boom and ordered a large amount of GPU dies to take advantage of it.

    TR, you got another benchmark to do. Bit/lite-coin the Maxwell GPU.

  165. What was Nvidia’s reasoning for introducing Maxwell in a mid-range GPU?

    Are they trying to clear out the older Kepler GPUs such as the 780 Ti and the recently introduced Titan Black Edition before introducing Maxwell to the upper end?

  166. I hope someone produces one of these bad boys in a low form factor silent form. That would be wonderful!