On the question of high-performance CPUs
All of the talk about not pushing "the bleeding edge" on process tech and not trying to eke out the last few bits of performance—along with the lack of emphasis on traditional high-end desktop and server CPUs—left us wondering about AMD's intentions for its x86 processors. It's one thing to deemphasize an area where competing is difficult and quite another to quit contending there. Several of us asked questions related to this topic in an attempt to gauge AMD's commitment to pushing forward on x86 performance.
The Bulldozer microarchitecture, obviously, has some performance issues. Worryingly, AMD's message at the time of the FX processor's introduction was that future Bulldozer-based processors would see 10-15% performance gains each year, starting with Piledriver. Given where Bulldozer has started, that plan now looks like a recipe for failure, in light of Intel's recent trajectory. Encouragingly, when asked about Bulldozer's prospects, Papermaster pulled out the 10-15% estimate without being prompted and disputed it: "We need more than that. We'll get more." Also, although the 2012-2013 products are "in delivery mode," he hinted at the possibility of a new socket in the next generation of server CPUs.
One of Bulldozer's big weaknesses right now is its performance in individual threads. The CPU does relatively well on some broadly multithreaded workloads, but its IPC (and thus performance) in each thread is often relatively poor. David Kanter attempted to tease out the new CTO's thoughts on single-threaded performance by asking what sort of gap with Intel is acceptable. 15%? 30%? More? Papermaster wasn't willing to give us a number, but to our relief, he didn't attempt to argue that single-threaded performance is unimportant, as some of AMD's marketing folks have been doing. Instead, he said he refused to give a number because he "didn't want to cede anything" to his development team. In other words, it looks like AMD will continue to push ahead on this front, even with the change in business strategies.
In the wake of the FX processors' release, some folks began speculating about whether AMD had abandoned its traditional use of custom logic design for broader use of logic synthesis. The speculation was fueled by Bulldozer's apparent inefficiencies and especially by the massively inflated Bulldozer transistor count AMD supplied to the press. This question is also something of a hot topic because, to one degree or another, most semiconductor companies are employing logic synthesis more extensively over time. When asked for his take, Papermaster acknowledged that AMD's x86 CPU cores (outside of Brazos) have been "highly customized up to this point" and said that, historically, there has been a "huge gap" between custom and synthesized logic. However, he asserted that the gap has "come down significantly" and that a new emphasis on synthesis will begin affecting AMD's roadmap in 2014.
One final tool that AMD will use to maximize its potential as a supplier of both CPUs and GPUs—and of products with both elements on a single chip—is something it calls the Heterogeneous System Architecture, or HSA. HSA has replaced "Fusion" in AMD's lexicon, but happily, it's a much more specific thing, with real technology behind it and a vision for realizing the potential of APUs.
Fundamentally, HSA is a software development target platform intended to allow applications to take advantage of both CPU and GPU computing resources on "converged" chips like AMD's APUs. HSA has several components, including a virtual ISA, a memory model, and a system specification. The ISA, known as HSAIL, is conceptually similar to the PTX ISA in Nvidia's CUDA, which provides a stable, fairly low-level programming target that still allows major changes in GPU architectures over time. HSAIL instructions will be translated into true machine code by a just-in-time compiler provided by the hardware vendor. Unlike CUDA, though, the HSA memory model and system specification will take into account the capabilities of APUs and other SoCs whose CPUs and GPUs can share the same memory.
HSA differs from familiar names like OpenCL and C++ AMP because it is a lower-level platform, even a possible compile target for apps written in OpenCL. As we understand it, HSA's goal is to make a common virtual machine with easy access to all available computing resources. AMD expects it to be programmed just like current SMP systems, with "seamless" access to CPU and GPU execution resources using the same basic syntax. The abstraction layer should handle the details of what gets processed where, bringing CPU and GPU computing resources to bear on the data as appropriate.
Crucially, HSA will be ISA agnostic not just for the GPU, but for the CPU, as well—so an application written for HSA could run just as well on an x86/Radeon combination like Trinity as on, say, an ARM/Imagination Tech combination in a tablet.
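To make the virtual-ISA idea a bit more concrete, here is a toy sketch in Python. It is purely illustrative: the opcodes, the two "targets," and the `finalize` and `run` helpers are all invented for this example and have nothing to do with real HSAIL syntax or any AMD tool. The point is only that one intermediate program can be lowered to different native instruction sets while keeping the same meaning.

```python
# Toy illustration only: the opcodes and targets below are invented,
# not real HSAIL or any vendor's machine code.

# A "virtual ISA" program: each entry is (opcode, dest, src_a, src_b).
PROGRAM = [
    ("mul", "t0", "a", "b"),
    ("add", "out", "t0", "c"),
]

# Two hypothetical vendor back ends, each with its own native mnemonics.
TARGET_X = {"mul": "XMUL", "add": "XADD"}
TARGET_Y = {"mul": "fmul.y", "add": "fadd.y"}


def finalize(program, target):
    """Translate virtual instructions into a target's 'machine code' (strings here)."""
    return [f"{target[op]} {dst}, {a}, {b}" for op, dst, a, b in program]


def run(program, inputs):
    """Reference semantics of the virtual program, independent of any target."""
    ops = {"mul": lambda x, y: x * y, "add": lambda x, y: x + y}
    regs = dict(inputs)
    for op, dst, a, b in program:
        regs[dst] = ops[op](regs[a], regs[b])
    return regs["out"]


if __name__ == "__main__":
    print(finalize(PROGRAM, TARGET_X))
    print(finalize(PROGRAM, TARGET_Y))
    print(run(PROGRAM, {"a": 2, "b": 3, "c": 4}))
```

The application ships only `PROGRAM`; each vendor supplies its own translation step, which is roughly the role the article attributes to the just-in-time compiler.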
AMD hopes to turn HSA into an open, industry-wide standard. To that end, the company has established a foundation much like the ones that govern other standards, and it has invited other hardware, software, and OS developers to join. So far, the firm says it's hearing good things from its customers, but we're not aware of any companies that have joined yet. Assessing the prospects for such an effort is notoriously difficult, but if it somehow takes off, HSA could become an incredibly important standard, perhaps the first to allow CPU-GPU convergence to begin realizing its potential in consumer applications—while undermining a host of competing standards, everything from CUDA to x86. If not, well, as HSA point man Manju Hegde explained to us, it could still be a useful tool for enabling development on AMD platforms.
AMD has published a roadmap for HSA, which is interesting because it suggests what capabilities will make it into future APU generations. The addition in 2012 of "bi-directional power management between CPU and GPU," for instance, should be a Trinity feature. It looks like AMD will progressively expose features as it incorporates them into its APU hardware over time. Also, notice that HSA won't be extended to support discrete GPUs until 2014. Initially, this effort is very much about taking advantage of APUs and the ease of programming made possible by shared memory in APUs and other SoCs.