On the question of high-performance CPUs
All of the talk about not pushing "the bleeding edge" on process tech and not trying to eke out the last few bits of performance—along with the lack of emphasis on traditional high-end desktop and server CPUs—left us wondering about AMD's intentions for its x86 processors. It's one thing to deemphasize an area where competing is difficult and quite another to quit contending there. Several of us asked questions related to this topic in an attempt to gauge AMD's commitment to pushing forward on x86 performance.
The Bulldozer microarchitecture, obviously, has some performance issues. Worryingly, AMD's message at the time of the FX processor's introduction was that future Bulldozer-based processors would see 10-15% performance gains each year, starting with Piledriver. Given where Bulldozer has started, that plan now looks like a recipe for failure, in light of Intel's recent trajectory. Encouragingly, when asked about Bulldozer's prospects, Papermaster pulled out the 10-15% estimate without being prompted and disputed it: "We need more than that. We'll get more." Also, although the 2012-2013 products are "in delivery mode," he hinted at the possibility of a new socket in the next generation of server CPUs.
One of Bulldozer's big weaknesses right now is its performance in individual threads. The CPU does relatively well on some broadly multithreaded workloads, but its IPC (and thus performance) in each thread is often relatively poor. David Kanter attempted to tease out the new CTO's thoughts on single-threaded performance by asking what sort of gap with Intel is acceptable. 15%? 30%? More? Papermaster wasn't willing to give us a number, but to our relief, he didn't attempt to argue that single-threaded performance is unimportant, as some of AMD's marketing folks have been doing. Instead, he said he refused to give a number because he "didn't want to cede anything" to his development team. In other words, it looks like AMD will continue to push ahead on this front, even with the change in business strategies.
In the wake of the FX processors' release, some folks began speculating about whether AMD had abandoned its traditional use of custom logic design for broader use of logic synthesis. The speculation was fueled by Bulldozer's apparent inefficiencies and especially by the massively inflated Bulldozer transistor count AMD supplied to the press. This question is also something of a hot topic because, to one degree or another, most semiconductor companies are employing logic synthesis more extensively over time. When asked for his take, Papermaster acknowledged that AMD's x86 CPU cores (outside of Brazos) have been "highly customized up to this point" and said that, historically, there has been a "huge gap" between custom and synthesized logic. However, he asserted that the gap has "come down significantly" and that a new emphasis on synthesis will begin affecting AMD's roadmap in 2014.
One final tool that AMD will use to maximize its potential as a supplier of both CPUs and GPUs—and of products with both elements on a single chip—is something it calls the Heterogeneous System Architecture, or HSA. HSA has replaced "Fusion" in AMD's lexicon, but happily, it's a much more specific thing, with real technology behind it and a vision for realizing the potential of APUs.
Fundamentally, HSA is a software development target platform intended to allow applications to take advantage of both CPU and GPU computing resources on "converged" chips like AMD's APUs. HSA has several components, including a virtual ISA, a memory model, and a system specification. The ISA, known as HSAIL, is conceptually similar to the PTX ISA in Nvidia's CUDA, which provides a stable, fairly low-level programming target that still allows major changes in GPU architectures over time. HSAIL instructions will be translated into true machine code by a just-in-time compiler provided by the hardware vendor. Unlike CUDA, though, the HSA memory model and system specification will take into account the capabilities of APUs and other SoCs whose CPUs and GPUs can share the same memory.
HSA differs from familiar names like OpenCL and C++ AMP in that it is a lower-level platform, one that could even serve as a compile target for apps written in OpenCL. As we understand it, HSA's goal is to make a common virtual machine with easy access to all available computing resources. AMD expects it to be programmed just like current SMP systems, with "seamless" access to CPU and GPU execution resources using the same basic syntax. The abstraction layer should handle the details of what gets processed where, bringing CPU and GPU computing resources to bear on the data as appropriate.
Crucially, HSA will be ISA agnostic not just for the GPU, but for the CPU, as well—so an application written for HSA could run just as well on an x86/Radeon combination like Trinity as on, say, an ARM/Imagination Tech combination in a tablet.
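To make that layering a bit more concrete, here is a toy sketch of the general idea: one program expressed in a small virtual instruction set, lowered by two different vendor-style "finalizers" that execute it in different ways yet produce the same result. This is purely illustrative. The opcodes, the function names (`finalize_scalar`, `finalize_wide`), and the tuple-based instruction format are all invented for this example; none of it is HSAIL syntax or any real AMD or HSA interface.

```python
from typing import Callable, List, Tuple

# Hypothetical toy "virtual ISA": each instruction is (opcode, constant).
# This is an illustration only, not HSAIL.
Instr = Tuple[str, float]
Kernel = Callable[[List[float]], List[float]]

SUPPORTED_OPS = {"add", "mul"}


def check(program: List[Instr]) -> None:
    """Reject programs that use opcodes outside the toy instruction set."""
    for op, _ in program:
        if op not in SUPPORTED_OPS:
            raise ValueError(f"unknown opcode: {op}")


def finalize_scalar(program: List[Instr]) -> Kernel:
    """Stand-in for a CPU vendor's JIT: runs every instruction on one element at a time."""
    check(program)

    def run(data: List[float]) -> List[float]:
        out = []
        for x in data:
            for op, k in program:
                x = x + k if op == "add" else x * k
            out.append(x)
        return out

    return run


def finalize_wide(program: List[Instr]) -> Kernel:
    """Stand-in for a GPU vendor's JIT: applies each instruction across the whole array."""
    check(program)

    def run(data: List[float]) -> List[float]:
        vals = list(data)
        for op, k in program:
            if op == "add":
                vals = [v + k for v in vals]
            else:
                vals = [v * k for v in vals]
        return vals

    return run


# The same virtual program, finalized for two different "devices".
program = [("mul", 2.0), ("add", 1.0)]
data = [1.0, 2.0, 3.0]
cpu_result = finalize_scalar(program)(data)
gpu_result = finalize_wide(program)(data)
print(cpu_result == gpu_result)
```

The point of the sketch is only that the application ships the virtual program once, and each hardware vendor decides how to turn it into real work. A real platform would also have to specify the memory model and system behavior, which this toy ignores entirely.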
AMD hopes to turn HSA into an open, industry-wide standard. To that end, the company has established a foundation much like the ones that govern other standards, and it has invited other hardware, software, and OS developers to join. So far, the firm says it's hearing good things from its customers, but we're not aware of any companies that have joined yet. Assessing the prospects for such an effort is notoriously difficult, but if it somehow takes off, HSA could become an incredibly important standard, perhaps the first to allow CPU-GPU convergence to begin realizing its potential in consumer applications—while undermining a host of competing standards, everything from CUDA to x86. If not, well, as HSA point man Manju Hegde explained to us, it could still be a useful tool for enabling development on AMD platforms.
AMD has published a roadmap for HSA, which is interesting because it suggests what capabilities will make it into future APU generations. The addition in 2012 of "bi-directional power management between CPU and GPU," for instance, should be a Trinity feature. It looks like AMD will be progressively exposing features as it incorporates them into its APU hardware over time. Also, notice that HSA won't be extended to support discrete GPUs until 2014. Initially, this effort is very much about taking advantage of APUs and the ease of programming made possible by shared memory in APUs and other SoCs.