At the Fusion Developer Summit last June, AMD CTO Mark Papermaster teased Kaveri, AMD's next-generation APU due later this year. Among other things, Papermaster revealed that Kaveri will be based on the Steamroller architecture and that it will be the first AMD APU with fully shared memory.
Last week, AMD shed some more light on Kaveri's uniform memory architecture, which now has a snazzy marketing name: heterogeneous uniform memory access, or hUMA for short.
Current APUs have non-uniform memory access (NUMA) between the processor and graphics logic. In those solutions, the CPU cores and IGP are both tied to system memory, but they each have their own separate memory pools. The processor cores must jump through hoops to access memory being used by the graphics hardware, and vice versa. Different heaps and different address spaces are involved, and when data needs to be shared, it has to be copied back and forth between the CPU and IGP pools. There is, as you'd expect, a performance cost to all those intermediate steps.
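To make that copy overhead concrete, here is a minimal conceptual sketch in plain Python. It is not AMD code or any real driver API: the function name, the byte-doubling workload, and the use of `bytearray` objects as stand-ins for the separate CPU and IGP pools are all assumptions made for illustration.

```python
# Conceptual model only: plain Python buffers stand in for the separate
# CPU and IGP memory pools. Nothing here is an AMD or driver API.

def run_with_separate_pools(cpu_data):
    """Double every byte on the 'IGP', copying between pools both ways.

    Returns the result (a new buffer in the 'CPU pool') and the number
    of bytes that had to be copied to get the work done.
    """
    bytes_copied = 0

    # Step 1: copy the input out of the CPU pool and into the IGP pool.
    igp_pool = bytearray(cpu_data)
    bytes_copied += len(igp_pool)

    # Step 2: the 'IGP' can only work on data in its own pool.
    for i in range(len(igp_pool)):
        igp_pool[i] = (igp_pool[i] * 2) % 256

    # Step 3: copy the result back so the CPU can read it.
    cpu_result = bytearray(igp_pool)
    bytes_copied += len(cpu_result)

    return cpu_result, bytes_copied
```

In this toy model, every byte the graphics hardware touches crosses between the pools twice, once in each direction, before the processor sees a result.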
In Kaveri, hUMA takes away the hoops: the processor cores and integrated graphics have a shared address space, and they share both physical and virtual memory. Also, data is kept coherent between the CPU and IGP caches, so no cycles are lost to synchronization, as they are in current, NUMA-based solutions. All of this should translate into higher performance (and lower power consumption) in general-purpose GPU compute applications. Those applications tap into both the CPU cores and the IGP shaders and must pass data back and forth between them, which would require extra steps without hUMA. AMD said Kaveri's hUMA architecture has been implemented entirely in hardware, so it should support any operating system and programming model. Virtualization is supported, as well.
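The shared-address-space idea can be sketched the same way. Again, this is a conceptual model in plain Python rather than anything AMD has published: the function names are hypothetical, and a single `bytearray` stands in for memory that both the CPU cores and the IGP can address directly.

```python
# Conceptual model only: one Python buffer stands in for memory that the
# CPU cores and the IGP both address directly. Not an AMD or driver API.

def igp_double_in_place(shared):
    """The 'IGP' works on the very same buffer the CPU handed over."""
    for i in range(len(shared)):
        shared[i] = (shared[i] * 2) % 256


def run_with_shared_memory(shared):
    """Double every byte on the 'IGP' without copying anything.

    Returns the same buffer object that was passed in, plus the number
    of bytes copied between pools (always zero in this model).
    """
    bytes_copied = 0

    # Passing the buffer is like passing a pointer: no data moves.
    igp_double_in_place(shared)

    return shared, bytes_copied
```

The work done on the data is the same in both cases; what disappears is the pair of copies. The sketch says nothing about cache coherency, which the hardware would have to handle for this pattern to be safe on a real chip.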
Will hUMA mean CPUs and discrete GPUs can share a unified pool of memory, too? Not quite. When the question came up during the briefing, AMD said hUMA "doesn't directly unify those pools, but it does put them in a unified address space." The company then stressed that bandwidth won't be consistent between the CPU and discrete GPU memory pools—that is, GDDR5 graphics memory will be quicker, while DDR3 system memory will lag behind, so some hoop-jumping will still be required. (As an interesting side note, AMD then added that "people will be able to build APUs with either type of memory [DDR or GDDR] and then share any type of memory between the different processing cores on the APU.")
Kaveri is due out in the second half of 2013—likely late in the year, judging by AMD's latest processor roadmap. In addition to Steamroller CPU cores and hUMA, Kaveri will feature integrated graphics based on the same Graphics Core Next architecture as current Radeon HD 7000-series GPUs. Also, the chip will be fabbed on a 28-nm process, finer than the 32-nm process used to manufacture today's A-series APUs (a.k.a. Trinity and Richland).