Jon Stokes over at Ars Technica has done a fine job of explaining what makes the Cell processor so unique. The Cell, formally announced yesterday by partners IBM, Sony, and Toshiba, is the much-hyped wunderchip that will power the Playstation 3 (alongside an NVIDIA GPU). Mr. Stokes illuminates the Cell's unique attributes, which include eight specialized SIMD execution cores coupled to a single 64-bit PowerPC core:
To sum up, IBM has sort of reapplied the RISC approach of throwing control logic overboard in exchange for a wider execution core and a larger storage area that's situated closer to the execution core. The difference is that instead of the compiler taking up the slack (as in RISC), a combination of the compiler, the programmer, some very smart scheduling software, and a general-purpose CPU [is] doing the kind of scheduling and resource allocation work that the control logic used to do.

I don't want to make too much of this, because there's much to learn yet about Cell and there are some real differences here, but those eight SIMD execution units sure remind me of the guts of a modern GPU, which is also a parallel SIMD machine. (I've been on this GPU-CPU collision course kick for a while now.) Here's how Stokes describes a Cell SIMD execution core, or SPE:
The actual architecture of the Cell SPE is a dual-issue, statically scheduled SIMD processor with a large local storage (LS) area. In this respect, the individual SPUs are like very simple, PowerPC 601-era processors.

I believe NVIDIA's NV4x pixel shader unit is also a dual-issue SIMD machine that operates on 128-bit vectors of four 32-bit elements, also known as pixels (or fragments). Like I said, there are no doubt real differences between the two types of processing units, but the similarity is substantial, and GPUs are becoming more and more general in their computational capabilities.
The main differences between an individual SPE and an early RISC machine are twofold. First, and most obvious, is the fact that the Cell SPE is geared for single-precision SIMD computation. Most of its arithmetic instructions operate on 128-bit vectors of four 32-bit elements. So the execution core is packed with vector ALUs, instead of the traditional fixed-point ALUs.
Adding SIMD processing power a la Cell looks to me like a better means of capitalizing on growing transistor counts than the more conventional multi-core CPU approach that Intel and AMD are taking. At the very least, it seems obviously better once you get past two conventional CPU cores, when diminishing returns on additional general-purpose cores will become a problem. That's the trouble with the approach outlined in the recent Microsoft patent app: the second, third, and fourth CPUs are being asked to do work that might be better handled by a parallel SIMD machine.