A brand-new x86 processor microarchitecture doesn't come along every day, but today, we have just that. We visited the offices of VIA Technologies' processor subsidiary, Centaur Technology, Inc., yesterday to learn about its new x86-compatible processor architecture. Remarkably, Centaur President Glenn Henry and his team of fewer than 100 people have created a thoroughly modern x86 processor microarchitecture from scratch over the course of the last four years. The resulting design, which bore the code name CN during its development and is also known as the VIA Isaiah microarchitecture, is a superscalar 64-bit processor with speculative, out-of-order execution.
As a new design, Isaiah's overall set of capabilities and features reads more like Intel's Core microarchitecture than anything else, but Centaur has aimed Isaiah at the same set of targets its C7 and prior CPUs have sought. That means low power consumption and low cost come first, with performance taking a back seat. Yet by moving from the C7's simple, in-order architecture to a brand-new core with speculative, out-of-order execution, Centaur has the potential to deliver quite a bit more performance within its chosen set of constraints. Henry says the firm set the goal of delivering twice the C7's performance at the same clock speed and within the same power envelope. VIA now claims Isaiah has two to four times the C7's per-clock throughput, depending on the application.
That order of performance gain (achieved by what Henry described as "real man's architecture" rather than process technology optimizations or a move to multiple cores) may sound promising, but Centaur is careful about setting expectations for the performance of Isaiah-based processors. The first implementations will be single-core chips topping out at around 2GHz, mainly intended for embedded applications, ultra-mobile PCs, very low-cost desktop PCs, and so-called mobile Internet devices. Isaiah's mission is to add new capabilities and new instruction sets (like x86-64, SSE3, and virtualization extensions compatible with Intel's VM provisions) in order to enable such devices to run newer applications competently.
Centaur isn't shy about taking on the competition in the markets it serves, though. In our meeting, Henry stated flatly that, based on what little we know about it, Intel's Silverthorne processor won't be as fast as Isaiah since it's an in-order design. He observed with seeming bemusement that Intel was developing an in-order architecture for this market just as Centaur was moving to an out-of-order design.
Regardless of its target market, the Isaiah microarchitecture's feature set impresses. Henry has authored a reasonably accessible architecture brief on Isaiah, even (quite comically) mapping Isaiah microarchitectural features to Intel marketing names like "Wide Dynamic Execution" in a series of footnotes. This brief reveals almost everything VIA has chosen to disclose about Isaiah to date, and I'd encourage reading it if you want the full scoop on this architecture. We will, however, discuss some of the highlights of the design. Here's a quick logical overview of Isaiah:
From this altitude, Isaiah looks very much like any modern x86 processor. Isaiah can decode three x86 instructions per cycle, which it translates into micro-ops for internal execution. Like Intel's Core, Isaiah can fuse multiple micro-ops into one, and it can combine multiple x86 instructions (like a compare-jump pair) into a single micro-op.
The chip can then issue as many as seven micro-ops per cycle to its seven execution ports. Those ports include two ALU ports for integer math, what Centaur calls a store address port and a store data port, one load port, and two media ports. The first media port is 128 bits wide and handles floating-point addition, SIMD integer operations, and divides and square roots. The second media port handles both integer and floating-point multiplication. According to Henry, "single-precision multiplies are fully pipelined with a world-record latency of three clocks." Another interesting touch: this unit has a combined multiply-add capability used by more complex x86 instructions like transcendentals that are handled via Isaiah's microcode subsystem.
Micro-ops are executed out of program order, and they're then retired in program order at a rate of up to three per clock, which equates to as many as three x86 instructions.
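To put the widths quoted above side by side, here's a back-of-the-envelope sketch in Python. The port counts and the decode and retire widths come from Centaur's description. The function names, the dictionary layout, and the simplifying assumption of one micro-op per x86 instruction are ours, so treat the result as a rough upper bound rather than a performance claim.

```python
# Execution ports as described by Centaur; the dict layout is illustrative.
EXECUTION_PORTS = {
    "integer ALU": 2,
    "store address": 1,
    "store data": 1,
    "load": 1,
    "media": 2,
}

DECODE_WIDTH = 3                             # x86 instructions decoded per cycle
ISSUE_WIDTH = sum(EXECUTION_PORTS.values())  # micro-ops issued per cycle
RETIRE_WIDTH = 3                             # micro-ops retired per cycle


def sustained_width(decode, issue, retire):
    """Narrowest stage bounds steady-state throughput.

    Assumption (ours): one micro-op per x86 instruction.
    """
    return min(decode, issue, retire)


def min_cycles(n_instructions, width):
    """Lower bound on cycles, ignoring stalls, cache misses, and dependencies."""
    return -(-n_instructions // width)  # ceiling division


width = sustained_width(DECODE_WIDTH, ISSUE_WIDTH, RETIRE_WIDTH)
print(ISSUE_WIDTH)              # 7
print(width)                    # 3
print(min_cycles(1000, width))  # 334
```

Under these assumptions, the seven-wide issue stage isn't the limiter; the three-wide decode and retire stages are. The extra issue width presumably helps the scheduler drain bursts of ready micro-ops.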
Isaiah also features new and distinctive multi-stage branch prediction logic and a "memory disambiguation" capability, similar to that of the Core microarchitecture, that can move loads ahead of stores if there are no dependencies.
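The basic idea behind moving loads ahead of stores can be sketched in a few lines of Python. This is a toy model, not Centaur's mechanism: real hardware has to predict whether a load depends on an earlier store before all addresses are known, while this sketch assumes every address is known up front. The `hoistable_loads` function and the sample addresses are hypothetical.

```python
def hoistable_loads(ops):
    """Return indices of loads that alias no earlier store.

    ops is a list of (kind, address) tuples in program order, where
    kind is "load" or "store". Assumes exact-address matching with
    all addresses known in advance (a simplification).
    """
    stored_addresses = set()
    hoistable = []
    for index, (kind, address) in enumerate(ops):
        if kind == "store":
            stored_addresses.add(address)
        elif kind == "load" and address not in stored_addresses:
            # No earlier store wrote this address, so the load could
            # safely execute ahead of those stores.
            hoistable.append(index)
    return hoistable


program = [
    ("store", 0x100),
    ("load", 0x200),   # independent of the store above
    ("store", 0x300),
    ("load", 0x100),   # depends on the first store
    ("load", 0x300),   # depends on the second store
]
print(hoistable_loads(program))  # [1]
```

Only the load from 0x200 can jump the queue here; the other two must wait for the stores that feed them.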
Isaiah's cache subsystem looks to make the most of its likely modest overall size with uncommon smarts. Isaiah's L1 instruction and data caches are both 64KB in size and 16-way set associative. The L2 cache is similarly 16-way associative, and initial implementations will be 1MB in size, although Henry indicated that different L2 sizes are a distinct possibility. The L2 cache is exclusive, so it doesn't duplicate the content of the L1 caches, effectively raising the total capacity of the cache hierarchy. Like other current CPUs, Isaiah uses predictive algorithms to examine data access patterns and prefetch some data directly into its L1 cache, but uniquely, it prefetches data less likely to be used into a smaller, dedicated buffer rather than the L2 cache.
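The arithmetic behind the exclusive-cache claim is easy to check. The short Python sketch below uses the cache sizes and associativity given above; the 64-byte line size is our assumption (Centaur's figure isn't stated here), and the inclusive case is modeled in the simplest way, with every L1 line duplicated in the L2.

```python
KB = 1024


def effective_capacity(l1i, l1d, l2, exclusive):
    """Unique bytes the hierarchy can hold under a simple model."""
    if exclusive:
        return l1i + l1d + l2  # no line lives in both L1 and L2
    return l2                  # strictly inclusive: L1 contents are duplicated in L2


def num_sets(size_bytes, ways, line_bytes):
    """Number of sets in a set-associative cache."""
    return size_bytes // (ways * line_bytes)


L1I = L1D = 64 * KB
L2 = 1024 * KB
WAYS = 16
LINE_BYTES = 64  # assumption, not from Centaur

print(effective_capacity(L1I, L1D, L2, exclusive=True) // KB)   # 1152
print(effective_capacity(L1I, L1D, L2, exclusive=False) // KB)  # 1024
print(num_sets(L1D, WAYS, LINE_BYTES))                          # 64
print(num_sets(L2, WAYS, LINE_BYTES))                           # 1024
```

In this model, the exclusive arrangement buys back the 128KB that a strictly inclusive L2 would spend mirroring the two L1 caches, a gain of 12.5% over the 1MB L2 alone.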
Centaur sounds positive about the performance prospects of the first Isaiah-based chips, but Henry said they're "infinitely smarter" as a result of having the first real chips back from the fab. The team is now able to watch what's happening inside the chip as it operates in a way they could not during simulation, allowing them to make smarter decisions about things like queue depths and buffer sizing. Centaur is currently working on tuning the architecture and expects to achieve performance gains in future Isaiah-based products. New architectures do tend to afford opportunities for optimization; even Intel saw some nice performance gains when moving from Merom to Penryn.