Making it work in hardware
While it may not be possible to disable an Optimus system's discrete GPU completely, the scheme's routing layer otherwise offers automatic power on demand. Intelligently activating the discrete GPU is a markedly different approach from requiring users to manage their systems' switchable graphics state, one that should allow mainstream users to reap the benefits of a switchable GPU without even knowing that they have one.

To make Optimus switching truly seamless, additional modifications were necessary on the hardware front. About the only thing that hasn't changed here is the no-power state that the discrete GPU slips into when it's not in use. As with earlier switchable graphics setups, an Optimus GPU can be shut down completely. Awakening the GPU from this state takes less than half a second, and that's the only waiting a user will have to do. The few seconds of screen flickering that accompanied each graphics switch with previous switchable implementations have been completely banished.

Optimus is able to avoid flickering because even when the discrete GPU is handling a graphics workload, the IGP is still being used as the display controller. In Optimus configs, the discrete GPU is only linked to the system over PCI Express; its display controller isn't connected to anything. To get frames generated on the discrete GPU onto a display, the contents of the GPU's frame buffer, which reside in dedicated video memory, are copied to the IGP frame buffer that sits in system RAM. This "mem2mem" transfer would typically be done by the 3D engine, according to Nvidia. Unfortunately, the engine would have to stall during the transfer to maintain coherency, a hiccup that could cost two to three frames of latency.
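To get a feel for how much data that frame buffer copy involves, here is a rough back-of-the-envelope sketch. The panel resolution, color depth, and refresh rate are my own assumptions for a typical notebook display, not figures from Nvidia, and the function names are purely illustrative.

```python
# Illustrative arithmetic only. The panel parameters below are assumptions,
# not numbers supplied by Nvidia.

def frame_bytes(width: int, height: int, bytes_per_pixel: int = 4) -> int:
    """Size of one uncompressed frame in bytes."""
    return width * height * bytes_per_pixel


def copy_rate_bytes_per_s(width: int, height: int, refresh_hz: int,
                          bytes_per_pixel: int = 4) -> int:
    """Sustained rate needed to copy every displayed frame to system RAM."""
    return frame_bytes(width, height, bytes_per_pixel) * refresh_hz


# Assumed notebook panel: 1366x768, 32-bit color, 60Hz refresh.
WIDTH, HEIGHT, REFRESH_HZ = 1366, 768, 60

per_frame = frame_bytes(WIDTH, HEIGHT)
per_second = copy_rate_bytes_per_s(WIDTH, HEIGHT, REFRESH_HZ)

print(f"{per_frame / 1e6:.1f} MB per frame, {per_second / 1e6:.0f} MB/s sustained")
# prints "4.2 MB per frame, 252 MB/s sustained"
```

Under those assumptions, each frame is only about 4MB, so the volume of data isn't the problem. What matters is which piece of hardware does the copying and whether rendering has to pause while it happens.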

Nvidia obviously doesn't want to contend with that sort of delay, so current GeForce 200- and 300-series notebook GPUs, plus future Fermi-based and "netbook" graphics chips, all feature a built-in copy engine. This engine uses dedicated logic to feed frames from the discrete GPU's frame buffer to system memory, and it can function without interrupting the graphics core. Nvidia claims the copious upstream bandwidth available with PCI Express is more than sufficient to allow the copy engine to perform those transfers with just two tenths of a frame of latency—whatever that means. I'm sure some hardcore gamers will insist that they can feel lag, but the effects were imperceptible to me.
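One plausible reading of "two tenths of a frame" is simply a fraction of the display's refresh interval. The sketch below converts that fraction to milliseconds and compares it with an idealized transfer time over the link. Everything here is an assumption on my part: a 60Hz panel, a 1366x768 frame at 32 bits per pixel, and a PCI Express 2.0 x16 link running at its theoretical peak of 500MB/s per lane in each direction. Nvidia didn't say how it arrived at its figure.

```python
# A hedged interpretation of the latency claim, not Nvidia's own definition.
# Assumptions: 60Hz panel, 1366x768 frame at 4 bytes per pixel, and a
# PCIe 2.0 x16 link at theoretical peak (500 MB/s per lane, per direction).

def frames_to_ms(frames: float, refresh_hz: float) -> float:
    """Convert a latency expressed in frames to milliseconds."""
    return frames * 1000.0 / refresh_hz


def ideal_copy_ms(num_bytes: int, link_bytes_per_s: float) -> float:
    """Time to move num_bytes over a link, ignoring all protocol overhead."""
    return num_bytes / link_bytes_per_s * 1000.0


REFRESH_HZ = 60                       # assumed refresh rate
FRAME_BYTES = 1366 * 768 * 4          # assumed frame size in bytes
PCIE2_X16_BYTES_PER_S = 16 * 500e6    # assumed link: 8 GB/s theoretical peak

claimed_ms = frames_to_ms(0.2, REFRESH_HZ)
ideal_ms = ideal_copy_ms(FRAME_BYTES, PCIE2_X16_BYTES_PER_S)

print(f"0.2 frames at {REFRESH_HZ}Hz = {claimed_ms:.2f} ms")
print(f"ideal transfer of one frame = {ideal_ms:.2f} ms")
```

By this reckoning, two tenths of a frame works out to a little over 3 ms at 60Hz, while the raw transfer of a single frame would take roughly half a millisecond at peak link speed. Real-world throughput will be lower than the theoretical peak, so treat the second number as a floor rather than a measurement.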

Optimus block diagram. Source: Nvidia

With an Optimus system's IGP serving as the display controller, there's no need for notebook makers to use multiplexers or to run two sets of display traces. In theory, Optimus motherboards should be cheaper to produce than boards for past hybrid schemes, because manufacturers won't have to pay for the muxes or find board real estate for them. According to Asus, cutting the number of onboard components and eliminating the heat that the muxes would normally produce will make it easier to squeeze Optimus into thin-and-light systems.