Switchable graphics may not be one of the most exciting new technologies to hit the notebook space in the last few years, but for PC enthusiasts, I think it’s easily one of the most important. Enthusiasts are a notoriously demanding lot, you see. We crave performance, and most of us are gamers who simply can’t get by with a weak-sauce integrated graphics processor, especially if it’s an Intel Graphics Media Accelerator. At the same time, we want our notebooks to be thin and light and offer exceptional battery life, requirements that generally favor integrated graphics solutions, and GMAs in particular. Following tech writing’s tradition of lazy automotive analogies, that’s sort of like asking for the power of at least a turbo-charged V6 inside a car that offers Prius-like fuel economy.
Interestingly, the solution to this dilemma resides within the Prius itself: specifically, the hybrid nature of its drivetrain, which combats noxious emissions with a 60kW electric motor complemented by a 1.8-liter, four-cylinder gasoline engine. When cruising around town at relatively low speeds, the Prius gets by on its clean-running electric motor. Bury the gas pedal, and the secondary engine springs into action, armed with enough horsepower to bring Toyota’s eco-wagon up to highway speeds.
Emissions aren’t an issue for notebooks, but the hybrid approach can be applied to conserve battery life. Intel’s Graphics Media Accelerators are plenty capable of handling basic desktop tasks, web surfing, and even video playback, all while sipping battery power sparingly. They’re the perfect engines for puttering around town. A considerably more potent discrete GPU must roar to life when users demand 3D performance, though. There are plenty of discrete GPUs from which to choose, starting with modest solutions more akin to that turbo-charged V6 and reaching all the way up to obscenely powerful feats of engineering like the thousand-horsepower W16 that rumbles inside the Bugatti Veyron. These discrete GPUs may draw considerably more power than an Intel IGP, but if they’re only called upon when needed, battery life will only suffer when there’s good reason.
The first stab at hybrid notebook graphics came in the form of Sony’s Vaio SZ-110B, whose primary graphics processor could be controlled using a hardware switch just above the keyboard. A reboot was required to complete the switch from the system’s integrated GMA 950 to the discrete GeForce Go 7400, so graphics horsepower wasn’t exactly available on demand. Still, the Vaio provided the market with its first taste of switchable graphics, and with proof that good battery life and competent graphics performance could coexist in a thin-and-light notebook.
Fortunately, switchable graphics’ second coming proved far more common and easier to use. Rather than relying on physical switches, recent implementations are capable of changing the primary graphics adapter via software. Rebooting isn’t required when switching from discrete to integrated or vice versa, although you will have to endure a few seconds of screen flickering as display duties are migrated from one adapter to the other. You’ll also have to close any so-called blocking applications that are tying up the graphics adapter with DirectX calls.
This contemporary switchable setup comes much closer to delivering power on demand, but only after a pause and with a few strings attached. One of those strings: the user must know that reserve power is lying in wait. Amazingly, Nvidia claims, a lot of folks who buy switchable notebooks don’t have a clue. Among those that do, few can be bothered to switch manually. Nvidia tells us it conducted a survey of 10,000 owners of notebooks sporting switchable graphics and found that only 1% actively switched back and forth between graphics adapters. Nvidia points out that those surveyed were predominantly mainstream users rather than PC enthusiasts, so it seems likely that few had any exposure to switchable graphics outside of a likely misleading sales pitch from a pimply teenager at their local Best Buy. Nevertheless, the fact remains that current switchable graphics implementations aren’t truly seamless. Ideally, a hybrid graphics subsystem should deliver power on demand automatically and without pause or restriction, which is what Nvidia claims it’s achieved with its next generation of switchable graphics, dubbed Optimus.
Switchable graphics today
To understand what makes Optimus unique, we have to dig a little deeper into how current switchable graphics implementations work. On the hardware front, such systems are equipped with an integrated graphics processor (or IGP) in the chipset or CPU, along with a discrete GPU. The GPU hooks into the system via PCI Express, and it also must be connected to the display outputs, which are shared with the IGP. Sharing is facilitated by high-performance hardware multiplexers, otherwise known as muxes, that feature inputs for each graphics adapter, a single output, and a control line that tells the multiplexer which input to pass through to the display.
According to Nvidia, a minimum of two muxes are required to connect all the necessary lines for each display output. With the average switchable graphics notebook featuring three video outs (the LVDS LCD interface, an HDMI output, and an old-school VGA port), that’s at least six multiplexers, plus all the extra traces required to connect the auxiliary GPU.
While mux control lines can be activated by software, the act of switching graphics adapters on the fly requires a considerable amount of driver cooperation, especially since this approach was conceived at a time when Microsoft’s reigning operating system, Windows Vista, only played nicely with one graphics adapter at a time. To get switchable graphics working in this environment, Nvidia had to create an “uber” display driver featuring an interposer sitting between the operating system and the Nvidia and Intel display drivers. Vista communicates with this uber driver via standard APIs, but a custom API jointly developed by Nvidia and Intel is used to interface the GMA driver with Nvidia’s interposer. The level of coordination required for this setup was challenging, Nvidia asserts, and the need to ensure compatibility apparently slowed driver updates.
Introducing the Optimus routing layer
Thankfully, Windows 7 is much more accommodating than Vista. Microsoft’s latest OS supports multiple graphics adapters, allowing independent Nvidia and Intel display drivers to coexist peacefully on the same system. These drivers don’t talk to each other, so there’s no need for multi-vendor cooperation, interposers, or custom APIs. In fact, with Windows 7, switching graphics doesn’t even require the user to touch a switch; one is built into what Nvidia calls its Optimus routing layer.
This routing layer includes a kernel-level library that maintains associations between certain classes and objects and a corresponding graphics adapter. Graphics workloads that are deemed GPU-worthy are routed to the discrete graphics processor, while simpler tasks are sent to the IGP. The routing layer can process workloads from multiple client applications, juggling the graphics load between the integrated and discrete GPUs.
Nvidia says the Optimus routing layer keys in on three classes of workloads: general-purpose computing apps, video playback, and games. When an application explicitly tries to leverage GPU computing horsepower using a CUDA or OpenCL call, the Optimus routing layer directs the workload to the discrete GPU. The DirectX Video Acceleration (DXVA) calls used by many video playback applications and even the latest Flash beta are also detected automatically and directed accordingly. For games, the routing layer is capable of keying in on DirectX and OpenGL graphics calls.
Not all games that use those two APIs require the power of a discrete GPU, though. Even Solitaire makes DirectX calls, and it’s more than happy running on an antiquated GMA. Rather than assuming that all DirectX or OpenGL calls need to be offloaded to a discrete GPU, Optimus references a profile library to determine whether a game requires extra grunt.
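Nvidia hasn’t published the routing layer’s internals, but its decision logic can be sketched as a simple dispatcher: compute and video calls always wake the discrete GPU, while 3D graphics calls are checked against the profile library first. Here’s a minimal Python model of that behavior; the function, category names, and profile table are illustrative assumptions, not Nvidia’s actual code:

```python
# Conceptual model of Optimus-style workload routing. The three
# categories mirror the classes Nvidia describes: GPU compute,
# video playback, and 3D graphics. The real routing layer lives in
# the driver; this is only a sketch of the decision logic.

COMPUTE_APIS = {"cuda", "opencl"}    # always offloaded to the GPU
VIDEO_APIS = {"dxva"}                # likewise offloaded automatically
GRAPHICS_APIS = {"directx", "opengl"}  # routed via game profiles


def route_workload(api, app_name, profiles):
    """Return 'discrete' or 'integrated' for a given API call."""
    api = api.lower()
    if api in COMPUTE_APIS or api in VIDEO_APIS:
        return "discrete"
    if api in GRAPHICS_APIS:
        # Games are looked up in the profile library; anything
        # unprofiled (Solitaire, say) stays on the IGP.
        return profiles.get(app_name, "integrated")
    return "integrated"  # ordinary desktop work stays on the IGP


profiles = {"callofduty4.exe": "discrete"}
print(route_workload("cuda", "badaboom.exe", profiles))      # discrete
print(route_workload("directx", "solitaire.exe", profiles))  # integrated
```

The key asymmetry in this sketch matches Optimus’ observed behavior: compute and DXVA calls are offloaded unconditionally, while 3D calls default to the IGP unless a profile says otherwise.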
Nvidia already uses profiles to govern the behavior of its SLI multi-GPU teaming scheme, but Optimus is a little more advanced. The profiles are stored in encrypted XML files outside the graphics driver, and users have the option of letting Nvidia push out updates to those profiles automatically. Nvidia has apparently been working on the back end for this profile updating system for quite a while now, and it’s likely to include other kinds of profiles in the future. (SLI, anyone?)
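The actual profile files are encrypted and their schema isn’t public, but conceptually each one is just a per-application record mapping an executable to a preferred graphics adapter. A hypothetical, unencrypted equivalent might be parsed like this; the XML layout and field names below are invented for illustration:

```python
import xml.etree.ElementTree as ET

# Invented XML layout standing in for Nvidia's encrypted profile
# format, which is not publicly documented.
PROFILE_XML = """
<profiles>
  <application exe="callofduty4.exe" gpu="discrete"/>
  <application exe="darwinia.exe" gpu="discrete"/>
  <application exe="solitaire.exe" gpu="integrated"/>
</profiles>
"""


def load_profiles(xml_text):
    """Build an exe -> adapter lookup table from profile XML."""
    root = ET.fromstring(xml_text)
    return {app.get("exe"): app.get("gpu")
            for app in root.findall("application")}


profiles = load_profiles(PROFILE_XML)
print(profiles["darwinia.exe"])  # discrete
```

Keeping the profiles in data files outside the driver is what makes Nvidia’s push-update scheme possible: new game entries can be distributed without shipping a whole new driver.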
The tinfoil hat crowd will be pleased to note that users are free to disable Optimus’ automatic profile updates. Those who would rather manage graphics preferences themselves can create and modify profiles via the driver control panel, through which any application can be configured to run on the IGP or the discrete GPU. Users can even right-click on an application and manually select which graphics processor it’ll use.
Allowing users to manage their own Optimus profiles is definitely the right thing to do. That said, one element of the profile management scheme could use a little extra work. At present, there’s no way to override the Optimus routing layer’s desire to process DXVA calls on a system’s discrete GPU. One can configure the profile for a video playback application or web browser to use only the integrated graphics processor, but as soon as HD or Flash video playback begins, the GPU persistently takes over. Nvidia is looking at letting users override Optimus’ preferences for video playback acceleration, and I hope that capability is exposed to users soon. GeForce GPUs may do an excellent job of accelerating video playback, but some Intel IGPs share the same functionality and probably consume less power. At least with the old switchable graphics approach, users could turn off the discrete GPU and be confident that it was going to stay off.
Making it work in hardware
While it may not be possible to disable an Optimus system’s discrete GPU completely, the scheme’s routing layer otherwise offers automatic power on demand. Intelligently activating the discrete GPU is a markedly different approach from requiring users to manage their systems’ switchable graphics state, one that should allow mainstream users to reap the benefits of a switchable GPU without even knowing they have one.
To make Optimus switching truly seamless, additional modifications were necessary on the hardware front. About the only thing that hasn’t changed here is the no-power state that the discrete GPU slips into when it’s not in use. As with earlier switchable graphics setups, an Optimus GPU can be shut down completely. Awakening the GPU from this state takes less than half a second, and that’s the only waiting a user will have to do; the few seconds of screen flickering that accompanied each graphics switch with previous switchable implementations have been completely banished.
Optimus is able to avoid flickering because even when the discrete GPU is handling a graphics workload, the IGP is still being used as the display controller. In Optimus configs, the discrete GPU is only linked to the system over PCI Express; its display controller isn’t connected to anything. To get frames generated on the discrete GPU onto a display, the contents of the GPU’s frame buffer, which reside in dedicated video memory, are copied to the IGP frame buffer that sits in system RAM. This “mem2mem” transfer would typically be done by the 3D engine, according to Nvidia. Unfortunately, the engine would have to stall during the transfer to maintain coherency, a hiccup that could cost two to three frames of latency.
Nvidia obviously doesn’t want to contend with that sort of delay, so current GeForce 200- and 300-series notebook GPUs, plus future Fermi-based and “netbook” graphics chips, all feature a built-in copy engine. This engine uses dedicated logic to feed frames from the discrete GPU’s frame buffer to system memory, and it can function without interrupting the graphics core. Nvidia claims the copious upstream bandwidth available with PCI Express is more than sufficient to allow the copy engine to perform those transfers with just two tenths of a frame of latency, whatever that means. I’m sure some hardcore gamers will insist that they can feel lag, but the effects were imperceptible to me.
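Nvidia’s fraction-of-a-frame figure is at least plausible on a back-of-the-envelope basis. A 1366×768 frame at 32 bits per pixel is roughly 4MB, and even a couple of gigabytes per second of usable PCIe upstream bandwidth moves that in about two milliseconds, a small slice of the 16.7 ms a 60Hz display spends on each frame. The bandwidth and refresh-rate numbers below are illustrative assumptions, not Nvidia’s specifications:

```python
# Back-of-the-envelope check on the copy engine's transfer latency.
# Assumptions (illustrative only): 1366x768 display, 4 bytes per
# pixel, 60 Hz refresh, ~2 GB/s of usable PCIe upstream bandwidth.

frame_bytes = 1366 * 768 * 4   # ~4.2 MB per frame
pcie_bytes_per_s = 2e9         # assumed usable upstream bandwidth
frame_time_s = 1 / 60          # 16.7 ms per frame at 60 Hz

transfer_s = frame_bytes / pcie_bytes_per_s
fraction_of_frame = transfer_s / frame_time_s

print(f"transfer: {transfer_s * 1000:.2f} ms")        # ~2.10 ms
print(f"fraction of a frame: {fraction_of_frame:.2f}")  # ~0.13
```

Under those assumptions the copy costs a bit over a tenth of a frame, the same order of magnitude as Nvidia’s claim, so the imperceptible lag I experienced isn’t surprising.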
With an Optimus system’s IGP serving as the display controller, there’s no need for notebook makers to use multiplexers or to run two sets of display traces. In theory, Optimus motherboards should be cheaper to produce than boards for past hybrid schemes, because manufacturers won’t have to pay for the muxes or find board real estate for them. According to Asus, reducing the number of onboard components, and the heat the muxes would normally produce, will make it easier to squeeze Optimus into thin-and-light systems.
Optimus in the flesh: Asus’ UL50Vf
Speaking of Asus, the first implementation of Nvidia’s new Optimus tech to arrive in our labs is the company’s UL50Vf. A member of Asus’ UnLimited notebook line, the UL50Vf’s overall design is quite similar to the UL30A and UL80Vt we reviewed last year. Don’t let the recycled exterior fool you, though: the UL50Vf is a full-fledged Optimus implementation, albeit one that uses system and graphics hardware that’s been around for a while.
Based on Intel’s Consumer Ultra Low Voltage (CULV) platform, the UL50Vf sports a Core 2 Duo SU7300 processor running at 1.3GHz alongside an Intel GS45 Express chipset with GMA 4500MHD integrated graphics. The integrated GPU is backed by an Nvidia GeForce G210M with 512MB of dedicated video RAM. Before you get too excited, note that the G210M is the slowest Optimus-capable mobile GPU that Nvidia currently makes. The UL50Vf is quite an affordable system, though. The configuration we tested is slated to sell for $849 when it becomes available in March.
Despite its relatively low price tag, the UL50Vf very much looks and feels like a high-end laptop. The system’s top panel is beautifully finished with brushed black aluminum that won’t attract smudges like the glossy plastics used elsewhere in the chassis. The UL50Vf is reasonably thin and light, weighing in at a little over five pounds and measuring about an inch thick. That includes an optical drive, by the way, plus 4GB of RAM and a Seagate Momentus 5400.6 320GB hard drive.
While arguably a thin-and-light notebook, the UL50Vf actually has quite a large footprint due to its use of a 15.6″ widescreen display. The 16:9 panel features a 1366×768 resolution and acceptable picture quality, but it’s only an average screen overall, which is becoming more and more common among laptops in this price range.
At least Asus has made good use of the extra space provided by the UL50Vf’s larger chassis. The system features a full-size keyboard complete with an inverted-T directional pad and a full numpad. Asus had to shave a couple of millimeters off the width of the numpad keys, but that seems to have been the only concession required.
I quite like the feel of the keyboard overall. There’s enough tactile feedback for speedy touch typing, and while some flex is visible, the keyboard doesn’t feel mushy or vague. However, I’m not as crazy about the dimpled finish on the touchpad. This surface treatment may make it easy to tell when your fingers are on the touchpad, but I find that the indentations actually impede smooth tracking, at least with my fingertips.
Keeping the UL50Vf running is an eight-cell battery rated for 84Wh. The high-capacity cell makes perfect sense for an Optimus system, and it’s hardly a one-off for this particular model. Much of Asus’ UnLimited line is powered by batteries with similar watt-hour ratings.
You may have to wait until March to get your hands on a UL50Vf, but Asus has a handful of other Optimus designs due out this month. Larger 16″ N61 and 17″ N71 systems based on Intel’s latest Arrandale mobile CPUs are coming on February 11 and 18, respectively. Closer to the end of the month, Asus’ Optimus lineup will expand to include the 13.3″ U30Jc, which will feature an Arrandale CPU alongside a GeForce G310M GPU. Additional Optimus designs employing Clarksdale and CULV Core 2 processors are also set to be released in March.
Asus won’t be the only firm launching Optimus systems, either. Nvidia says it expects more than 50 Optimus-equipped notebooks to be available by this summer.
Optimus in the real world
The big question now, of course, is whether Optimus works. I’ve been playing with a UL50Vf for about a week, and thus far, I’d have to say yes. Nvidia equipped this system with a handy little desktop widget that displays whether the discrete GPU is currently in use. According to that applet, and the general performance I’ve experienced, the Optimus routing layer does a good job of intelligently activating the system’s GeForce GPU when it detects a CUDA app, HD or Flash video playback, or a game. I haven’t been able to detect the fraction-of-a-frame latency associated with Optimus’ frame-buffer copy, nor have I noticed having to wait for the discrete GPU to be awakened from its dormant state.
Nowhere is the graphics power of a discrete GPU needed more than in games, so that’s where I began my testing. Darwinia was the first one I fired up, and as luck would have it, Optimus didn’t recognize the game’s 3D engine. Fair enough. Darwinia may be critically acclaimed, but it’s hardly a mainstream or popular title. Plus, the game actually runs pretty well on the GMA 4500MHD as long as you turn down the pixel shader effects.
Since I prefer to crank the eye candy wherever possible, I opted to create my own Optimus profile for Darwinia. That took all of a few seconds, after which I was in the game under GeForce power. Even with all the in-game detail levels cranked at the notebook’s 1366×768 native resolution, the G210M ran Darwinia at a solid 60 frames per second according to FRAPS. The GMA can manage fluid gameplay at this resolution, too, but only with pixel shader effects turned down and then only at 30 FPS.
The GMA 4500MHD can’t handle Call of Duty 4 at 1366×768, even with the lowest in-game detail levels. Heck, the GMA HD inside next-gen Arrandale CPUs struggles, too. However, with Optimus automatically enabling the system’s discrete GPU, FRAPS reported frame rates between 20 and 40 FPS with all the eye candy turned on at the native display resolution. Enabling anisotropic filtering didn’t slow performance much, either.
Borderlands is quite a bit more demanding than the first Modern Warfare and a complete waste of time if you’re stuck with Intel integrated graphics. The UL50Vf’s lowly GeForce G210M had a difficult time maintaining smooth frame rates, too. At 1366×768, we had to run the game at medium detail levels with only a few effects turned on. Applying 16X aniso didn’t slow performance much, but FRAPS still spent most of its time displaying frame rates in the low twenties. Dropping the resolution to 640×480 allowed us to enable all of Borderlands‘ pseudo-cell-shaded visual goodness and get frame rates up to the low thirties.
One of the better driving games to hit the PC in recent years, Need for Speed: Shift is yet another console port that’s a little too demanding for low-end GPUs, let alone integrated graphics processors. Running at our system’s native resolution, with the lowest in-game detail levels, the G210M could only muster about 20 FPS. Scaling the display resolution down to 640×480 didn’t improve frame rates much, either. The game really isn’t playable at even that visibly blocky resolution, and worse, it looks positively ugly with all the details turned down.
Keep in mind, of course, that the GeForce G210M is the weakest GPU to support Optimus. Hopefully, notebook makers will use Optimus on systems with considerably more potent graphics processors, as well.
Video playback tests have become a staple of our notebook reviews, so we fired up a collection of local and streaming videos to see how Optimus would fare. I’ve compiled the results of our tests in a handy little chart below that details the approximate CPU utilization gleaned from Task Manager, a subjective assessment of playback quality, and whether the discrete GPU was active during playback. Windows Media Player was used for local video playback, while Firefox and the Flash 10 beta were used with streaming content.
| | CPU utilization | Result | Discrete GPU |
| --- | --- | --- | --- |
| Star Trek QuickTime 480p | 0-2% | Perfect | On |
| Star Trek QuickTime 720p | 2-8% | Perfect | On |
| DivX PAL SD | 3-12% | Perfect | Off |
| 720p YouTube HD windowed | 9-22% | Perfect | On |
| YouTube SD windowed | 9-19% | Perfect | On |
Optimus activated the discrete GPU for all but our SD video playback test. Each clip played back perfectly smoothly, and CPU utilization never got much higher than 20%, even with YouTube HD streaming a Star Trek trailer at 720p.
Those who keep multiple Flash video tabs open at once should keep in mind that even when a YouTube video has finished playing, Optimus keeps the GPU enabled. Only when the tab is closed or you browse to a page that doesn’t include Flash video is power cut to the discrete GPU. The same applies to HD video playback, at least with the latest version of Windows Media Player built into Win7. As long as WMP has an HD video open, the discrete GPU remains active. Pausing or even stopping the video has no effect. Drag and drop a standard-definition clip into the app, however, and the GPU is shut down immediately.
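The behavior described above maps neatly onto simple reference counting: the GPU powers down only when the last client releases it, and a finished or paused video still counts as a client as long as its tab or file stays open. Here’s a toy model of that bookkeeping; the class and its interface are purely illustrative, since the real logic lives inside Nvidia’s driver:

```python
# Toy reference-count model of when Optimus cuts power to the
# discrete GPU. The GPU stays on as long as any client holds an
# active context -- matching the observed behavior, where a
# finished YouTube clip or a paused HD video keeps the GPU awake
# until the tab or file is actually closed.

class GpuPowerModel:
    def __init__(self):
        self.clients = set()

    def open_context(self, client):
        self.clients.add(client)

    def close_context(self, client):
        self.clients.discard(client)

    @property
    def gpu_on(self):
        return bool(self.clients)


gpu = GpuPowerModel()
gpu.open_context("youtube_tab")
gpu.open_context("wmp_hd_clip")
gpu.close_context("youtube_tab")  # tab closed after the video ended
print(gpu.gpu_on)                 # True: WMP still has an HD clip open
gpu.close_context("wmp_hd_clip")
print(gpu.gpu_on)                 # False: last context released
```

Viewed this way, the complaint in the next paragraph is that pausing or stopping playback doesn’t release the context; only closing the file or tab does.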
I asked Nvidia whether Optimus could be made to shut down the discrete GPU if video was stopped or paused, and the company said that would be possible with cooperation from software vendors. Nvidia might also want to look more closely at how CUDA and other general-purpose computing applications interact with Optimus. The Badaboom video transcoding application that Nvidia has used as a CUDA poster child activated the discrete GPU in our Optimus test system before we even had a chance to load a video to transcode. We left Badaboom sitting there with nothing loaded and no transcoding taking place for more than half an hour, and Nvidia’s Optimus widget still showed the discrete GPU as enabled.
Switchable graphics solutions are designed to extend run times by only using a discrete GPU when needed, so battery life tests are certainly in order. Unfortunately, without a directly comparable notebook that doesn’t include Optimus, we can’t properly isolate Nvidia’s new switchable graphics scheme and provide you with a direct comparison.
We can, however, look at the sort of battery life you can expect from the UL50Vf when the discrete GPU isn’t active. Neither of our battery life tests require GeForce horsepower, so what you’re looking at below are run times with Optimus keeping the G210M turned off.
Each system’s battery was run down completely and recharged before each of our battery life tests. We used a 50% brightness setting for the Timeline and UL50Vf, which is easily readable in normal indoor lighting and is the setting we’d be most likely to use ourselves. That setting is roughly equivalent to the 40% brightness level on the K42F, UL80Vt, and Studio 14z, which is what we used for those configurations.
For our web surfing test, we opened a Firefox window with two tabs: one for TR and another for Shacknews. These tabs were set to reload automatically every 30 seconds over Wi-Fi, and we left Bluetooth enabled as well. Our second battery life test involves movie playback. Here, we looped a standard-definition video of the sort one might download off BitTorrent, using Windows Media Player for playback. We disabled Wi-Fi and Bluetooth for this test.
Although the UL50Vf can’t quite top the battery life offered by the underclocked battery-saving configuration of the UL80Vt, the Optimus system offers longer run times than the rest of the field. Nearly eight hours of real-world Wi-Fi web surfing is quite impressive for a budget thin-and-light notebook. Close to six hours of standard-definition video playback capacity is good enough for a few feature-length movies, too.
External operating temperatures
When Optimus shuts down a discrete GPU to preserve battery life, system temperatures are also reduced. To get a sense of just how cool an Optimus-based system can run, we probed the UL50Vf’s external operating temperatures with an IR thermometer placed 1″ from the surface of the system. Tests were conducted after the system had run our web surfing battery life test for a couple of hours.
The UL50Vf has slightly lower operating temperatures than the UL80Vt, which features comparable hardware but an old-school switchable graphics implementation.
Optimus is best thought of as a seamless and smarter switchable graphics solution. The switchable concept was sound long before Optimus came along, but even with recent implementations, users had to contend with brief delays, seconds of screen flickering, and open applications preventing the system from making a switch. And apparently, many users have no clue how switchable graphics works or can’t be bothered to fuss with it.
Optimus nicely solves these problems. The routing layer avoids blocking applications and intelligently determines when to activate the discrete GPU and when to let it sit dormant. With the discrete GPU using the IGP as a display processor, there’s no flickering or delay when the GPU does perk up.
Using profiles to govern Optimus’ reaction to games makes a lot of sense, especially if Nvidia’s new profile distribution scheme works as advertised. Making those profiles easy for users to modify is even better, particularly for enthusiasts who are prone to fiddle with such things. I do hope Nvidia will allow profile preferences to override Optimus’ propensity to offload video playback onto the GPU, though, at least long enough for us to do some battery life tests to determine whether a discrete GPU or IGP is preferable for HD and Flash video playback. Optimus could also use a little extra intelligence to determine when CUDA and OpenCL applications are actively using the GPU, rather than sitting idle.
Even with those minor issues, Optimus is by far the best switchable graphics solution I’ve seen. Nvidia may be relegating the GPU to duty as a coprocessor, but within the confines of a notebook, that’s probably exactly what it should be. In fact, Nvidia has even conceded that it’s possible to save some die area by stripping the display controller completely out of its mobile GPUs. With Intel and AMD moving toward integrating GPUs into every CPU, that sort of approach may be the only way to get GeForce graphics into mainstream notebooks.
Optimus shouldn’t just be confined to low-rent GPUs and affordable systems, though. I see no reason why more expensive mobile gaming systems shouldn’t use this switchable graphics tech to combine high-performance mobile GPUs capable of running the latest games with integrated graphics that can surf the web all day long on battery power. Now that’s my kind of Prius.