I spoke recently with Ben de Waal, NVIDIA's Vice President of GPU software, and he revealed that NVIDIA plans to produce multithreaded ForceWare graphics drivers for its GeForce graphics products. Multithreading in the video driver should allow performance increases when running 3D games and applications on dual-core CPUs and multiprocessor PCs. De Waal estimated that systems with dual-core processors could see performance boosts of somewhere between 5% and 30% with these drivers.
Most imminent on the horizon right now is ForceWare release 75, which will bring a number of improvements for SLI performance and 64-bit Windows, among other things, but release 75 will not be multithreaded. The next major iteration of the driver, release 80, is slated to bring support for multiple threads. We may not see this version for a few months; NVIDIA hasn't given an exact timetable for the completion of release 80.
Out of curiosity, I asked de Waal why NVIDIA's drivers don't already take advantage of a second CPU. After all, the driver is a separate task from the application calling it, and Hyper-Threaded and SMP systems are rather common. He explained that drivers in Windows normally run synchronously with the applications making API calls, so the driver must return an answer before the API call can complete. On top of that, Windows drivers run in kernel mode, and the OS isn't particularly amenable to multithreaded drivers. NVIDIA has apparently been working on multithreaded drivers for some time now, and they've found a way to fudge around the OS limitations.
De Waal cited several opportunities for driver performance gains with multithreading. Among them: vertex processing. He noted that NVIDIA's drivers currently do load balancing for vertex processing, offloading some work to the CPU when the GPU is busy. This sort of vertex processing load could be spun off into a separate thread and processed in parallel.
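To make the idea concrete, here is a minimal sketch of spinning part of a vertex workload off to a second thread. It is purely illustrative and is not NVIDIA's code: the names `transform` and `process_frame`, the `gpu_busy` flag, and the `cpu_share` fraction are all assumptions made up for this example, and a simple scaling operation stands in for real vertex processing.

```python
from concurrent.futures import ThreadPoolExecutor

def transform(vertices, scale):
    """Stand-in for per-vertex work: scale each (x, y, z) tuple."""
    return [(x * scale, y * scale, z * scale) for x, y, z in vertices]

def process_frame(vertices, scale, gpu_busy, cpu_share=0.25):
    """Hypothetical load balancer: offload a share of the work when gpu_busy."""
    if not gpu_busy:
        # Nothing to offload; one path handles the whole batch.
        return transform(vertices, scale)
    split = int(len(vertices) * cpu_share)
    with ThreadPoolExecutor(max_workers=1) as pool:
        # The offloaded share is processed on a separate worker thread...
        offloaded = pool.submit(transform, vertices[:split], scale)
        # ...while the calling thread handles the remainder (the "GPU" portion).
        rest = transform(vertices[split:], scale)
        return offloaded.result() + rest
```

In Python the two halves won't truly run on separate cores for CPU-bound work because of the interpreter lock, so treat this as a picture of the structure rather than a benchmark. The point is that the offloaded share is independent of the rest, which is what makes it a natural candidate for its own thread.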
Some of the driver's other functions don't lend themselves so readily to parallel threading, so NVIDIA will use a combination of fully parallel threads and linear pipelining. We've seen the benefits of linear pipelining in our LAME audio encoding tests; this technique uses a simple buffering scheme to split work between two threads without creating the synchronization headaches of more parallel threading techniques.
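For readers unfamiliar with the pattern, linear pipelining can be sketched roughly as follows. This is a generic two-stage example under my own assumptions, not anything taken from NVIDIA's driver or from LAME: the stage functions, the trivial arithmetic they perform, and the buffer size are all hypothetical placeholders.

```python
import queue
import threading

SENTINEL = object()  # unique marker that signals the end of the stream

def stage_one(items, buffer):
    """First stage: do the first half of the work and pass each item on."""
    for item in items:
        buffer.put(item * 2)       # placeholder for real work
    buffer.put(SENTINEL)

def stage_two(buffer, results):
    """Second stage: finish each item in the order it arrives."""
    while True:
        item = buffer.get()
        if item is SENTINEL:
            break
        results.append(item + 1)   # placeholder for real work

def run_pipeline(items, buffer_size=4):
    # The bounded queue is the only state the two threads share.
    buffer = queue.Queue(maxsize=buffer_size)
    results = []
    first = threading.Thread(target=stage_one, args=(items, buffer))
    second = threading.Thread(target=stage_two, args=(buffer, results))
    first.start()
    second.start()
    first.join()
    second.join()
    return results
```

Calling `run_pipeline([1, 2, 3])` pushes each item through both stages in order. Because the threads only meet at the buffer, there are no locks to manage beyond what the queue does internally, which is the appeal of this approach over finer-grained parallelism.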
Despite the apparent gains offered by multithreading, de Waal expressed some skepticism about the prospects for thread-level parallelism for CPUs. He was concerned that multithreaded games could blunt the impact of multithreaded graphics drivers, among other things.