Good news, GPU computing developers. Nvidia has announced that a release candidate of its CUDA 4.0 developer toolkit will be available later today. The new toolkit will include several interesting additions and, in Nvidia's words, "was designed to make parallel programming easier."
Major new features will include:
- NVIDIA GPUDirect™ 2.0 Technology -- Offers support for peer-to-peer communication among GPUs within a single server or workstation. This makes multi-GPU programming easier and can speed up multi-GPU applications.
- Unified Virtual Addressing (UVA) -- Provides a single merged-memory address space for the main system memory and the GPU memories, enabling quicker and easier parallel programming.
- Thrust C++ Template Performance Primitives Libraries -- Provide a collection of powerful open source C++ parallel algorithms and data structures that ease programming for C++ developers. According to Nvidia, Thrust routines such as parallel sorting are 5X to 100X faster than their Standard Template Library (STL) and Threading Building Blocks (TBB) counterparts.
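For a rough idea of what these features look like in code, here is a minimal sketch, not taken from Nvidia's announcement. It assumes a CUDA toolkit with Thrust headers and a 64-bit system. The helper names `sort_on_gpu` and `peer_copy_demo` are made up for illustration, and the peer-to-peer part assumes two GPUs that can reach each other's memory; it is skipped otherwise. Error handling is kept to a minimum.

```cuda
#include <cstdio>
#include <cstdlib>
#include <vector>
#include <cuda_runtime.h>
#include <thrust/copy.h>
#include <thrust/device_vector.h>
#include <thrust/sort.h>

// Hypothetical helper: sort integers on the GPU with Thrust.
std::vector<int> sort_on_gpu(const std::vector<int>& input) {
    // Constructing a device_vector from host iterators copies host -> device.
    thrust::device_vector<int> d(input.begin(), input.end());
    thrust::sort(d.begin(), d.end());               // parallel sort on the device
    std::vector<int> out(d.size());
    thrust::copy(d.begin(), d.end(), out.begin());  // device -> host
    return out;
}

// Hypothetical helper: copy n floats from GPU 0 to GPU 1.
// Returns false when fewer than two peer-capable GPUs are present.
bool peer_copy_demo(size_t n) {
    int count = 0;
    if (cudaGetDeviceCount(&count) != cudaSuccess || count < 2) return false;

    int can_access = 0;
    cudaDeviceCanAccessPeer(&can_access, 1, 0);     // can device 1 read device 0?
    if (!can_access) return false;

    float* src = 0;
    float* dst = 0;
    cudaSetDevice(0);
    if (cudaMalloc((void**)&src, n * sizeof(float)) != cudaSuccess) return false;
    cudaMemset(src, 0, n * sizeof(float));

    cudaSetDevice(1);
    if (cudaMalloc((void**)&dst, n * sizeof(float)) != cudaSuccess) {
        cudaSetDevice(0);
        cudaFree(src);
        return false;
    }
    cudaDeviceEnablePeerAccess(0, 0);               // let device 1 access device 0

    // With unified virtual addressing, the runtime works out where each
    // pointer lives, so cudaMemcpyDefault replaces an explicit direction.
    cudaError_t err = cudaMemcpy(dst, src, n * sizeof(float), cudaMemcpyDefault);

    cudaFree(dst);
    cudaSetDevice(0);
    cudaFree(src);
    return err == cudaSuccess;
}

int main() {
    std::vector<int> data(1 << 20);
    for (size_t i = 0; i < data.size(); ++i) data[i] = std::rand();

    std::vector<int> sorted = sort_on_gpu(data);
    bool ok = true;
    for (size_t i = 1; i < sorted.size(); ++i)
        if (sorted[i - 1] > sorted[i]) { ok = false; break; }
    std::printf("Thrust sort: %s\n", ok ? "sorted" : "NOT sorted");

    std::printf("Peer-to-peer copy: %s\n",
                peer_copy_demo(1 << 20) ? "ok" : "skipped or failed");
    return ok ? 0 : 1;
}
```

Saved as, say, `cuda4_sketch.cu`, this should build with `nvcc cuda4_sketch.cu -o cuda4_sketch`. The Thrust half reads much like ordinary STL code, which seems to be the point of the library. The peer-copy half shows how little bookkeeping is left once the two GPUs share an address space.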
I'm no GPU computing expert, but it sounds like these changes could bring about some nice performance boosts to CUDA apps. There are more additions in store, too, as you'll see in the official announcement.
The CUDA 4.0 RC toolkit should be available to members of Nvidia's CUDA Registered Developer Program from the Nvidia Developer Zone site. If you haven't joined the developer program yet, you can do so here.