Khronos' President talks OpenCL, DX Compute Shader, and more

The present, the future, and the competition

We recently had the opportunity to speak with Neil Trevett, who serves as both the Khronos Group's President and Nvidia's VP of Embedded Content. Consumers might not hear the Khronos name too often, but the organization is responsible for setting and updating a number of key standards: among them OpenGL, OpenGL ES, and most recently, OpenCL.

It was that last standard we wanted to talk about. In December, Khronos completed the first version of OpenCL, and all major players in the graphics market—Intel, Nvidia, and AMD—ratified it.

Once all of those companies release compliant drivers, developers will be able to write apps that tap into the parallel computing resources of any compliant GPU from any vendor. That's a pretty major departure from older GPU compute application programming interfaces (APIs) like C for CUDA and Brook+, which are each tied to a particular vendor's hardware (Nvidia for the former, AMD for the latter).

To break the ice, we asked Mr. Trevett to update us on what's going on with OpenCL. Is Khronos doing anything new with the API? Here's what he told us:

As you know, OpenCL 1.0 was released back at Siggraph Asia last year, so actually it's only been around six months since the 1.0 specification was announced. You've probably seen the announcements that Apple made around their WWDC event. They're beginning to explain how the Snow Leopard OS is going to use OpenCL to unleash the power of the GPU for a wide range of applications inside Snow Leopard.

At Nvidia, we are the first GPU company to ship beta OpenCL drivers. Actually, now we're shipping fully conformant OpenCL drivers for our range of GPUs. So Nvidia is committed to timely shipment of . . . OpenCL implementations on our GPUs.

We are working, of course, on the next OpenCL specification. Because OpenCL is so new, we are in the mode of taking input from the developer community before we make any final decisions on what's going to be included in the next generation of OpenCL and the precise timing. We're not going to wait too long, but we do need to let the developer community kick the tires on OpenCL 1.0 before we head off with a next generation. That's going to happen over the next few months. Siggraph is a good opportunity to get interaction with the developer community.

Will we see many OpenCL-enabled consumer apps from major application vendors?

Yeah, absolutely. I think it's interesting; you can split the types of apps down to their individual categories. But I think as GPU compute becomes more widely available, I think over time you're gonna see these historical categories begin to break down. I think you're gonna see a very innovative ebb and flow between the different application categories, and see new types of applications emerge that weren't possible before they could tap into the parallel computing inside GPUs.

So right now, these traditional parallel computing communities are coming to OpenCL. We had the high-end [high-performance computing]—the labs and engineering departments—doing large compute projects. They're using OpenCL all the way down to consumer applications. The most obvious parallelization opportunity is of course with images and video. So I think you'll see a wide range of imaging applications plugging into the parallel GPU. You can see the beginnings of that with things like Photoshop that have traditionally used CUDA. You can see a wide range of imaging applications tapping into OpenCL; video even more so—different transcoding, video enhancements, quality enhancements, even image-recognition types of applications. So your videos will be auto-metadata-tagged eventually with image recognition algorithms running on the GPU.

I think [this is] the first wave of making supercomputing performance available on every desktop and laptop, and it's gonna take more than six months for the developer community to really get a feel of what's possible. And I think it's going to unleash a wave of innovation that we haven't seen before.

What about the short term? We've recently seen video transcoders from Elemental and Cyberlink that use GPU computing through proprietary APIs. Are those apps going to be ported to OpenCL? Will we see other players join in?

I shouldn't put words in vendors' mouths. There are a lot of vendors using CUDA today. Some of them might stick with CUDA; a large number of them I think will move to OpenCL so they can tap into GPU compute across a broad range of platforms. From Nvidia's point of view, we're happy for them to use CUDA or OpenCL; we're giving the choice to the application developers. It all taps down to the CUDA architecture running on our GPUs. So, it's just a choice of different programming techniques that we can offer to the developer community.

I think having a standard API that is portable across multiple vendors' silicon will grow the total market for applications that use GPU compute. I think it's a necessary evolutionary step to making parallel computation just pervasively available everywhere. Of course, it's gonna happen first on the desktop, but you might've noticed that OpenCL also has an embedded profile—OpenCL "ES" if you like—in the 1.0 specification. So, over the next few years, you're gonna see OpenCL embedded profiles used alongside OpenGL ES. So it's not just high-end servers and high-end desktops; it's gonna be laptops, netbooks, and mobile devices over the next few years that tap into parallel computation.

So, we'll see OpenCL in cell phones. Would that involve, say, the graphics portion of a device's system-on-a-chip?

Yeah. It's not here today, it's definitely— We're preparing for the future here, but I think it is inevitable. You can look at the evolution of mobile graphics silicon. It is tracking the desktop silicon, so at some point in the not-too-distant future, the GPUs will be programmable enough to support CUDA or OpenCL programmability. And that's going to enable another wave of innovation, having the power of a supercomputer in the palm of your hand in a device that has multiple sensors, such as video and still cameras, and will be always connected. [It] is going to enable new classes of applications that you haven't seen before.

To sum up, OpenCL use may grow slowly at first, and initial applications might not necessarily be groundbreaking. As developers get acquainted with the API and Khronos keeps improving it, though, Trevett thinks we can look forward to exciting new things (like automatic metadata-tagging of videos) and a spread into the world of handheld devices.

That's all well and good, but OpenCL isn't the only API in town. We just mentioned C for CUDA and Brook+, and Microsoft is also cooking up DirectX 11 Compute Shader—a vendor-independent API that also promises GPU computing for all. At Computex in June, AMD and Nvidia both demonstrated an automatic, profile-based video transcoding feature in Windows 7 that used DirectX Compute Shader. Let's find out what Khronos thinks about all of these APIs.