Nvidia’s big Kepler chip is finally an official product. We first got a peek at the GK110’s architecture way back in May, but products based on it hadn’t been formally introduced until this morning, when Nvidia pulled the curtains back on a pair of cards in its Tesla K20 lineup. As members of the compute-focused Tesla family, these cards are aimed at supercomputing installations and HPC clusters. Here’s a look at their basic specifications, which were the last real mystery remaining about them:
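| |Core clock (MHz)|Peak single-precision tflops|Peak double-precision tflops|Memory|Memory interface width|Memory bandwidth|TDP|
|---|---|---|---|---|---|---|---|
|Tesla K20|706|3.52|1.17|5GB|320-bit|208 GB/s|225W|
|Tesla K20X|732|3.95|1.31|6GB|384-bit|250 GB/s|235W|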
Nvidia declined to reveal exact pricing, since these cards will be sold through OEM partners like Cray, but you can expect them to cost a fair bit more than any GeForce.
The big dawg, the Tesla K20X, does in theory exceed one teraflops of peak double-precision performance, as Nvidia told us to expect at GTC this past spring. In fact, its clock rate and peak flops numbers are very close to our best guess from back then. One surprise is the fact that, if our math is correct, even the K20X has one of the GK110’s SMX units disabled, leaving 14 active, for a total of 2688 shader ALUs.
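For the curious, here is a rough sketch of the math behind that SMX estimate. It assumes each GK110 SMX carries 192 single-precision ALUs (which the 2688-over-14 figure implies) and 64 double-precision units, and that each unit retires one fused multiply-add, counted as two flops, per clock. The per-SMX unit counts and the function names are our assumptions for illustration, not figures from Nvidia's spec table.

```python
# Back-of-the-envelope peak-flops math for GK110-based Teslas.
# Assumptions (not taken from the spec table above):
SP_ALUS_PER_SMX = 192   # single-precision ALUs per SMX, implied by 2688 / 14
DP_UNITS_PER_SMX = 64   # double-precision units per SMX (assumed)
FLOPS_PER_CLOCK = 2     # one fused multiply-add counts as two flops


def peak_tflops(smx_count, units_per_smx, clock_mhz):
    """Theoretical peak throughput in teraflops."""
    flops = smx_count * units_per_smx * FLOPS_PER_CLOCK * clock_mhz * 1e6
    return flops / 1e12


def infer_smx_count(quoted_sp_tflops, clock_mhz, max_smx=15):
    """Pick the SMX count whose peak SP rate lands closest to the quoted figure."""
    return min(
        range(1, max_smx + 1),
        key=lambda n: abs(peak_tflops(n, SP_ALUS_PER_SMX, clock_mhz) - quoted_sp_tflops),
    )


# (name, core clock in MHz, quoted peak SP tflops) from the table above
for name, clock, quoted_sp in [("Tesla K20", 706, 3.52), ("Tesla K20X", 732, 3.95)]:
    smx = infer_smx_count(quoted_sp, clock)
    print(
        f"{name}: {smx} SMX, {smx * SP_ALUS_PER_SMX} ALUs, "
        f"{peak_tflops(smx, SP_ALUS_PER_SMX, clock):.2f} SP tflops, "
        f"{peak_tflops(smx, DP_UNITS_PER_SMX, clock):.2f} DP tflops"
    )
```

Under those assumptions, the quoted single-precision numbers fit 14 active SMX units on the K20X and 13 on the K20. The computed peaks land within a couple hundredths of a teraflops of Nvidia's figures, which suggests the quoted numbers are simply rounded.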
Like prior high-end Teslas, the K20 series supports ECC for external memory via an encoding scheme that occupies a portion of memory bandwidth for checksum storage. Nvidia claims this overhead has been roughly cut in half for Kepler versus the prior-gen Fermi chips, so that between 2% and 15% of memory bandwidth is occupied by ECC traffic.
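To put that overhead in perspective, here is a quick sketch of what it would leave for actual data, assuming the ECC traffic simply comes straight off the top of the peak bandwidth figures quoted above. That is a simplification on our part, and the helper name is hypothetical.

```python
# Usable memory bandwidth once ECC traffic is subtracted.
# Assumption: overhead is a flat fraction of the quoted peak bandwidth.
ECC_OVERHEAD_RANGE = (0.02, 0.15)  # Nvidia's claimed range for Kepler

# Peak memory bandwidth in GB/s, from the spec table
PEAK_BANDWIDTH = {"Tesla K20": 208.0, "Tesla K20X": 250.0}


def usable_bandwidth(raw_gbs, ecc_fraction):
    """Bandwidth left for data after ECC traffic takes its share."""
    return raw_gbs * (1.0 - ecc_fraction)


for name, raw in PEAK_BANDWIDTH.items():
    best = usable_bandwidth(raw, ECC_OVERHEAD_RANGE[0])
    worst = usable_bandwidth(raw, ECC_OVERHEAD_RANGE[1])
    print(f"{name}: {worst:.1f} to {best:.1f} GB/s usable with ECC enabled")
```

By this estimate, a K20X with ECC enabled would have somewhere between 212.5 and 245 GB/s left for payload data, depending on the access pattern.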
The K20X is strictly a server-focused product that will rely on its surroundings to generate airflow for cooling; it will not come with an onboard fan. The K20 will ship with an active fan, making it a candidate for use in workstations and non-OEM servers.
In very much related news, the latest version of supercomputing’s prestigious Top500 list of fastest machines was released this morning, and the Tesla K20-and-Opteron-based Titan supercomputer at Oak Ridge National Laboratory captured the top spot with 17.59 petaflops of sustained throughput in Linpack. Only the Sequoia system at Lawrence Livermore National Laboratory, based on IBM’s BlueGene/Q, comes close to matching Titan. The fourth-place entry on the list manages under half of Titan’s Linpack throughput, and fifth place is about a quarter.
Nvidia tells us Tesla K20-series cards will be shipping in volume this week via its OEM partners, with general availability to follow this month or next. In fact, the firm claims to have shipped a quantity of K20X cards capable of 30 petaflops in the past 30 days.
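Nvidia didn't say whether that 30-petaflops figure refers to single- or double-precision peak throughput, so here is a rough sketch of the card counts implied by each reading, using the K20X's quoted peak rates. The function name is ours, and the result is only an estimate.

```python
import math

# Quoted K20X peak rates in teraflops, from the spec table
K20X_PEAK_TFLOPS = {"single precision": 3.95, "double precision": 1.31}


def cards_for_throughput(total_pflops, per_card_tflops):
    """Smallest whole number of cards whose combined peak meets the total."""
    return math.ceil(total_pflops * 1000 / per_card_tflops)


for precision, tflops in K20X_PEAK_TFLOPS.items():
    count = cards_for_throughput(30, tflops)
    print(f"30 petaflops at {precision} peak: about {count:,} K20X cards")
```

If Nvidia means double-precision peak, that works out to roughly 23,000 cards in a month; if it means single precision, the number is closer to 7,600.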