Intel’s high-performance 3D Xpoint (say crosspoint) memory technology promised far-reaching changes for storage, memory, and computer architecture when the company first took the wraps off the technology in mid-2015. Of course, big promises begot high expectations, and despite a hunger for more information, we’ve heard relatively little in the intervening time about how 3D Xpoint will perform. That all changes today. 3D Xpoint will power Intel’s first Optane device, the SSD DC P4800X, and that device demonstrates the wide-ranging potential of Optane to change how we think about the balance between memory and storage in a system.
Optane from the ground up
The SSD DC P4800X launching today packs 375GB of 3D Xpoint storage on a PCIe add-in card form factor. Intel is keeping the physical details of its Xpoint media under wraps, but we do know some of the basics of the stuff now. The first Xpoint dies are being fabricated on a 20-nanometer process and can hold 128 Gbits each. Intel says that because the Xpoint medium itself relies on low-resistance and low-capacitance interconnects in the back end of the device, it can switch a thousand times faster than NAND. Consequently, Xpoint delivers much lower access times and much higher performance.
3D Xpoint also offers much finer access granularity than NAND. Instead of working on pages with many kilobytes of data as NAND does, Intel says the structure of the Xpoint device allows for bit-level access. Practically, that means Xpoint can work with words of data, so it’s a natural fit for delivering cache lines like DRAM does. Even though it’s not as fast as DRAM, Xpoint can be manufactured with 10 times the density of that memory, giving it NAND-class capacity. Intel also says that chip-for-chip, Xpoint is a thousand times more durable than 3D NAND. Like NAND, Xpoint is also non-volatile.
Xpoint is just one piece of the puzzle for Optane products, however. Intel has developed a custom controller to take advantage of the unique properties of the Xpoint material. For the DC P4800X, the controller talks over PCI Express using the NVMe protocol. Like a NAND SSD controller, Optane’s controller uses multiple channels with multiple dies per channel to exploit parallelism for greater performance. However, the controller ASIC handles both data and commands exclusively in hardware to deliver the full performance of Xpoint media (at least the performance possible on the PCI Express 3.0 bus). The Optane controller can also spread a 4KB read operation over all of its dies, which Intel contrasts with the single-die read handling of NAND.
Thanks to Xpoint’s fine-grained access characteristics, Optane devices can also write in place on the underlying medium, unlike NAND’s read-modify-write cycle at the block level. In turn, Xpoint should match its higher performance with high endurance.
With a drive as fast as the P4800X, Intel believes that up to 30 drive writes per day (DWPD) is adequate endurance for the types of applications it expects to benefit from Xpoint’s performance. With a rated maximum of 12.3 petabytes written (PBW), that means the 375GB DC P4800X drive could endure 30 DWPD for three years of constant use by our calculations. It’s worth noting that the projected endurance figures in the graphs above come from the 750GB P4800X, not the 375GB drive. Larger drives should enjoy a greater total petabytes written figure if Xpoint endurance scaling is similar to that of NAND. The best-case NAND endurance figure, on the other hand, comes from Intel’s DC P3700 800GB and its 14.6 PBW figure, good for 10 DWPD.
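For readers who want to check our math, the endurance arithmetic above can be sketched in a few lines of Python. The function names are our own, and we assume decimal units (1 PB = 1,000,000 GB) and 365-day years.

```python
# Endurance arithmetic for the figures quoted above.
# Assumptions (ours): decimal units and 365-day years.
GB_PER_PB = 1_000_000
DAYS_PER_YEAR = 365


def petabytes_written(capacity_gb, dwpd, years):
    """Total data written at a constant drive-writes-per-day rate, in PB."""
    return capacity_gb * dwpd * DAYS_PER_YEAR * years / GB_PER_PB


def years_at_dwpd(capacity_gb, dwpd, rated_pbw):
    """How long a rated PBW figure lasts at a constant DWPD rate, in years."""
    return rated_pbw * GB_PER_PB / (capacity_gb * dwpd * DAYS_PER_YEAR)


# 375GB DC P4800X at 30 DWPD for three years
p4800x_pbw = petabytes_written(375, 30, 3)

# 800GB DC P3700: 14.6 PBW at 10 DWPD
p3700_years = years_at_dwpd(800, 10, 14.6)

print(f"P4800X 375GB: {p4800x_pbw:.1f} PB written over three years")
print(f"P3700 800GB: {p3700_years:.1f} years at 10 DWPD")
```

The first result lines up with Intel's 12.3 PBW rating. The second suggests the P3700's 14.6 PBW figure corresponds to roughly five years at 10 DWPD, so the two ratings cover different time spans.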
The 375GB P4800X uses a seven-channel controller with four dies of 3D Xpoint per channel for 28 dies total. Although Xpoint devices don’t require overprovisioning like NAND SSDs, the P4800X still reserves some extra space for ECC, firmware, and reallocation of bad regions of its total storage pool.
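Combining that die count with the 128 Gbit per-die capacity mentioned earlier gives a rough idea of how much raw Xpoint sits behind the 375GB of usable space. This is our own back-of-the-envelope sketch, not an Intel figure: it assumes every die is fully populated and that the die and drive capacities use the same unit base.

```python
# Rough raw-versus-usable capacity estimate for the 375GB P4800X.
# Assumption (ours): die and drive capacities share the same unit base.
CHANNELS = 7
DIES_PER_CHANNEL = 4
DIE_CAPACITY_GBIT = 128
USABLE_GB = 375

dies = CHANNELS * DIES_PER_CHANNEL
raw_gb = dies * DIE_CAPACITY_GBIT / 8      # 8 bits per byte
reserved_gb = raw_gb - USABLE_GB
reserved_fraction = reserved_gb / raw_gb

print(f"{dies} dies, {raw_gb:.0f}GB raw, {reserved_gb:.0f}GB reserved "
      f"({reserved_fraction:.1%} of raw capacity)")
```

By this estimate, around a sixth of the raw media is held back for ECC, firmware, and bad-region reallocation.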
High speeds with light loads—and consistent performance nearly everywhere
Because of these properties, Optane products have exceptionally low latency for persistent storage devices, and it’s that characteristic—not sequential performance—that you’ll hear emphasized in Intel’s marketing materials for Optane. In fact, Intel won’t be providing sequential performance specifications for Optane SSDs in official materials, because it believes those figures will be misused by its competitors to misrepresent Optane SSD performance compared to NAND.
Intel justifies this stance by noting that NAND SSD makers (itself included) can only deliver their specified performance numbers by testing workloads with unrealistically high queue depths like 128. Outside of those synthetic situations, the company claims its internal testing shows that even heavy server workloads often don’t exceed QD16, and that will be especially true of Optane thanks to its symmetrical read and write performance. That symmetry means read requests won’t be held up behind higher-latency writes, as they would be with NAND controllers. In fact, Intel says that those who need large sequential numbers at high queue depths should just use NAND devices.
Instead, the Optane SSD DC P4800X really shines in low-queue-depth random workloads. Intel says the P4800X offers eight times the performance of even the DC P3700 NAND SSD at QD1, and it still holds a lead of about five times the NAND drive’s random performance as queue depths increase. In fact, the P4800X saturates at about QD12, and Intel says that the only characteristic that changes with deeper queues on this drive is its service latency.
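The saturation behavior described above is what Little's Law predicts for any storage device: mean latency equals the number of outstanding requests divided by throughput. Once IOPS stop climbing, each extra queued request only adds waiting time. The sketch below illustrates the relationship. The saturated IOPS value is a hypothetical placeholder of our own choosing, not a number Intel provided.

```python
# Little's Law sketch: mean latency = outstanding requests / throughput.
# SATURATED_IOPS is a HYPOTHETICAL placeholder, not an Intel specification.
SATURATION_QD = 12            # queue depth where the drive saturates (per Intel)
SATURATED_IOPS = 500_000      # hypothetical value, for illustration only


def mean_latency_us(queue_depth, iops):
    """Mean service latency in microseconds for a given queue depth and IOPS."""
    return queue_depth / iops * 1_000_000


# Past saturation, IOPS stay flat, so latency scales linearly with queue depth.
for qd in (SATURATION_QD, 16, 32, 64):
    print(f"QD{qd}: {mean_latency_us(qd, SATURATED_IOPS):.0f} us mean latency")
```

Whatever the real saturated IOPS figure turns out to be, doubling the queue depth past the saturation point doubles the mean latency without delivering any more work.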
That’s not the end of the P4800X’s performance benefits, however. Not only does the Optane drive’s low-QD performance shine compared to NAND, the consistency of that performance could also be revolutionary compared to today’s SSDs. TR readers are already quite familiar with the importance of 99th-percentile performance thanks to our Inside the Second graphics-card benchmarking methods (themselves inspired by the use of that metric in server benchmarking), and Intel is deploying a similar measure to demonstrate the performance of the P4800X.
Intel has devised what it claims is a better method of demonstrating the unique benefits of Optane in those worst-case scenarios: the “write pressure” test, in which the latency for a read operation is measured while the drive is servicing a sustained rate of random writes. The stairstep line in the rather complex chart above shows the random write rate in MB/s. The squiggly blue line is the average read response time in microseconds for the DC P3700, while the orange line near the x-axis demonstrates the average read response time for the DC P4800X.
Although it’s not shown in the write pressure graph, Intel says it saw similar average read response times for the P4800X all the way out to a remarkable 2GB/s of random write pressure. That consistency is a huge deal for Optane application performance, just as a low 99th-percentile frame time is for gaming. Even if one considers 99.999% response time for the DC P4800X on the logarithmic scale above, it’s still well below the average response time for the NAND DC P3700.
As just one example of the way the DC P4800X can boost application performance, Intel presented an example case of a two-server MySQL setup with a database too large to fit in RAM. Running Sysbench 0.5 with a 70-30 read-write split, and with the data stored on a 400GB Intel DC P3700, the setup could handle only 1395 transactions per second. With the 375GB DC P4800X, that number soars to 16480 transactions per second with similar 99th-percentile latency. That’s a nearly twelvefold performance boost from mostly the same hardware.
Extending DRAM with Intel Memory Drive
Even though the DC P4800X is an NVMe device, Intel will provide a way to use it to extend system memory while the company develops Optane DIMMs for future systems.
This middleware, called Intel Memory Drive, serves as a hypervisor of sorts that will sit between the operating system and whatever complement of main memory and Optane devices is installed on the host system. Memory Drive will let system administrators flexibly provision Optane devices as storage or as memory. This technology requires no reprogramming of the operating system or applications. The Memory Drive and Optane suite will only work on Intel’s Xeon platforms.
Optane can extend the capacity of a system’s DRAM thanks to its reliably low latency. That’s because paging to an Optane SSD doesn’t carry the uncertainty or nearly as much of the performance penalty of paging to NAND devices or a hard disk. To prove this point, Intel showed that GEMM, a type of workload commonly used in deep learning, can run (in a highly optimized form) at 2605 GFLOPS with 128GB of DRAM and 1.5TB of P4800X capacity installed, compared to 2322 GFLOPS with 768GB of DRAM alone installed. At $6080 for the Optane SSDs and roughly $2100 for a 128GB DDR4-2400 DIMM, the cost of crunching those large data sets could be more favorable with Optane, too. Achieving 768GB of DRAM with 128GB DDR4-2400 ECC DIMMs would require almost $13,000 of memory.
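Here's a quick sketch of how those cost figures compare. It rests on a few assumptions of our own: the 1.5TB of Optane capacity is built from four 375GB drives at the $1520 launch price (which matches the $6080 figure), the 128GB of DRAM in the Optane configuration is a single 128GB DIMM, and the cost of the Memory Drive software and the rest of the server is ignored.

```python
# Memory-cost comparison for the GEMM example above.
# Assumptions (ours): four 375GB drives make up the 1.5TB of Optane,
# the 128GB of DRAM is one 128GB DIMM, and software costs are ignored.
DIMM_128GB_USD = 2100        # rough DDR4-2400 ECC DIMM price from the article
P4800X_375GB_USD = 1520      # launch price of the 375GB drive


def memory_cost_usd(dimms, optane_drives):
    """Combined cost of DRAM DIMMs and Optane drives in dollars."""
    return dimms * DIMM_128GB_USD + optane_drives * P4800X_375GB_USD


dram_only_cost = memory_cost_usd(dimms=768 // 128, optane_drives=0)
optane_cost = memory_cost_usd(dimms=1, optane_drives=1500 // 375)
gemm_ratio = 2605 / 2322     # Optane configuration versus DRAM-only GFLOPS

print(f"DRAM only:     ${dram_only_cost}")
print(f"DRAM + Optane: ${optane_cost}")
print(f"GEMM throughput ratio: {gemm_ratio:.2f}x")
```

Under those assumptions, the Optane configuration costs several thousand dollars less in memory and storage while posting a slightly higher GEMM result in Intel's test.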
Even in less favorable workloads for Optane, extending DRAM with P4800X SSDs and Memory Drive could let companies trade some of DRAM’s performance for the ability to get huge data sets much closer to the CPU than DRAM capacities currently allow. Intel says that a two-socket Xeon system provisioned with DRAM alone could reach 3TB of caching space, while a combination of DRAM, Memory Drive and Optane P4800X SSDs would allow that system to pack in as much as 24TB of memory-class space. A four-socket system would double that figure to 48TB. Based on our calculations, those figures are only possible with some combination of 1.5TB DC P4800X drives, but even if those drives scale linearly with the 375GB drive’s roughly $4-per-gigabyte cost, they could still economically boost the amount of memory-class cache close to the CPU in ways that DRAM alone can’t.
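The drive counts behind those capacities can be estimated with a little arithmetic. The sketch below is our own and makes some simplifying assumptions: decimal terabytes, the entire memory-class pool coming from Optane drives rather than a mix of DRAM and Optane, and pricing that scales linearly from the 375GB drive's $1520 launch price. Intel hasn't announced prices for the larger drives.

```python
import math

# Drive-count and cost estimate for Intel's 24TB and 48TB figures.
# Assumptions (ours): decimal units, the whole pool comes from Optane,
# and larger drives are priced linearly from the 375GB model.
PRICE_PER_GB_USD = 1520 / 375    # roughly $4 per gigabyte


def drives_needed(target_tb, drive_gb):
    """Number of drives of a given size needed to reach a target capacity."""
    return math.ceil(target_tb * 1000 / drive_gb)


def linear_cost_usd(target_tb):
    """Cost of a target capacity if pricing scales linearly per gigabyte."""
    return target_tb * 1000 * PRICE_PER_GB_USD


for target_tb in (24, 48):
    print(f"{target_tb}TB: {drives_needed(target_tb, 1500)} x 1.5TB drives "
          f"or {drives_needed(target_tb, 375)} x 375GB drives, "
          f"about ${linear_cost_usd(target_tb):,.0f}")
```

Sixteen 1.5TB drives is a plausible fit for a two-socket server, while 64 of the 375GB drives is not, which is why we think the 24TB figure depends on the largest-capacity model.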
All told, the performance improvements in Optane lead Intel to call the DC P4800X the industry’s most responsive datacenter SSD, and we see no reason to contest the point. The 375GB DC P4800X will be available for $1520 starting today, and the drive should become more broadly available over the second half of this year. Intel will begin offering a 750GB version of the drive in the second quarter of this year, and a 1.5TB drive will follow in the second half of 2017. A U.2 version of the 375GB drive will also be available in the second quarter of this year, and U.2 versions of the larger-capacity drives will follow in the second half of the year.