Crucial’s P1 500-GB QLC NVMe SSD reviewed
Just a few days ago, we reviewed a quad-level-cell SSD for the first time ever. The experience left us wanting more. Not more QLC, to be clear. We want more of the comforting, familiar march of virtually indistinguishable drives based on various manufacturers’ 3D TLC NAND that we’d been living through till recently. Soon, we may be looking back on those days wistfully. But you can’t always get what you want, so today we’ve got another QLC drive on hand. Meet Crucial’s P1 500 GB.
The 860 QVO’s role was cut and dried: to offer a sort of middle ground between smaller, faster SSDs and larger, cheaper mechanical storage. With time (and some heavy price cuts), drives like it could conceivably displace mechanical storage entirely. But unlike the QVO, the P1 isn’t a 2.5″ SATA drive. It’s a modern, NVMe-enabled, M.2 gumstick ready to monopolize four of your precious PCIe lanes, and in fact, it’s Crucial’s first NVMe SSD ever.
The NVMe badge usually goes hand-in-hand with a higher set of performance expectations than average, but our experience with Samsung’s 860 QVO might leave one wondering why anybody would bother producing a PCIe drive bottlenecked by dense, slow QLC NAND. The answer, as always, is caching. QLC’s raw speeds are indeed too low to take advantage of all that juicy bandwidth, but toss in some sophisticated pseudo-SLC capabilities and you might just have something worth sacrificing an M.2 slot for.
The version of Crucial’s Dynamic Write Acceleration implemented in the P1 uses the typical sort of static buffer allocation that all sorts of drives use as a baseline, but Crucial’s tech can also commandeer up to 14% of a P1 drive’s total capacity if the user has the space free. If this sounds familiar, that’s because it’s basically the same scheme as the Intelligent TurboWrite system we were just talking about in the 860 QVO. It wasn’t enough to save the QVO from low placement in our rankings, but maybe Crucial’s flavor will fare better.
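As a rough illustration of how such a scheme might size its cache (this is not Crucial’s actual firmware logic; the static baseline and the capacities below are assumptions, with only the 14% ceiling taken from Crucial’s spec):

```python
def pseudo_slc_cache_gb(total_gb, free_gb, static_gb=5.0, dynamic_pct=14):
    """Estimate pseudo-SLC cache size for a DWA-style scheme.

    static_gb is a hypothetical fixed baseline cache; the dynamic
    portion can grow up to dynamic_pct of the drive's total capacity,
    but only into space the user has left free.
    """
    ceiling_gb = total_gb * dynamic_pct / 100   # the 14% cap from the spec
    dynamic_gb = min(ceiling_gb, max(free_gb - static_gb, 0.0))
    return static_gb + dynamic_gb

# A mostly empty 500-GB drive reaches the full ceiling...
print(pseudo_slc_cache_gb(500, 400))  # → 75.0 (5 static + 70 dynamic)
# ...while a nearly full one shrinks back toward the static baseline.
print(pseudo_slc_cache_gb(500, 10))   # → 10.0
```

The upshot is that the cache, and with it the drive’s burst performance, shrinks as the drive fills up.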
Two packages of Micron’s 64-layer 3D QLC flash lie under the P1’s sticker, alongside a DRAM cache and Silicon Motion’s SM2263 controller. The SM2263 is only a four-channel controller, and that could be the first signal that Micron doesn’t expect this drive to beat any records. Even many affordable SATA drives can make good use of eight channels. The underside of the PCB is bare aside from regions marked off for two more packages and another slice of DRAM. Presumably the higher-capacity P1s make use of this space. Incidentally, this combination of NAND and controller is the exact same as that in the Intel 660p, which holds the possibly dubious honor of being the first consumer QLC drive to hit the market.
In spite of QLC’s disadvantages, Micron is bullish on the durability of the P1. It backs the drive with the same five-year warranty that the company extends to its MX500 series. The endurance rating for the 500 GB drive is a solid 100 terabytes written, a smaller step down from the MX500 500 GB’s 180 TBW than QLC’s reputation might suggest.
The P1 500 GB is available from Newegg, Amazon, or directly from Crucial’s website for $110. That’s fairly inexpensive, but as we saw with the QVO, cheap QLC isn’t too compelling when there’s cheaper TLC to be had. But we can’t assume the P1 will turn in the same lackluster showing in our test suite that the 860 QVO did. Time to dust off IOMeter and see what Micron’s 3D QLC can do.
IOMeter — Sequential and random performance
IOMeter fuels much of our latest storage test suite, including our sequential and random I/O tests. These tests are run across the full capacity of the drive at two queue depths. The QD1 tests simulate a single thread, while the QD4 results emulate a more demanding desktop workload. For perspective, 87% of the requests in our old DriveBench 2.0 trace of real-world desktop activity have a queue depth of four or less. Clicking the buttons below the graphs switches between results charted at the different queue depths. Our sequential tests use a relatively large 128-KB block size.
Starting off, the P1’s sequential read speeds are peppy. The PCIe P1 offers better read performance than SATA drives, but it can’t keep up with the NVMe TLC pack. As for sequential writes… oy vey, to borrow from the Yiddish. Dynamic Write Acceleration has succumbed to the challenges of our IOMeter test methods. Our full-drive write often foils drives’ caching schemes and exposes the pokey raw speeds concealed beneath them. That’s precisely what happened in our IOMeter tests of the 860 QVO.
In this case, however, the performance falloff is even worse than that. The SM2263 has no direct-to-flash capability, meaning it can’t intelligently decide to bypass the saturated pseudo-SLC cache and commit directly to QLC. So the paltry speeds we’re seeing represent a cascading effect of queued writes waiting for previous writes to finish and corresponding cache to be cleared before they can be serviced.
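A toy model shows the shape of that failure mode (the throughput and cache-size numbers here are invented for illustration, not measured P1 figures):

```python
def avg_write_mbps(total_mb, cache_mb, slc_mbps, qlc_mbps):
    """Average write speed when a pseudo-SLC cache has no bypass.

    The first cache_mb of the job lands in cache at slc_mbps; once
    the cache is saturated, every remaining write waits on the slow
    flush to QLC, so the tail proceeds at qlc_mbps.
    """
    fast_mb = min(total_mb, cache_mb)
    slow_mb = total_mb - fast_mb
    seconds = fast_mb / slc_mbps + slow_mb / qlc_mbps
    return total_mb / seconds

# A 20-GB burst fits in a 50-GB cache and sees full cache speed,
# but a full-span 500-GB write collapses toward the raw QLC rate.
print(round(avg_write_mbps(20_000, 50_000, 1500, 100)))   # → 1500
print(round(avg_write_mbps(500_000, 50_000, 1500, 100)))  # → 110
```

Once the job outgrows the cache, average throughput is dominated almost entirely by the raw QLC rate, which is exactly the cliff our full-drive IOMeter runs push drives over.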
Our real-world tests later on don’t fill the drive’s entire span, so there’s still hope that the P1 might perform better under less strenuous conditions. But it won’t escape the effect these figures have on its overall ranking.
Random read response times are slower than most of the other PCIe drives, but way faster than the 860 QVO’s. Random write response times are solid, falling towards the middle of the pack.
Sequential writes were a disaster for this drive, but elsewhere, the P1 seems to perform as we expected: about as well as SATA TLC drives, but worse than PCIe TLC drives. Let’s throw some more tests at it.
Sustained and scaling I/O rates
Our sustained IOMeter test hammers drives with 4-KB random writes for 30 minutes straight. It uses a queue depth of 32, a setting that should result in higher speeds that saturate each drive’s overprovisioned area more quickly. This lengthy—and heavy—workload isn’t indicative of typical PC use, but it provides a sense of how the drives react when they’re pushed to the brink.
We’re reporting IOPS rather than response times for these tests. Click the buttons below the graph to switch between SSDs.
The P1 peaks at a rate lower than some drives’ steady-state speeds, but we didn’t expect this QLC drive to be a speed demon. On the plus side, the speed the P1 does achieve manages to hold out for almost 15 minutes before it declines to its steady-state speed.
In the case of the RC100’s uninspired performance, we can blame its omission of a dedicated DRAM cache. Even though the P1 does have DRAM, it seems the performance of Micron’s QLC lands the drive in roughly the same ballpark as the RC100, at least as far as sustained writes are concerned. Again, this appears to be an artifact of the drive’s pseudo-SLC cache implementation being overwhelmed. The raw write speed of the QLC itself is surely better than the 412 IOps figure from the graph.
Our final IOMeter test examines performance scaling across a broad range of queue depths. We ramp all the way up to a queue depth of 128. Don’t expect AHCI-based drives to scale past 32, though—that’s the maximum depth of their native command queues.
For this test, we use a database access pattern comprising 66% reads and 33% writes, all of which are random. The test runs after 30 minutes of continuous random writes that put the drive in a simulated used state. Click the buttons below the graph to switch between the different drives. Note that each drive uses a different scale for IOPS to allow us to view the shape of its curves.
No meaningful scaling here. In absolute terms, the oscillations are scarcely more than noise. Let’s compare it with some of the other drives.
Mainstream TLC drives like Intel’s SSD 760P and Crucial’s own MX500 are quite capable of scaling gracefully as queue depth increases. But the P1 chugs along side-by-side with the QVO, scarcely registering the growing buildup of outstanding access operations.
Our IOMeter write tests were rough on the P1. But as we mentioned before, this drive still has a shot at redeeming itself in the real-world workloads it’s intended for. We may have disarmed its Dynamic Write Acceleration thus far, but the less demanding nature of typical client access patterns might allow it to flourish yet. On to RoboBench.
TR RoboBench — Real-world transfers
RoboBench trades synthetic tests with random data for real-world transfers with a range of file types. Developed by our in-house coder, Bruno “morphine” Ferreira, this benchmark relies on the multi-threaded robocopy command built into Windows. We copy files to and from a wicked-fast RAM disk to measure read and write performance. We also cut the RAM disk out of the loop for a copy test that transfers the files to a different location on the SSD.
Robocopy uses eight threads by default, and we’ve also run it with a single thread. Our results are split between two file sets, whose vital statistics are detailed below. The compressibility percentage is based on the size of the file set after it’s been crunched by 7-Zip.
| File set | Number of files | Average file size | Total size | Compressibility |
| --- | --- | --- | --- | --- |
| Media | 459 | 21.4 MB | 9.58 GB | 0.8% |
| Work | 84,652 | 48.0 KB | 3.87 GB | 59% |
RoboBench’s write and copy tests run after the drives have been put into a simulated used state with 30 minutes of 4-KB random writes. The pre-conditioning process is scripted, as is the rest of the test, ensuring that drives have the same amount of time to recover.
The media set is made up of large movie files, high-bitrate MP3s, and 18-megapixel RAW and JPG images. There are only a few hundred files in total, and the data set isn’t amenable to compression. The work set comprises loads of TR files, including documents, spreadsheets, and web-optimized images. It also includes a stack of programming-related files associated with our old Mozilla compiling test and the Visual Studio test on the next page. The average file size is measured in kilobytes rather than megabytes, and the files are mostly compressible.
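For the curious, the compressibility figure can be approximated with a few lines of Python. zlib stands in here for 7-Zip’s compressor, so the exact percentages will differ from our table:

```python
import os
import zlib

def compressibility_pct(data: bytes) -> float:
    """Percent size reduction after compression, a rough proxy for
    the 7-Zip measurement used for our file-set table."""
    packed = zlib.compress(data, level=9)
    return 100.0 * (1.0 - len(packed) / len(data))

# Repetitive text crunches down dramatically, like the work set;
# random bytes barely budge, like the already-compressed media set.
print(compressibility_pct(b"the quick brown fox " * 10_000))  # high
print(compressibility_pct(os.urandom(1_000_000)))             # near zero
```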
Let’s take a look at the media set first. The buttons switch between read, write, and copy results.
The P1’s got some life to it after all. When its pseudo-SLC cache is used as intended, the QLC drive just takes off. It doesn’t keep up with Intel’s 760P or Adata’s XPG SX8200 (two budget TLC favorites), but it puts some distance between itself and the SATA drives. Let’s see if the work set changes the outlook.
Again, the P1 outpaces the SATA pack and even snags the record for read speeds with one thread. Not bad, P1, not bad at all.
RoboBench was an unqualified success for the P1. There are much faster drives, to be sure, but a QLC drive was never going to blow our socks off. The P1’s pseudo-SLC cache functions perfectly well when tasked with standard client I/O. Our next and last page of tests will boot Windows 10 from the P1 and see what happens. These are the tests least likely to rock the boat, but it can’t hurt to check.
Until now, all of our tests have been conducted with the SSDs connected as secondary storage. This next batch uses them as system drives.
We’ll start with boot times measured two ways. The bare test depicts the time between hitting the power button and reaching the Windows desktop, while the loaded test adds the time needed to load four applications—Avidemux, LibreOffice, GIMP, and Visual Studio—automatically from the startup folder. These tests cover the entire boot process, including drive initialization.
The P1 boots up just as quickly as the next guy, but it would have been a shock if it didn’t. Let’s check in on application load times.
Next, we’ll tackle load times with two sets of tests. The first group focuses on the time required to load larger files in a collection of desktop applications. We open a 790-MB 4K video in Avidemux, a 30-MB spreadsheet in LibreOffice, and a 523-MB image file in the GIMP. In the Visual Studio Express test, we open a 159-MB project containing source code for Microsoft’s PowerShell.
Load times for the first three programs are recorded using PassMark AppTimer. AppTimer’s load completion detection doesn’t play nice with Visual Studio, so we’re still using a stopwatch for that one.
The P1 looks indistinguishable from any other SSD in our productivity apps.
For game loads, the P1 looks a tad sluggish in Batman, but it’s not a deviation that raises our eyebrows. The QLC revolution might be starting to convince me. My game library has grown annoyingly storage-heavy over the years, and giant QLC drives to dump the files onto would beat the pants off of spinning rust.
The P1 acquitted itself quite well in our OS boot and application load tests. We’ve got a breakdown of our test methods on the next page, but feel free to skip ahead if you want the final verdict on this drive.
Test notes and methods
Here are the essential details for all the drives we tested:
| Drive | Interface | Controller | NAND |
| --- | --- | --- | --- |
| Adata XPG SX8200 480 GB | PCIe Gen3 x4 | Silicon Motion SM2262 | 64-layer Micron 3D TLC |
| Crucial MX500 500 GB | SATA 6Gbps | Silicon Motion SM2258 | 64-layer Micron 3D TLC |
| Crucial P1 500 GB | PCIe Gen3 x4 | Silicon Motion SM2263 | 64-layer Micron 3D QLC |
| Intel X25-M G2 160 GB | SATA 3Gbps | Intel PC29AS21BA0 | 34-nm Intel MLC |
| Samsung 850 EVO 1 TB | SATA 6Gbps | Samsung MEX | 32-layer Samsung TLC |
| Samsung 860 EVO 1 TB | SATA 6Gbps | Samsung MJX | 64-layer Samsung TLC |
| Samsung 860 QVO 1 TB | SATA 6Gbps | Samsung MJX | 64-layer Samsung QLC |
| Samsung 970 EVO 1 TB | PCIe Gen3 x4 | Samsung Phoenix | 64-layer Samsung TLC |
| Toshiba RC100 | PCIe Gen3 x2 | Toshiba | 64-layer Toshiba BiCS TLC |
The SATA SSDs were connected to the motherboard’s Z270 chipset. The PCIe drives were connected via one of the motherboard’s M.2 slots, which also draw their lanes from the Z270 chipset.
We used the following system for testing:
| Component | Details |
| --- | --- |
| Processor | Intel Core i7-6700K |
| Motherboard | Gigabyte Aorus Z270X-Gaming 5 |
| Memory size | 16 GB (2 DIMMs) |
| Memory type | Corsair Vengeance LPX DDR4 at 2133 MT/s |
| System drive | Corsair Force LS 240 GB with S8FM07.9 firmware |
| Power supply | Rosewill Fortress 550 W |
| Operating system | Windows 10 x64 1803 |
Thanks to Gigabyte for providing the system’s motherboard, to Intel for the CPU, to Corsair for the memory and system drive, and to Rosewill for the PSU. And thanks to the drive makers for supplying the rest of the SSDs.
We used the following versions of our test applications:
- IOMeter 1.1.0 x64
- TR RoboBench 0.2a
- PassMark AppTimer 1.0
- Avidemux 2.7.1 x64
- GIMP 2.10.0
- LibreOffice
- Visual Studio Community 2017 15.7.4
- Batman: Arkham Origins
- Tomb Raider
- Middle-earth: Shadow of Mordor
Some further notes on our test methods:
To ensure consistent and repeatable results, the SSDs were secure-erased before every component of our test suite. For the IOMeter database, RoboBench write, and RoboBench copy tests, the drives were put in a simulated used state that better exposes long-term performance characteristics. Those tests are all scripted, ensuring an even playing field that gives the drives the same amount of time to recover from the initial used state.
We run virtually all our tests three times and report the median of the results. Our sustained IOMeter test is run a second time to verify the results of the first test and additional times only if necessary. The sustained test runs for 30 minutes continuously, so it already samples performance over a long period.
Steps have been taken to ensure the CPU’s power-saving features don’t taint any of our results. All of the CPU’s low-power states have been disabled, effectively pegging the frequency at 4.0 GHz. Transitioning between power states can affect the performance of storage benchmarks, especially when dealing with short burst transfers.
The test systems’ Windows desktop was set at 1920×1200 at 60 Hz. Most of the tests and methods we employed are publicly available and reproducible. If you have questions about our methods, hit our forums to talk with us about them.
The P1’s run through IOMeter was fraught with missteps, but the drive handled the rest of the test suite with aplomb. To compare the drives, we distill an overall performance rating from the geometric mean of a basket of results from our test suite, normalized against an older SATA SSD as a baseline.
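In sketch form, a normalized geometric-mean rating works like this (the test basket and numbers below are placeholders, not our actual data):

```python
from math import prod

def performance_rating(drive, baseline):
    """Geometric mean of per-test results normalized against a
    baseline drive; every metric is oriented so higher is better."""
    ratios = [drive[test] / baseline[test] for test in baseline]
    return prod(ratios) ** (1 / len(ratios))

# Hypothetical results: an old SATA baseline vs. a P1-like drive.
baseline = {"seq_read": 270, "seq_write": 205, "rand_read_iops": 9_000}
p1_like  = {"seq_read": 900, "seq_write": 160, "rand_read_iops": 30_000}
print(round(performance_rating(p1_like, baseline), 2))
```

The geometric mean keeps any single metric, like the P1’s dismal full-span sequential writes, from completely dominating the composite score, though it still drags the rating down.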
There are two ways to interpret these results. One is that a PCIe drive has no right to be so slow, QLC is the worst, you can pry TLC from my cold, dead hands, etc. (substitute MLC or SLC for TLC depending on your personal level of snobbery). The other is that the drive is just fine, and those lunatics at TR put this poor gumstick through tests no sane person would ask of a hundred-dollar consumer SSD. The drive was great in real-world tests, and that’s all most people shopping client SSDs should care about.
And both perspectives are correct. The IOMeter tests we run ask far more of lowly consumer drives than their makers expect them to be used for. QLC may be verifiably worse for performance than TLC or MLC, but does it matter if it’s still good enough for 99% of use cases? The P1’s pseudo-SLC caching works great for ordinary workloads. So while the raw speeds of the QLC are slower than those from less dense NAND implementations, the end user will rarely be subjected to them.
But it’s reasonable to expect a lower price in exchange for an empirically weaker drive. To quote a gerbil in the height of his wroth circa 2012: “TLC is a lower-performing product that offers 50% more storage for the same manufacturing cost. Unless I see 50% better cost/GB, we are being short-changed, and the lower performance is just adding insult to the injury.” It may be an oversimplification, but change “TLC” to “QLC” and it’s hard to argue with that sentiment. So let’s see where this QLC drive lands in today’s market. In the graph below, the most compelling position is toward the upper left corner, where the price per gigabyte is low and performance is high.
Even though we’re well past the Black Friday and Cyber Monday feeding frenzies, SSD prices are still deliciously low, and that proves to be the P1’s undoing. The drive is selling at its suggested price of $110 at Newegg, but that’s just not good enough. Just as there were superior alternatives to the 860 QVO, there are superior alternatives to Crucial’s P1. If you’re looking to buy a PCIe SSD, Adata’s SX8200 and Samsung’s 970 EVO are both going for about the same 22 cents per gigabyte that the Crucial demands. And both of those drives put up speed figures that the P1 simply can’t touch.
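The cents-per-gigabyte figures cited here fall out of simple arithmetic:

```python
def cents_per_gb(price_usd: float, capacity_gb: int) -> float:
    """Street price converted to US cents per gigabyte of capacity."""
    return 100 * price_usd / capacity_gb

print(cents_per_gb(110, 500))  # → 22.0, the P1 at its $110 asking price
```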
Crucial’s five-year warranty inspires more confidence than Samsung’s three-year warranty on its 860 QVO does, but buyers might nonetheless be skittish about the reduced endurance of QLC. For what it’s worth, none of the few dozen Amazon reviews the drive has received are complaining about catastrophic failure. Regardless, it will take a bit more time and much more aggressive pricing from drive vendors before we can in good conscience recommend that anyone take the QLC plunge.
All things considered, Crucial’s P1 actually gave us a nice surprise on the performance front. It couldn’t withstand our crushing barrage of IOMeter tests, but not every drive needs to be that robust. Its caching tricks allowed it to maintain an appreciable lead over TLC SATA drives in real-world file transfers, and it delivered the snappy boot and load speeds that all SSDs enjoy. Its problem is purely one of pricing. Until drives with QLC flash start getting hit with the discount stick, our advice is the same as it was a few days ago: if you need an SSD and can afford a QLC drive, you should just take that money and buy a faster TLC bit bucket.