Usacomp2k3 wrote: What typical use pattern would reflect that many sustained random writes?
I'm pretty sure I've hit it at least 4 different ways (so far):
1. Rsyncing the contents of my old file server to my new one. Lots of small random writes to update file system meta-data.
2. Recursive MD5 hash of a large directory tree. With the default "relatime" mount option, Linux ext4 will update the "last accessed" time of a file if the previously recorded time is more than a day old. So something which you'd naively expect to be a read-only operation actually generates a bunch of small random writes (again to update file system meta-data). The solution is to mount with the "noatime" option, which skips access-time updates entirely when a file is only read from (see the mount example after this list).
3. Indexing service. This was actually how my new server reminded me that I'd forgotten to nuke the indexing tool (mlocate) that Ubuntu installs by default. The indexer tripped over the same issue as #2 as it swept through the file system overnight (removal one-liner after the list).
4. Disk formatting. By default, Linux ext4 uses what's called "lazy itable init", which means that when you format a new disk it only initializes the bare minimum of data structures needed to mount the file system. The rest of the initialization is done in the background by a low-priority thread after the file system is mounted, or on demand if the file system gets a lot of data written to it shortly after being mounted. This allows initial formatting of a new file system to complete in just a few seconds, and is generally a good thing since it lets you mount and use the file system immediately. Problem is, the writes generated by the background initialization are small enough and come into the drive slowly enough that each one may get flushed to the media individually, with each one triggering an SMR read-modify-write cycle. If you're trying to access the drive while the background init completes (which can take several days!), you'll see highly variable throughput and long pauses where no data at all is being transferred. Disabling the "lazy itable init" feature during formatting seems to eliminate this issue, with the entire disk being initialized and ready to use in under 15 minutes (see the mkfs example after this list). (So you have to wait a little longer to be able to mount the new file system, but in return you don't have to suffer through several days of even crappier performance while the background initialization completes.)
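For #2, the fix is a one-word mount option. A minimal sketch, assuming the file system lives on /dev/sdb1 and is mounted at /mnt/data (both names are hypothetical, substitute your own):

    # Remount an already-mounted ext4 file system with atime updates off:
    sudo mount -o remount,noatime /mnt/data

    # Or make it permanent with an /etc/fstab entry along these lines:
    /dev/sdb1  /mnt/data  ext4  defaults,noatime  0  2

As far as I know, noatime also covers directory access times on current kernels, so you don't need nodiratime on top of it.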
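For #3, the blunt fix on Ubuntu is to remove the indexer outright (mounting with noatime also takes the sting out of its nightly sweep if you'd rather keep it):

    sudo apt-get purge mlocate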
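For #4, mkfs.ext4 accepts extended options that force full initialization up front. A sketch, again assuming the target is /dev/sdb1:

    # Initialize all inode tables (and the journal) at format time,
    # instead of leaving it to a background kernel thread after mount:
    sudo mkfs.ext4 -E lazy_itable_init=0,lazy_journal_init=0 /dev/sdb1

The trade-off is exactly as described above: the format itself takes minutes instead of seconds, but the drive isn't dribbling tiny metadata writes for days afterward.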
So basically, these drives are just really sensitive to anything that does non-sequential writes for a non-trivial amount of time. We're talking orders of magnitude performance degradation if you hit one of the bad use cases. And if the drive is thrashing through a lot of SMR read-modify-write cycles, read performance is going to suck too, because the OS's I/O queue to the drive gets backed up and the drive can't service incoming read requests in a timely manner while it's spending most of its time rewriting SMR bands.
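If you want to watch the queue back up for yourself, iostat (from the sysstat package) makes it pretty obvious while one of the bad workloads is running. Here sdb is a hypothetical device name, and the exact column names vary between sysstat versions:

    # Extended per-device stats, refreshed every second:
    iostat -x sdb 1
    # During an SMR rewrite storm, write latency (await / w_await) and the
    # queue length (avgqu-sz / aqu-sz) spike while throughput craters.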