Economics deep-dive

The write endurance security data never spends.

For a security data platform at retention scale, the dominant line item is not compute; it is drive media. That makes the choice between a read-intensive, a mixed-use, and a write-intensive NVMe drive one of the larger cost levers in the whole architecture, and one of the least examined. I want to make the contrarian case directly: the mixed-use and write-intensive tiers are sold at a premium that a security telemetry workload almost never consumes, because security ingest is append-only, sequential, and written once to be read rarely, and the part of the industry selling you the premium has stopped publishing the data that would let you prove you do not need it.

Reading time: about 16 minutes. Evidence runs from Tier A (JEDEC and SNIA endurance definitions, vendor spec sheets, an Intel write-amplification white paper) through Tier B (ServeTheHome's used-drive field study, Kingston citing Forward Insights, multi-source price trackers) to Tier C (distributor pricing and the analyst premium estimates, flagged inline). One claim I will not pretend is settled, the reliability equivalence of the tiers at matched real-world load, is left as Tier D and named as the open question it is rather than laundered into a finding.

The bill is mostly drives

Why the media decision is the cost decision.

A security data platform is, at the scale where this argument matters, mostly a place to keep a very large amount of telemetry for a very long time. Compliance retention runs years, not weeks, and the volume that has to land and stay landed dwarfs the slice that is hot enough to query in real time. When you draw the cost picture for that kind of system, the box that dominates is storage, and inside that box the dominant number is the price of the media the bytes physically sit on. Compute matters, network matters, the licensing of whatever engine runs the searches matters, but at retention scale the drives are where the money goes, so a decision that changes the per-terabyte media cost by tens of percent changes the whole program's economics more than almost anything you can do at the query layer.

That is the lens the rest of this essay uses, and it is why I think the endurance tier is worth a long look that it almost never gets. When a team specs the NVMe for a security lake, the endurance class (read-intensive, mixed-use, or write-intensive) rides along as a checkbox in the configurator, and the higher tiers cost meaningfully more per terabyte. If the workload genuinely needed that endurance the premium would be a fair price for reliability. My claim is that security telemetry does not need it, that the gap between what the spec sheet assumes about your write pattern and what your write pattern actually is happens to be enormous, and that paying for the higher tier on a security lake is paying for a property of the drive you will retire before you ever use.

I will hold the strong version of this where the evidence is strong and flag the two places it thins out, because the honest version of a contrarian claim is more useful than the loud one. The strong leg is the workload physics and the field data on how much endurance real deployments actually burn. The thinner legs are the current size of the price premium, which the vendors have made harder to measure than it used to be, and the question of whether a read-intensive drive is genuinely as durable as a write-intensive one at the same real load, which nobody has published a clean answer to. The thin legs do not rescue the premium; they are the part of the picture the industry has chosen to keep dark, and I think a buyer is entitled to read that choice as information rather than ignore it.

The media ladder

HDD, SATA SSD, NVMe, and why the gap is widening, not closing.

Start with the three rungs and the honest numbers, because the cost case only makes sense once the performance difference is on the table. A nearline enterprise hard drive in the 18-to-22 TB class lands near $11 to $14 per terabyte as of early 2026, delivers a couple hundred 4K random IOPS, sequential throughput in the low hundreds of megabytes per second, and an access latency of roughly 8 to 10 milliseconds (HDDHunt drive-price tracker, April 2026, Tier B; the performance envelope is the unchanged physics of a spinning platter). An enterprise NVMe SSD on PCIe Gen4 (the Samsung PM9A3 or Micron 7450 are the reference parts) reads in the neighborhood of 6,800 MB/s, clears a million random read IOPS, and answers in tens of microseconds, with Gen5 parts like the Samsung PM1743 roughly doubling the sequential throughput to about 14,000 MB/s (vendor spec sheets, Tier A; StorageReview's independent PM1743 testing, Tier B). The latency gap is the headline: microseconds against milliseconds is two to three orders of magnitude, and that is the property that makes NVMe a genuine paradigm shift for a data platform rather than a faster disk.

Media class	$/TB (early 2026)	4K random IOPS	Latency	Seq. throughput
Nearline HDD 18-22TB	~$11-14	~100-300	~8-10 ms	~200-285 MB/s
Enterprise NVMe Gen4	~$150-300*	~1,000,000 read	~10-100 us	~6,800 MB/s read
Enterprise NVMe Gen5	~$150-300*	~2,500,000 read	~10-100 us	~14,000 MB/s read

* NVMe $/TB is unusually volatile in this window; see the NAND-shock note below.
Sources: vendor spec sheets (Tier A); HDDHunt + price trackers (Tier B).

NVMe is worth paying for at all because the microsecond latency and the million-IOPS random read collapse a distinction security architects have organized around for years: the hard line between a hot tier you can query interactively and a cold tier you can only batch-scan. On NVMe the cold tier answers fast enough to hunt across, which is exactly why the lakehouse pattern became plausible for security at all. The catch is the price. NVMe at $150 to $300 per terabyte against HDD at $11 to $14 is, even on a calm day, a large multiple, and this has not been a calm day. A NAND-flash supply shortage running through late 2025 and into 2026 pushed enterprise NVMe pricing sharply upward (the Solidigm D5-P5336 at 122.88 TB went from roughly $101/TB at its 2025 launch to about $302/TB by early 2026, per TechRadar's reporting and CDW listings, Tier B), which widened the HDD-to-NVMe per-terabyte gap from something like six times to something like sixteen times over nine months (dropreference price analysis, January 2026, Tier C, and it reads with an SSD-adoption slant, so treat the exact multiple as directional). The practitioner point is the inversion of the usual story: the gap is widening, not narrowing, so the comfortable assumption that flash is about to cross under disk is, for this window, simply wrong.

That widening gap is the backdrop that makes the endurance question urgent rather than academic. When NVMe is already commanding a sixteen-times premium over disk for the capacity itself, every additional dollar you spend buying up the endurance tier on that flash is a dollar spent on top of the most expensive media in the building, on a property of the drive you are about to discover the security lake does not exercise.

What endurance actually is

The tiers, the premium, and the SKU sleight of hand.

Endurance on an enterprise SSD is sold in three rough tiers, and the unit is drive-writes-per-day, the number of times you could overwrite the drive's full capacity every day for the warranty period (typically five years) before the flash is rated to wear out. Read-intensive drives are rated near 1 DWPD, mixed-use near 3 DWPD, and write-intensive near 10 DWPD, and the higher tiers buy that rated endurance mostly by setting aside more of the raw flash as over-provisioning, which is why a write-intensive drive ships less usable capacity from the same silicon (the tier definitions track JEDEC and SNIA endurance methodology and the vendor spec sheets, Tier A; the Micron 7450 PRO at 1 DWPD versus the 7450 MAX at 3 DWPD, and Samsung's PM9A3 at 1 DWPD with about 3,504 TBW at 1.92 TB, are the concrete reference parts). The translation into guaranteed bytes is the more honest framing: a 1 DWPD rating on a 15 TB drive is on the order of 27 petabytes written over five years before the warranty endurance is spent, which is already a staggering amount of writing for a workload to do.
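The DWPD-to-bytes translation is simple enough to check by hand, and a minimal sketch makes the two reference figures above reproducible. The function name is mine, and it assumes a 365-day year and the typical five-year warranty mentioned above; a vendor's own TBW figure is the authority where it differs.

```python
def rated_endurance_tb(capacity_tb: float, dwpd: float, warranty_years: float = 5.0) -> float:
    """Terabytes written that the warranty covers.

    Assumes a 365-day year (no leap days) and that the DWPD rating
    applies to the drive's full usable capacity.
    """
    return capacity_tb * dwpd * 365 * warranty_years


if __name__ == "__main__":
    # A 15 TB drive at 1 DWPD over five years: about 27 PB.
    print(f"{rated_endurance_tb(15, 1) / 1000:.1f} PB")
    # A 1.92 TB drive at 1 DWPD over five years: about 3,504 TBW.
    print(f"{rated_endurance_tb(1.92, 1):.0f} TBW")
```

The same function makes the tier gap concrete: at 3 or 10 DWPD the guaranteed bytes scale linearly, so the question is only whether the workload ever gets near the first multiple.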

The premium for moving up the tiers has historically run somewhere around 40 to 80 percent at equal capacity, write-intensive over read-intensive (ATP and MarketSignalIndex endurance guides, Tier B and C), and here is where I have to be honest about the evidence rather than wave the number around: that range is analyst-derived and increasingly dated, because the vendors have made it genuinely hard to verify by restructuring how they sell the parts. The clean comparison, the same vendor's read-intensive and write-intensive drive at the same capacity and generation with both prices on a public page, has largely disappeared from the 2025-26 market. Vendors moved to differentiate on generation instead (Gen4 against Gen5) and on form factor (U.2 against E1.S against M.2), Samsung has been winding down explicit write-intensive Gen4 parts in favor of performance-positioned Gen5, and enterprise drives increasingly move through contract and configurator channels that never expose a list price at all. So the 40-to-80 figure is the best public anchor available and it is soft, and I would rather tell you it is soft than dress it up.

I think the disappearance of the clean comparison is worth sitting with, because it is convenient in a specific direction. The harder it is to put a read-intensive and a write-intensive drive side by side at matched capacity and see the price delta, the harder it is for a buyer to ask the obvious question, which is whether the workload in front of them justifies the up-tier at all. A market that wanted you to make that comparison would make it easy. The configurator that bundles the endurance class into a server SKU and quotes you one number is not, I think, structured to help you discover that you are buying endurance you will not use.
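To put the soft premium range in dollars, here is a rough sketch. It assumes the $150-to-$300 per-terabyte NVMe range quoted earlier is the read-intensive baseline, which the public pricing does not confirm, and it uses a hypothetical 1 PB flash tier; the function name and parameters are illustrative, not anyone's pricing model.

```python
def endurance_premium_usd(capacity_tb: float, base_usd_per_tb: float, premium: float) -> float:
    """Extra spend from buying a higher endurance tier at equal capacity.

    premium is a fraction, e.g. 0.40 for a 40 percent uplift.
    """
    return capacity_tb * base_usd_per_tb * premium


if __name__ == "__main__":
    capacity_tb = 1000  # hypothetical 1 PB flash tier
    low = endurance_premium_usd(capacity_tb, 150, 0.40)   # cheap baseline, small premium
    high = endurance_premium_usd(capacity_tb, 300, 0.80)  # expensive baseline, large premium
    print(f"premium per PB: ${low:,.0f} to ${high:,.0f}")
```

Under those assumptions the up-tier costs on the order of $60,000 to $240,000 per petabyte, which is the size of the question the missing price comparison would answer.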

The workload physics

Security telemetry doesn't write the way the spec sheet assumes.

The DWPD rating was built around a pessimistic write pattern, small random 4K writes hammering the same cells, the kind of pattern an OLTP database or a write-heavy metadata store produces, and that pattern drives write amplification, where the drive's own garbage collection has to rewrite more flash than the host actually sent. Security telemetry is close to the opposite pattern. Ingest is append-only and sequential: events arrive, get written forward in large ordered batches, and are essentially never updated in place. The data is written once and read rarely, since the overwhelming majority of telemetry is retained for compliance and hunting and is touched, if at all, long after it lands. Sequential append is the friendliest pattern flash has, so the write amplification a security lake actually produces sits close to 1.0 and well under the 2-to-3 range that 4K-random workloads hit (Intel's data-center SSD white paper and the broader write-amplification literature, Tier A; Microsoft's Storage Spaces Direct guidance to add only 20 to 30 percent headroom over measured peak writes, Tier A). The spec sheet is pricing a risk your workload does not run.

Put real numbers on it. Elastic's own cluster-sizing guidance notes that a 15 TB drive rated at just 0.1 DWPD already supplies roughly 1.5 TB per day of sustained write capacity, about 17 MB/s of continuous ingest (Elastic sizing documentation, Tier A and B), and that is at one-tenth of the read-intensive rating. Now size a mid-enterprise security lake against that headroom. Take an estate ingesting on the order of 5 TB per day of raw telemetry, which is a real mid-size number (the German multinational in the federated-search literature was cutting 5 to 7 TB/day before pipeline reduction). Observability and security data compresses roughly 10 to 20 times on the way to columnar storage (Elastic and Coralogix benchmarks, Tier B), so the bytes that actually land are closer to 0.25 to 0.5 TB per day, and even before you spread that across a multi-drive hot tier, even if you double it for compaction and merge rewrites, you are nowhere near the multi-terabyte daily write budget a single read-intensive drive carries. The realistic per-drive duty on a security lake's flash tier lands at a small fraction of 1 DWPD, and you would have to construct a deliberately pathological configuration to push it past the read-intensive rating, let alone toward mixed-use or write-intensive territory.
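The sizing above can be sketched as a few lines of arithmetic. This is a back-of-envelope model under stated assumptions, not a capacity planner: the function and parameter names are mine, the eight-drive tier is hypothetical, and it ignores replication, RAID parity, and filesystem overhead, any of which would raise the per-drive figure.

```python
def per_drive_dwpd(
    raw_ingest_tb_per_day: float,
    compression_ratio: float,
    rewrite_factor: float,
    drive_count: int,
    drive_capacity_tb: float,
) -> float:
    """Estimated drive-writes-per-day each drive in the flash tier absorbs.

    rewrite_factor covers compaction and merge rewrites (2.0 = doubled).
    Writes are assumed to spread evenly across drive_count drives.
    """
    landed_tb = raw_ingest_tb_per_day / compression_ratio
    written_tb = landed_tb * rewrite_factor
    return written_tb / drive_count / drive_capacity_tb


if __name__ == "__main__":
    # Worst case from the text: 5 TB/day raw, 10x compression, doubled for
    # rewrites, all of it landing on a single 15 TB drive.
    single = per_drive_dwpd(5, 10, 2.0, 1, 15)
    # Same load spread over a hypothetical eight-drive hot tier.
    spread = per_drive_dwpd(5, 10, 2.0, 8, 15)
    print(f"single drive: {single:.3f} DWPD, eight drives: {spread:.4f} DWPD")
```

Even the single-drive worst case comes out near 0.067 DWPD, roughly one-fifteenth of the read-intensive rating, and spreading the load only widens the margin.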

The field data says this is not just my arithmetic; it is what actually happens in data centers. ServeTheHome bought 1,347 used enterprise SSDs out of data-center liquidation and analyzed how much of their rated endurance the prior owners had actually consumed, and the finding was that organizations routinely and substantially over-buy endurance, retiring drives with the vast majority of their rated writes unspent (the ServeTheHome 1,347-drive study, 2024-25, Tier B, and the detailed per-drive figures are theirs and not fully public, so I am citing the conclusion they drew, not a number I can reproduce). Kingston, citing Forward Insights, puts roughly 82 percent of data centers as running successfully under 1 DWPD (Tier A to B, vendor-published but third-party-sourced, and old enough that I would treat the exact percentage as directional). Two independent reads, one from a pile of retired drives and one from a market survey, land in the same place: most deployments never spend the endurance they paid for, and a security lake is on the gentle end of even that gentle distribution.

The number nobody publishes

The missing data is the argument.

Here is the place a careful skeptic should push back, and I want to put the pushback in myself rather than let it sit unspoken. The over-buy evidence shows deployments do not consume their rated endurance, but it does not, on its own, prove that a read-intensive drive is as reliable as a write-intensive one at the same real-world load. Endurance ratings are a warranty floor, not a cliff, and it is at least possible that a read-intensive drive run near its rating fails meaningfully sooner than a write-intensive drive run at the same write volume. The honest state of the evidence is that nobody has published a clean answer. There is no Backblaze-style field study slicing SSD failure rates by rated endurance tier at matched actual usage, the way we have for spinning disks, so the reliability-equivalence claim is Tier D, an open question, and I am labeling it as one rather than asserting it.

But notice the shape of what is missing, because the shape is the argument. To justify the endurance premium for a security lake you would need two things on the table: a transparent comparison of tier prices at matched capacity, and field data showing the cheaper tier fails sooner at the loads security telemetry actually produces. Both are absent from the public record, and they are absent in a direction that happens to protect the premium. The vendors who could publish the tier-by-tier reliability data are the vendors who sell the more expensive tier, and they have not. The same vendors restructured the catalog so the price comparison is hard to run. I cannot prove the premium is unjustified, because the data that would settle it has not been produced, but a buyer being asked to pay 40-to-80 percent more for endurance is entitled to ask why the party collecting that premium has left exactly the two measurements that would validate it unpublished. The burden of proof sits with the more expensive default, and it has not been met.

This is the gap I would rather close than complain about, and it is squarely within reach of a lab. The measurement that settles the workload side is a first-party DWPD-consumed benchmark: run a real Zeek and OCSF ingest pipeline against an instrumented drive for weeks, count the bytes the host writes plus the amplification the drive adds through compaction and garbage collection, divide by capacity and days, and report the realized DWPD against the read-intensive rating with the headroom shown. That is on my lab roadmap precisely because it converts this whole argument from a Tier B and C reading of other people's numbers into a Tier A number of my own, and because if the realized DWPD comes in higher than I expect, the right thing to do is publish that and retract the strong claim. The hypothesis is falsifiable, the test is cheap, and nobody selling the more expensive endurance tier has run it.
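The benchmark's final calculation is small enough to write down before the data exists. The sketch below follows the recipe in the paragraph above (host bytes, times amplification, divided by capacity and days); the function names are mine and the example inputs are placeholders, not measurements.

```python
def realized_dwpd(
    host_tb_written: float,
    write_amplification: float,
    capacity_tb: float,
    days: float,
) -> float:
    """Drive-writes-per-day actually consumed over a measurement window.

    write_amplification is flash bytes written divided by host bytes
    written (1.0 means the drive added nothing).
    """
    if capacity_tb <= 0 or days <= 0:
        raise ValueError("capacity_tb and days must be positive")
    return host_tb_written * write_amplification / (capacity_tb * days)


def headroom(rated_dwpd: float, realized: float) -> float:
    """How many times over the rating covers the realized write rate."""
    return rated_dwpd / realized


if __name__ == "__main__":
    # Placeholder inputs only: 30 TB of host writes over 30 days on a
    # 15 TB drive, with an assumed write amplification of 1.1.
    consumed = realized_dwpd(30, 1.1, 15, 30)
    print(f"realized: {consumed:.3f} DWPD, headroom vs 1 DWPD: {headroom(1.0, consumed):.1f}x")
```

If the measured run ever reports a headroom near or below 1x against the read-intensive rating, that is the result that would retract the strong claim.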

One honest update on that roadmap, because the constraint turned out to be the measurement environment rather than the workload. My first attempt to run it hit a wall: the benchmark host is a Windows machine running the pipeline under WSL2, and WSL2 doesn't pass the physical NVMe device through. It exposes only virtual disks with no real SMART data, so smartctl can't read the drive's Data-Units-Written counter that the whole measurement depends on. The guest-level write counters give a lower bound on what reaches the drive, but they can't see anything added beneath the guest, and the amplification inside the drive is exactly the number that decides whether realized DWPD approaches the rating. One precision on the counter itself: Data-Units-Written records what the host sends to the controller, so the drive's internal amplification also needs a NAND-writes counter from the vendor's own log pages, where the drive exposes one. So the device-measured version is parked, not abandoned, waiting on a native-Linux boot or a cloud instance where smartctl -A reads true Data-Units-Written on an isolated drive. The claim stays at Tier B and C until that runs, which is the honest place for it.
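For when the device-level run does happen, the counter conversion is worth having ready. The NVMe SMART log reports Data Units Written in units of 1,000 512-byte blocks (512,000 bytes each) and counts host-to-controller writes. The sketch below turns two readings of that counter into a host-side DWPD; the function names are mine, and the readings are assumed to be copied by hand from smartctl output at the start and end of the window rather than parsed automatically.

```python
BYTES_PER_DATA_UNIT = 512 * 1000  # one NVMe "data unit" = 1,000 x 512-byte blocks


def data_units_to_bytes(units: int) -> int:
    """Convert an NVMe Data Units Written value to bytes."""
    return units * BYTES_PER_DATA_UNIT


def dwpd_from_counters(start_units: int, end_units: int, capacity_bytes: int, days: float) -> float:
    """Host-side DWPD from two Data Units Written readings.

    This excludes any amplification inside the drive, which would need
    a separate NAND-writes counter if the drive exposes one.
    """
    if end_units < start_units:
        raise ValueError("counter went backwards; readings may be from different drives")
    if capacity_bytes <= 0 or days <= 0:
        raise ValueError("capacity_bytes and days must be positive")
    written = data_units_to_bytes(end_units - start_units)
    return written / (capacity_bytes * days)


if __name__ == "__main__":
    # Placeholder readings, not measurements: a delta of 2,000,000 units
    # is 1.024 TB written; on a 15.36 TB drive over 7 days that is tiny.
    rate = dwpd_from_counters(10_000_000, 12_000_000, 15_360_000_000_000, 7)
    print(f"host-side DWPD over the window: {rate:.4f}")
```

Taking the delta between two readings, rather than the lifetime counter, keeps earlier use of the drive out of the result.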

Where I land

Read-intensive is the right default, and what would change my mind.

The architecture conclusion is direct. For the bulk storage of a security lake, the flash that holds normalized events for hunting and retention, the read-intensive endurance tier is the right default, paired with an HDD or object cold tier underneath it where the cost-per-terabyte advantage of disk is still a clean sixteen-times win and the latency does not matter. There is a genuinely write-bound role in a security platform, and it is worth naming so the argument does not overreach: the hot ingest buffer, the streaming tier where Kafka or an equivalent absorbs the firehose before it is batched and compressed downstream, and the metadata and RAID-journal volumes that do take small random writes. That tier is small, and it is the one place a mixed-use or write-intensive drive earns its premium. The mistake is letting the write intensity of that thin front edge set the endurance class for the entire lake behind it, which is the petabyte where the money actually is and where the write pattern is gentle.

I will state the confidence honestly, because the brand this argument is published under is empirical honesty and not contrarian noise. The over-specification claim, that a security lake's flash burns a small fraction of even the read-intensive rating, I hold with reasonable confidence, because the workload physics and two independent field reads all point the same way. The claim that the premium is therefore pure waste I hold less firmly, because the current size of that premium is harder to pin than it should be and the reliability-equivalence question is genuinely open. Both of those soft spots, though, are soft because of data the industry has not published, not because of data that cuts against me, and I am not willing to let an absence the seller created argue for the seller.

Two things would move me. A clean field study showing read-intensive drives failing materially sooner than write-intensive drives at the loads a security lake actually produces would justify the premium, and I would revise. So would my own DWPD-consumed benchmark coming back with realized write rates that approach the read-intensive rating, which would mean I have the workload physics wrong, and I would publish that and say so. Short of either, the read-intensive default stands, and the larger point stands with it: the storage media decision is the cost decision for a security data platform, which is exactly why getting it right matters, and why the next essay takes the same lever to the question everyone is actually asking, whether this whole estate belongs in the cloud at all.