Economics deep-dive

The cost optimization paradox.

Practitioner surveys rank cost optimization near the bottom of the priority list. Boards rank it near the top. Every security architecture decision gets reverse-engineered to budget anyway. The two rankings run on different timescales, and the collision between them shapes more decisions than either party tends to admit. It also explains why the SIEM cost crisis keeps producing the same conversations even as the tooling around it improves.

Reading time: about 16 minutes. Evidence tier: B overall, with Tier B-C for thought-leader synthesis and Tier D for the Databricks Lakewatch launch claims. The headline practitioner figures come from Joe Reis's 2026 State of Data Engineering Survey (1,101 respondents).

The number nobody expected

Five point two percent.

Joe Reis's 2026 State of Data Engineering Survey asked 1,101 practitioners to name their biggest bottleneck. Compute cost landed near the very bottom of the list at 5.2%, ahead of only tool complexity. It sat far behind legacy systems and technical debt at 25.4%, a lack of leadership direction at 21.3%, poor upstream requirements at 18.8%, hiring at 11.4%, and data quality at 10.1%. The same survey found that practitioners' time goes mostly to data modeling and transformation, at 55.4%, with ingestion and pipelines close behind at 48.1%. Compute cost barely registers as something they worry about.

Figure: self-reported biggest bottleneck in Joe Reis's 2026 State of Data Engineering Survey (n = 1,101). Compute cost sits near the bottom at 5.2%.

It's a self-selected practitioner survey rather than a controlled sample, so I read the precise percentages as directional rather than exact, but the direction is unambiguous and it matches what I see on LinkedIn and in conference coverage: cost is not where the people building these systems are spending their attention or their worry. They are worried about the foundation, the data model and the upstream requirements and the legacy debt, and cost sits near the bottom of the list because it is the layer they feel they control least.

Meanwhile, the market the practitioners' employers are sitting inside is screaming the opposite. Redpoint Ventures' 2026 market update reports public software companies down roughly 20% year to date, the worst-performing S&P 500 sector, and public SaaS multiples sit at 4.1x forward revenue, the lowest in a decade. And 54% of enterprises surveyed are actively pursuing vendor consolidation, which is to say they're not "considering" it or "evaluating" it but already doing it.

That's the paradox: the market is screaming "cut costs" while the practitioners are saying "let me build better models," and both groups think they're correct. On their own terms, they are.

Practitioner view

Why the 5.2% isn't irrational.

The ranking is contextual rather than careless, because each bottleneck the practitioners put above cost reflects a rational priority given how data and security teams have spent the past eighteen months.

Data modeling is where the work actually goes. The survey has modeling and transformation as the top answer for where practitioners' time goes, at 55.4%, with nearly 90% reporting modeling pain points and only 5.4% working from a canonical or semantic model. That combination is the statistical signature of an unsolved foundational problem. The AI rollouts of 2024 and 2025 taught the cohort this the hard way: organizations that rushed into AI-native pipelines discovered the underlying data models were the binding constraint, because the semantic layer was missing, field definitions were inconsistent across systems, and the models hallucinated when the foundation was rotten. Modeling isn't glamorous and it doesn't appear in vendor demos, but it's the thing that determines whether anything downstream works at all.

AI is now the daily tool, not the project. The survey has 82% of practitioners using AI tools daily or more, with 54% reaching for them multiple times a day, and the primary value they report is writing code, the SQL and Python. But ML and AI account for only 10.8% of where their time goes, which is the tell: AI is helping them write the queries faster, not fixing the data model underneath. For a data or security engineer in 2026, fluency with these tools is closer to career survival than to enthusiasm, so it shows up as near-universal adoption without displacing the foundational work.

The bottlenecks they rank highest are organizational, not technical. Legacy systems and technical debt at 25.4%, a lack of leadership direction at 21.3%, and poor upstream requirements at 18.8% are the top three, and none of them is a tooling problem you can buy your way out of. These are the things that block the foundation, and they all sit above every line item a vendor could quote you, which is its own quiet comment on where the real friction lives.

Compute cost ranks at 5.2% because practitioners don't experience cost directly. The budget comes from someone else, the contract renewal is someone else's problem, and the board presentation about SaaS spending is the CFO's job rather than the data engineer's. So when a practitioner argues about cost, they're usually arguing about a technical proxy ("this query is inefficient," "this storage tier is wrong") rather than about the line item that the finance team sees. The cost is real, but it reaches them as a constraint rather than as a number they own.

Saying that practitioners "don't care about cost" gets the situation backwards, because they're optimizing for the foundation that everything else depends on, and cost in their lived experience is something that happens to them, a constraint imposed from elsewhere usually after the architecture is half-built. That's a rational basis for ranking it near the bottom.

Market view

Why the market forces cost anyway.

The practitioners' priorities do not exist in isolation. They exist inside organizations that exist inside a software market that, in 2026, looks structurally different than it did in 2021 or 2022.

The SaaS repricing. Public software companies have lost roughly 20% of their value year to date, which looks less like a correction than a repricing of the entire sector. When software multiples drop to 4.1x forward revenue (the lowest in a decade), most CIOs eventually receive a memo from finance that opens with "re-evaluate." This is a software-spend-broadly dynamic rather than anything specific to Splunk or to SIEM, and it captures every line item with a subscription attached.

The consolidation mandate. 54% of enterprises are actively consolidating vendors. That means contract renegotiations, platform migrations, and the elimination of overlapping tools. If you're running Splunk plus a data lake plus a SIEM plus a detection platform plus an XDR plus a DLP product, someone in finance is calculating the overlap. The conversation may not have reached your team yet, but the calculation is happening.

Databricks' March 2026 launch of Lakewatch (an agentic SIEM built on Delta Lake and Unity Catalog) accelerates this calculus. The launch messaging leads with an 80% TCO reduction claim relative to incumbent SIEMs. I want to be careful here: this is Tier D evidence (vendor launch claims, no independent production validation), and the 80% figure may not hold up across the workloads it gets applied to. What matters here isn't whether the number is right but that the launch is pitched at CISOs and CFOs rather than at the practitioners building the systems, because Databricks knows that the practitioner put cost near the bottom of their list while the executive put it at the top.

The hidden cost of fragmentation. Benjamin Rogojan documented a pattern in early 2026: "dashboard graveyards," where self-service BI created dozens of unmaintained dashboards, each built on semi-clean data from different sources. The data team moved fast, but nobody cleaned up. Every abandoned dashboard is a cost that doesn't appear in a line item but consumes compute, storage, and (most expensively) analyst attention when someone has to answer the "which number is right" question for a board deck. I treat this as Tier C evidence (practitioner observation, multiple LinkedIn captures), but the pattern matches what I see in the SOC analytics environments I work in. The security version is dashboard graveyards plus detection graveyards plus integration graveyards, all running quietly and all burning budget that nobody is auditing.
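The audit that nobody runs can start very small. The sketch below assumes you can export an inventory of dashboards, detections, and integrations with a last-used date for each; the field names, asset names, and the 90-day threshold are all hypothetical, not taken from any particular BI or SIEM product.

```python
from datetime import date, timedelta


def stale_assets(inventory, today, max_idle_days=90):
    """Return names of assets with no recorded use in the last `max_idle_days`."""
    cutoff = today - timedelta(days=max_idle_days)
    return [item["name"] for item in inventory if item["last_used"] < cutoff]


# Hypothetical inventory; names, kinds, and dates are made up for illustration.
inventory = [
    {"name": "exec-risk-overview", "kind": "dashboard", "last_used": date(2026, 3, 2)},
    {"name": "legacy-vpn-logons", "kind": "dashboard", "last_used": date(2025, 6, 14)},
    {"name": "dns-tunnel-rule-v1", "kind": "detection", "last_used": date(2025, 9, 30)},
    {"name": "old-proxy-forwarder", "kind": "integration", "last_used": date(2026, 2, 20)},
]

# A fixed "today" keeps the example deterministic.
graveyard = stale_assets(inventory, today=date(2026, 4, 1))
print(graveyard)
```

The list it prints is a starting point for a conversation with the owners rather than a deletion queue, since "last used" is only a proxy for whether anyone still depends on the asset.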

The market doesn't ask whether the practitioner thinks cost matters, because it produces a renegotiation cycle, a consolidation wave, and a fragmentation tax regardless, and someone has to absorb them, which is usually the architecture that was designed without those forces in mind.

Where the paradox resolves

Different timescales, different organizational layers.

The paradox resolves when you stop reading the two views as competing and start reading them as operating on different timescales and different layers of the organization. They're not arguing about the same problem at the same moment.

Practitioners optimize for capability on a Q1-to-Q2 horizon, at the individual and team level. Better data models produce better AI outputs, which produce better detection outcomes. AI-native pipelines produce career relevance, which produces job security. Governance produces compliance, which produces "we didn't get fined this quarter." These are real outcomes, on the timescale the practitioner controls.

Organizations optimize for cost on a Q3-to-Q4 horizon, at the leadership and finance layer. Vendor consolidation produces contract savings, which appear in board-level metrics. Platform elimination produces operational simplification, which produces headcount efficiency. SaaS repricing produces renegotiation leverage, which produces margin improvement. These are also real outcomes, on the timescale finance controls.

The collision happens at the seam. Finance's consolidation mandate eliminates the platform the practitioner just finished modeling on. The 54% consolidation wave forces a migration that nobody on the data team prioritized. The architecture that was perfectly sized for the capability horizon turns out to be expensive enough that the cost horizon eats it. And the practitioner, who legitimately ranked cost near the bottom, ends up rebuilding the work, sometimes with the same tools under a different vendor's wrapper, sometimes by porting to an open format that the next consolidation cycle can't easily eat.

Neither side is wrong about its priorities, but both are wrong about the timing, because the capability horizon and the cost horizon run on different clocks and the architecture has to survive both of them.

Security architecture implications

What this means for the security data stack.

If you're building or evolving a security data platform in 2026, the paradox has three concrete implications. None of them require taking sides between the practitioner ranking and the executive ranking. They require designing for both clocks at once.

1. Architecture must serve two masters.

Design for capability and cost from day one. Open formats and open engines do this naturally: Iceberg or Delta at the table layer, Parquet on disk, OCSF at the schema layer, Sigma at the detection layer, with ClickHouse, DuckDB, or Spark as engines you can swap. That stack gives the practitioner the modeling flexibility and analyst-accessibility they actually want, and it gives finance the portability they will eventually demand when the consolidation memo arrives. One caveat I'd add from my own lab, because it's the kind of thing that doesn't surface until you check for it: those engines are interchangeable for the SQL and roughly for the latency, but not unconditionally for the answers. I had one of them return a filtered count tens of rows short of the others over the identical Parquet with no error raised, so the swap is only as safe as the cross-engine answer-equality check you run behind it (that case is its own essay).
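A cross-engine answer-equality check can be sketched in a few lines. This is a minimal illustration rather than the lab harness itself: it assumes each engine's result has already been fetched as a list of row tuples, the engine labels and rows are invented, and it compares rows as multisets so that ordering differences do not count as mismatches.

```python
from collections import Counter


def compare_engine_results(results):
    """Compare row multisets returned by several engines for one query.

    `results` maps an engine label to the list of row tuples it returned.
    Returns {} when every engine agrees with the first one listed, else a
    per-engine count of rows that are missing or extra relative to it.
    """
    labels = list(results)
    baseline = Counter(results[labels[0]])
    report = {}
    for label in labels[1:]:
        rows = Counter(results[label])
        missing = baseline - rows  # rows the baseline has that this engine lacks
        extra = rows - baseline    # rows this engine has that the baseline lacks
        if missing or extra:
            report[label] = {
                "missing": sum(missing.values()),
                "extra": sum(extra.values()),
            }
    return report


# Hypothetical result sets for one filtered query, keyed by made-up labels.
rows = [("10.0.0.%d" % i, "logon_failed") for i in range(100)]
example = {
    "engine_a": rows,
    "engine_b": list(reversed(rows)),  # same rows, different order: still equal
    "engine_c": rows[:-12],            # silently short by 12 rows
}
mismatches = compare_engine_results(example)
print(mismatches)
```

Exact tuple comparison is deliberately strict. Floating-point aggregates or timestamp precision can differ legitimately between engines, so a real check would probably normalize those columns before comparing.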

Vendor-owned schemas (CIM, ECS) and vendor-locked storage (Splunk Cloud, Sentinel-managed storage) optimize for one master only, usually the master who signed the original contract. When the other master arrives with a budget cut, the architecture is stuck. The cost of unwinding it tends to be larger than the cost of building it the right way the first time. I cover the migration math in detail in the hidden cost of SIEM migration, which lays out why "we'll just migrate later" rarely survives contact with the pipeline-rewrite bill.

2. Cost savings are a side effect, not the goal.

Don't lead with "we'll save 50 to 80%," because practitioners put compute cost at 5.2% and won't be moved by it. Lead instead with better data modeling, better detection, better governance, and a better analyst experience. The cost savings follow naturally from the architecture choices (open formats, commodity storage, decoupled compute, no per-GB SIEM ingestion tax), and when the CFO asks about cost six months later, the numbers will be there. But the architecture was built for capability, not for cost, which means it actually works. The "cost-optimized" architectures that cut corners on data quality discover three quarters later that the detection engineer can't write a working rule against the data shape the cost-optimization decision produced.

This is also the honest framing of why the SIEM cost crisis is genuinely a crisis without it being a vendor-bashing story. Splunk, QRadar, Sentinel, and the rest priced themselves for an era when security data volume was measured in gigabytes per day. The volumes are now measured in terabytes per day, and the pricing models did not bend gracefully, which is a problem of pricing-model drag rather than a problem of bad vendors. The fix isn't to call vendors greedy but to design an architecture where the ingestion volume and the analyst-facing query volume don't have to share a pricing surface.
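The "shared pricing surface" point is easier to see as arithmetic. The sketch below compares how two billing shapes respond when ingest grows tenfold while analyst query load stays flat. Every rate in it is a placeholder chosen for illustration, not a quote from any vendor, and the decoupled model is simplified to steady-state retained storage plus a flat compute figure.

```python
def ingest_priced_annual(daily_gb, price_per_gb):
    """Annual bill when every ingested GB is billed at one rate."""
    return daily_gb * 365 * price_per_gb


def decoupled_annual(daily_gb, retention_days, storage_per_gb_month, compute_annual):
    """Annual bill when storage and query compute are billed separately.

    Storage is approximated as the steady-state retained volume; compute is a
    flat figure driven by analyst query load, not by ingest volume.
    """
    retained_gb = daily_gb * retention_days
    return retained_gb * storage_per_gb_month * 12 + compute_annual


# Placeholder rates, for illustration only.
PRICE_PER_GB_INGESTED = 1.00
STORAGE_PER_GB_MONTH = 0.02
QUERY_COMPUTE_ANNUAL = 50_000
RETENTION_DAYS = 90

before_gb, after_gb = 100, 1_000  # ingest grows from 100 GB/day to 1 TB/day

coupled_growth = (
    ingest_priced_annual(after_gb, PRICE_PER_GB_INGESTED)
    / ingest_priced_annual(before_gb, PRICE_PER_GB_INGESTED)
)
decoupled_growth = (
    decoupled_annual(after_gb, RETENTION_DAYS, STORAGE_PER_GB_MONTH, QUERY_COMPUTE_ANNUAL)
    / decoupled_annual(before_gb, RETENTION_DAYS, STORAGE_PER_GB_MONTH, QUERY_COMPUTE_ANNUAL)
)
print(f"ingest-priced bill grows {coupled_growth:.1f}x")
print(f"decoupled bill grows {decoupled_growth:.1f}x")
```

The absolute figures mean nothing with made-up rates. What the sketch shows is the shape: on a single ingest-priced surface the bill scales one-for-one with volume, while on a decoupled surface only the storage term does.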

3. Build for the consolidation wave.

The 54% consolidation figure isn't a one-time event but a structural shift that will repeat, so design your architecture assuming at least one of your current vendors will be eliminated, acquired, or repriced within 18 months. The list below is not a checklist for ideological openness; it's a checklist for surviving a consolidation cycle without rebuilding from scratch:

Schema portability. OCSF, not CIM. If the schema is owned by a single vendor, the consolidation cycle can take it from you.
Storage independence. Iceberg on S3 (or Delta on S3), not vendor-managed storage that you can't read without the vendor's compute. See Iceberg vs Delta Lake for security data for the table-format trade-offs.
Query engine flexibility. ClickHouse or DuckDB or Trino or Spark, but not locked to one. If swapping the engine is a Python-library change rather than a data-migration change, you've designed for the consolidation horizon. I tested that swap in the lab: DuckDB and embedded ClickHouse returned identical answers to the same OCSF queries, so the SQL and the results port, and the latency cost of moving between them was modest and tracked the query shape rather than amounting to a rewrite.
Detection logic portability. Sigma rules, not SPL. If your detection logic is written in a vendor's proprietary query language, the consolidation cycle takes your detection content too.
Pipeline portability. The ingestion layer is where vendor lock-in tends to be most invisible. I cover the specifics in pipeline lock-in: the short version is that pipeline tooling deserves the same portability scrutiny as storage and compute.
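What "a Python-library change rather than a data-migration change" can look like in code: detection-side logic depends on a small interface, and each engine gets a thin adapter behind it. This is a sketch under stated assumptions. `QueryEngine`, `SqliteEngine`, and the `auth_events` table are hypothetical names, and the standard-library `sqlite3` module stands in for a real analytical engine only so the example runs without extra installs.

```python
import sqlite3
from typing import Protocol, Sequence


class QueryEngine(Protocol):
    """The only surface the detection-side code is allowed to depend on."""

    def query(self, sql: str) -> Sequence[tuple]: ...


class SqliteEngine:
    """Stand-in adapter; an adapter for another engine would expose the same method."""

    def __init__(self, conn: sqlite3.Connection) -> None:
        self._conn = conn

    def query(self, sql: str) -> Sequence[tuple]:
        return self._conn.execute(sql).fetchall()


def failed_logons_by_user(engine: QueryEngine) -> Sequence[tuple]:
    # No engine-specific imports or calls here, so swapping engines means
    # swapping the adapter that gets passed in.
    return engine.query(
        "SELECT user_name, COUNT(*) FROM auth_events "
        "WHERE status = 'Failure' GROUP BY user_name ORDER BY user_name"
    )


# Hypothetical, heavily simplified event table for the demonstration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE auth_events (user_name TEXT, status TEXT)")
conn.executemany(
    "INSERT INTO auth_events VALUES (?, ?)",
    [("alice", "Failure"), ("alice", "Failure"), ("bob", "Success"), ("bob", "Failure")],
)

result = failed_logons_by_user(SqliteEngine(conn))
print(result)
```

The interface only helps if the SQL stays within a dialect subset the candidate engines share, which is another reason to pair it with an answer-equality check across engines.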

None of these moves require believing that any specific vendor will fail or be acquired. They require believing that consolidation cycles happen, that they affect roughly half of enterprises in a given window, and that the cost of preparing for them is smaller than the cost of being surprised by them. The 54% figure is the operative one. Build to survive being on the wrong side of it.

The vendor messaging mirror

Why every launch deck leads with cost.

Once you see the timescale split, the vendor messaging patterns make more sense, because nearly every SIEM-adjacent launch in the past eighteen months has led with cost. Databricks Lakewatch leads with 80% TCO reduction, Cribl leads with "reduce SIEM ingest costs by 50%," and various data-lakehouse pitches lead with "10x cheaper than your incumbent SIEM." The headlines are calibrated to the executive horizon rather than the practitioner horizon, because the executive is the one signing the renewal, and the executive is the one whose ranking puts cost first.

That doesn't mean the cost claims are wrong. Some of them are well-supported, particularly when the comparison is apples-to-apples (ingest-tier-to-ingest-tier, query-tier-to-query-tier, retention-tier-to-retention-tier). Many of them are not, particularly when the comparison is "list price of vendor A" against "discounted enterprise agreement of vendor B," or when the calculation excludes the migration cost, the retraining cost, and the detection-content port. I treat any vendor-published TCO figure as Tier D evidence until I see a methodology I can audit. Lakewatch's 80% claim, Cribl's 50% claim, and the various lakehouse vendors' multiples all fall in that bucket today.

The useful question to ask a vendor with a cost-led pitch isn't "is 80% real?" but rather "what does the 80% become when you add the migration project, the parallel-running window, the detection-content port, the analyst retraining, and the first contract renewal?" If the answer is still a real improvement, the architecture decision may be worth pursuing on cost grounds. If the answer is "still cheaper, but only by 10 to 20% over three years," you're back to choosing on capability, governance, and portability, which is where the practitioner ranking pointed in the first place.
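That question can be turned into a back-of-envelope calculation. The sketch below takes a claimed run-rate reduction and nets it against one-time costs and a renewal uplift over a fixed horizon. The function name, the cost categories, and every dollar figure are hypothetical placeholders. The model is also deliberately crude: it applies the claimed price in year one and the uplifted price in every later year.

```python
def net_savings_pct(incumbent_annual, claimed_reduction, one_time_costs,
                    years=3, renewal_uplift=0.0):
    """Net savings over `years`, as a percentage of staying on the incumbent.

    claimed_reduction: vendor's headline run-rate reduction, e.g. 0.80.
    one_time_costs: mapping of cost name -> amount (migration, retraining, ...).
    renewal_uplift: fractional price increase applied from year two onward.
    """
    incumbent_total = incumbent_annual * years
    new_annual = incumbent_annual * (1 - claimed_reduction)
    run_cost = new_annual + new_annual * (1 + renewal_uplift) * (years - 1)
    new_total = run_cost + sum(one_time_costs.values())
    return (1 - new_total / incumbent_total) * 100


# Placeholder figures, for illustration only.
one_time = {
    "migration_project": 600_000,
    "parallel_running": 300_000,
    "detection_content_port": 250_000,
    "analyst_retraining": 150_000,
}

headline = net_savings_pct(1_000_000, 0.80, {}, years=3)
adjusted = net_savings_pct(1_000_000, 0.80, one_time, years=3, renewal_uplift=0.25)
print(f"headline: {headline:.0f}%  adjusted: {adjusted:.0f}%")
```

With these invented inputs the headline 80% shrinks to roughly a third over three years. Different inputs will land somewhere else, and the useful part is asking the vendor to fill in each line rather than trusting the specific result.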

The practitioner's hedge

One thing to add to the priority list.

If you're a data engineer or security architect reading this and thinking "I don't care about cost, I care about getting the modeling and the governance right": you're not wrong. Those are the things that determine whether the system actually works. The practitioners who put cost near the bottom are correctly identifying that the system has to function before it can be cheap.

But add one item to the list, which is portability, and not because it saves money today or makes the dashboards prettier, but because when the consolidation mandate arrives (and the 54% number says it's already arriving for half the room) you want your work to survive the platform change. Portability is the thing that lets the capability investment compound across consolidation cycles instead of being rebuilt every eighteen months.

The portability moves are mostly free at design time and expensive at retrofit time. Choosing OCSF over CIM costs roughly the same engineering hours either way at greenfield, and saves a schema-rewrite project later. Writing detection logic in Sigma costs roughly the same as writing it in SPL at greenfield, and saves a detection-content port later. Choosing Iceberg or Delta over a proprietary table format costs roughly the same at greenfield, and saves a petabyte-reprocessing bill later. The "build it twice" cost only shows up if portability wasn't a first-class concern at design time, which is the default outcome when cost is ranked near the bottom.

The smart move isn't to flip the ranking but to add portability quietly, treating it as a property of the capability-first design rather than a separate cost-first design, and letting the eventual consolidation-cycle savings happen as a side effect. The practitioners who do this are right about their priorities and prepared for someone else's priorities at the same time.

The architect's hedge

One thing to add to the board narrative.

The mirror move applies to the executive layer. If you're the CISO or the security architect carrying the cost narrative to the board, the temptation is to lead with the 80% TCO reduction claim that Databricks or another vendor handed you, and you should resist it, because the vendor numbers are Tier D evidence, the comparison is rarely apples-to-apples, and the credibility hit when the actual savings come in at 30% is larger than the credibility win when they come in at 80%.

Lead with the capability and governance story (better detection coverage, faster analyst workflows, compliance defensibility) and treat the cost savings as a documented secondary outcome. Tell the board "we expect 30 to 50% cost reduction over three years based on our pilot data, with the larger savings dependent on the migration project completing on schedule." That's defensible. The board doesn't need 80% to approve the architecture. They need the capability story to make sense and the cost story to be honest.

The pattern across both audiences is the same, which is that you lead with what the audience cares about most while including the other side as a hedge, so practitioners get capability with portability layered in and executives get cost with capability and governance layered in. The architecture that survives both clocks is the one that doesn't pretend either clock is the only one running.

Conclusion

Cost optimization is someone else's priority. Plan accordingly.

The practitioners who put compute cost at 5.2% are not careless about money. They're accurately reporting that cost is not the layer of the system they control. Their priorities (data modeling, AI-native pipelines, governance) are the priorities that determine whether the system works in the first place. Without those, no amount of cost optimization produces a defensible security program. The capability layer is the binding constraint, and they correctly identified it.

The market doesn't care about that ranking. The SaaS repricing, the consolidation mandate, and the fragmentation tax produce a cost optimization wave whether the practitioner asked for one or not. 54% of enterprises are pursuing it actively today, and the cycle repeats. The architecture either survives the wave or gets rebuilt under it.

The resolution isn't to flip the practitioner ranking. The resolution is to design capability-first architectures that happen to be portable (open table formats, open schemas, open detection languages, swappable query engines) so the consolidation cycle takes a contract instead of an architecture. Cost optimization, on that design, is a side effect of having built the right thing, which is the version the practitioner ranking implied all along, even if the survey didn't ask the question that way.

The honest framing of the SIEM cost crisis is that it's a pricing-model-drag problem layered onto a consolidation cycle, not a story about bad vendors or careless practitioners. The crisis resolves when the architecture stops asking the ingestion layer and the analyst layer to share a pricing surface, and starts treating cost optimization as a property of portability rather than a property of vendor selection. That's a multi-year project, and it survives whichever way the next consolidation wave breaks.