Why vendor benchmarks are the only benchmarks.

"You agree not to access or use an Offering to analyze, test, characterize, inspect, or monitor its availability, performance, or functionality for competitive purposes."

— Splunk General Terms, Section 1.2(v)

That clause is in your contract today if you run Splunk. Variations of it are in your contract if you run Snowflake, Databricks, Oracle, or most other commercial data platforms. The implication for security architecture decisions is structural, and it explains a pattern most CISOs notice but rarely connect to its cause.

What the clause actually does

The customer can't run the comparison.

Section 1.2(v) of Splunk's General Terms — the agreement every Splunk Enterprise customer is bound by — prohibits using the product to analyze, test, characterize, inspect, or monitor its performance for competitive purposes. The Splunk Software License Agreement adds Section 3(f), which prohibits providing benchmark results to third parties without prior written consent. Both clauses bind. Section 1.2(v) is the broader of the two: it restricts the act of running the comparison, not just the act of publishing the result.

The practical effect: a Splunk customer who wants to evaluate whether ClickHouse, StarRocks, or Trino could run their security analytical workload faster — on their own data, on their own hardware, in their own environment — is contractually prohibited from doing it. Not "discouraged." Prohibited.

This isn't a quirk of Splunk's contract drafting. It's the industry norm for enterprise data platforms. Snowflake's Acceptable Use Policy contains a similar restriction. Databricks' Master Cloud Services Agreement has comparable competitive-testing language. Oracle's Database licensing terms have carried a benchmark clause since the original "DeWitt clause" of the 1980s, named after David DeWitt, the researcher whose published comparisons of early database systems prompted Oracle to bar customers from publishing benchmark results. Enterprise data contracts have been written this way for forty years.

The structural consequence

Every published benchmark is vendor-funded by design.

When customers can't legally produce comparative benchmarks against their current platform, the only benchmarks that get published are the ones produced by vendors who never signed that platform's contract. A startup competing with Splunk can publish "we run security workloads X× faster than Splunk" because the startup never agreed to Splunk's General Terms. The Splunk customer who could verify or refute the claim with their own data did agree, and is locked out.

Look at the published security-tool benchmark coverage with that lens. Every "X% faster than Splunk" benchmark you've read came from a vendor whose product appears favorably in the result. Every counter-benchmark from Splunk was produced internally and published as a customer-facing study, not as independent verification. The class of benchmark that doesn't exist — and structurally cannot exist under the current contract regime — is the customer-driven head-to-head on real workloads, published openly. The architects who would benefit most from independent measurement are the ones contractually prohibited from producing it.

This produces a recognizable failure mode in architecture decisions. CISOs evaluating SIEM modernization compare vendor-published claims against vendor-published claims. The most quantitatively rigorous number available is whichever vendor most recently funded a benchmark study. Analyst reports synthesize vendor-supplied performance characterizations because they don't have an independent source either. Decisions get made on the layer of measurement that's available, not on the layer that would actually be useful.

What independent measurement requires

Methodology open. Reference implementation gated. Reviewer named.

The instinct to fix vendor-benchmark bias is "open-source the benchmark." Push the methodology to GitHub, push the Docker Compose definitions to GitHub, push the data generators to GitHub, and let the community verify. That instinct is right about the methodology — and wrong about the reference implementation. The methodology should be open. The executable artifact, in a comparison set that includes commercial software with restrictive licensing, can't ship publicly without putting either the publisher or the downloader in contract violation.

What does work: methodology, hardware spec, query suite, and result published openly under a practitioner brand. Reference implementation shared under one-page mutual NDA with engagement prospects and qualifying reviewers. Annual external review by a named practitioner with relevant standing — security data engineer, OCSF contributor, analyst-firm researcher — under NDA, with their signoff published on the public methodology page. That structure preserves the audit trail that makes the result credible without forcing the publisher to violate the licensing terms of the platforms they're characterizing.
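
For concreteness, here is a minimal sketch of what the open half of that split can look like, written in Python. Everything in it is hypothetical: the hardware spec, the OCSF-style dataset label, the single example query, and the run_suite helper are illustrative stand-ins, not the lab's actual harness or any vendor's schema. The shape is the point: the spec and query suite are publishable, while the engine-specific execute callable is exactly the piece that stays in the NDA-gated reference implementation.

    # Illustrative only: the publishable half of a benchmark artifact.
    # All identifiers and values below are hypothetical.

    import json
    import time
    from statistics import median

    # Publishable: hardware spec and dataset parameters, released alongside results.
    METHODOLOGY = {
        "hardware": {"vcpus": 16, "ram_gb": 64, "storage": "local nvme"},
        "dataset": {"schema": "ocsf-style network activity", "events": 1_000_000_000},
        "runs_per_query": 5,
    }

    # Publishable: the query suite, expressed as engine-neutral SQL.
    QUERY_SUITE = {
        "q01_auth_failures_by_source": (
            "SELECT src_ip, count(*) AS failures FROM events "
            "WHERE action = 'failure' GROUP BY src_ip "
            "ORDER BY failures DESC LIMIT 100"
        ),
    }

    def run_suite(execute, runs=5):
        # 'execute' is a caller-supplied callable that runs one SQL string on one
        # engine. The engine-specific implementations are the part that stays
        # behind the NDA, because distributing them is what the licenses restrict.
        results = {}
        for name, sql in QUERY_SUITE.items():
            timings = []
            for _ in range(runs):
                start = time.perf_counter()
                execute(sql)
                timings.append(time.perf_counter() - start)
            results[name] = {"median_seconds": round(median(timings), 3), "runs": runs}
        return results

    if __name__ == "__main__":
        # Stand-in executor so the harness shape runs without any engine attached.
        report = run_suite(lambda sql: None, runs=METHODOLOGY["runs_per_query"])
        print(json.dumps(report, indent=2))

Structuring the harness around a caller-supplied executor is what keeps the public artifact license-clean: nothing in it ships, connects to, or characterizes any specific commercial engine.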

The closest analogies are the TPC benchmarks (TPC-H, TPC-DS) and MLPerf. Both publish their specifications openly. Neither treats a number produced by cloning a harness and pointing it at arbitrary commercial software as an official result: TPC results pass through certified auditors, and MLPerf submissions pass through peer review before publication. Security data has been waiting for the equivalent layer for a decade and hasn't gotten it; the contract structure that suppresses it is part of why.

What this means for architecture decisions

Three things to do with this.

One: read your platform's actual contract. The clause language above is published at splunk.com/legal/splunk-general-terms; Snowflake's at snowflake.com/legal; Databricks' at databricks.com/legal. The competitive-testing restriction is in plain language and takes about ten minutes to find. Knowing what you can and can't do with your own platform under your existing license is the precondition for any architecture conversation that involves comparison.

Two: discount vendor-published comparative benchmarks accordingly. "X is 10× faster than your SIEM" is a marketing claim, not a measurement, when the customer can't independently verify it. That doesn't make the underlying performance claim false. It does shift the burden of proof onto whatever independent measurement you can find, and that pool is small and shrinking as the contract regime tightens.

Three: when an independent benchmark is offered, look at how the artifact is structured. Does the methodology actually publish openly? Is the result reproducible by someone who runs the reference implementation under appropriate licensing? Is there a named external reviewer who has audited the methodology under NDA and signed off publicly? Those three questions distinguish independent measurement from vendor marketing dressed up as independent measurement. Most published security-tool benchmark coverage fails at least one of them.

The comparison your team is structurally locked out of.

Independent benchmarks against your real workload: methodology published openly by the lab, reference implementation shared under a one-page mutual NDA, annual external reviewer named publicly. The layer of measurement the current contract regime makes hard to produce in-house, produced in a way that respects every license involved.