Program POV
Use data to challenge what we have.
Most security data programs trust their vendors, their schemas, and their own past assumptions. This practice doesn't. The data platform has to earn trust empirically — source by source, claim by claim, query by query — and do it continuously, not at procurement time.
The three pillars
Three commitments. Each one is testable.
The pillars aren't aspirational language. They're structural commitments — each defines a property the data platform has to demonstrate, with evidence the program produces and updates over time.
Trustworthy
The data is instrumented, validated, and lineage-traceable. Completeness, freshness, and schema conformance are measured per source. Failures surface before analysts notice them in their queries.
What this catches: the source feed that has been silently dropping events for three days while no dashboard alerts on it. The schema drift that quietly makes half a detection rule's references resolve to nothing. The asset database that is authoritative for ownership in 80% of cases — except for the 20% where a more current source disagrees and no one has reconciled the two.
The evidence: per-source data health reports — completeness, freshness, schema conformance, OCSF conformance (OCSF is the open standard for security data schemas; conformance means a given source's events match the shared shape rather than the vendor's bespoke format) — running continuously. The artifact isn't "the vendor told us their pipeline is reliable." It's "we measured it over the last 30 days, and here is what we saw."
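What one of those checks can look like, reduced to a sketch: the required-field contract, freshness SLO, and volume baseline below are illustrative assumptions, not the program's actual implementation.

```python
from datetime import datetime, timedelta, timezone

# Illustrative per-source health check: completeness, freshness, and schema
# conformance computed over one batch of normalized events. Field names,
# thresholds, and the required-field list are assumptions for the sketch.
REQUIRED_FIELDS = {"time", "class_uid", "src_endpoint", "metadata"}  # stand-in for an OCSF-style contract
FRESHNESS_SLO = timedelta(minutes=15)
EXPECTED_EVENTS_PER_BATCH = 120_000  # baseline learned from history, hard-coded here

def source_health(events: list[dict], now: datetime) -> dict:
    conforming = [e for e in events if REQUIRED_FIELDS <= e.keys()]
    newest = max((e["time"] for e in conforming), default=None)
    return {
        "events_seen": len(events),
        "completeness": len(events) / EXPECTED_EVENTS_PER_BATCH,      # vs. historical baseline
        "schema_conformance": len(conforming) / max(len(events), 1),  # share matching the contract
        "fresh": newest is not None and (now - newest) <= FRESHNESS_SLO,
    }

if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    batch = [{"time": now - timedelta(minutes=3), "class_uid": 4001,
              "src_endpoint": {"ip": "10.0.0.5"}, "metadata": {"product": "zeek"}}]
    print(source_health(batch, now))
```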
Well-connected
Entities — assets, users, applications, configurations — resolve cleanly across sources. The data catalog knows which source is authoritative for which attribute, with confidence and freshness scoring. Joins do what their JOIN clauses claim they do.
The failure mode: an analyst writes a hunt joining endpoint telemetry to the asset database and gets wrong results because the asset identifier field means three different things to three different tools. Or the CMDB says 50,000 assets, the EDR sees 47,000, the vulnerability scanner sees 52,000 — and the delta is quietly papered over rather than named.
The evidence: cross-tool gap analysis. When the sources disagree, the delta is documented, the authoritative source per attribute is determined with confidence and freshness scoring, and the coverage holes are explicit. Without that cross-tool view, "well-connected" is an assertion, not a property.
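A sketch of the simplest version of that delta, as set arithmetic over one normalized asset identifier; the inventories, hostnames, and normalization rule are assumptions for illustration.

```python
# Illustrative cross-tool gap analysis: compare the asset inventories the CMDB,
# the EDR, and the vulnerability scanner each report, after normalizing to one
# identifier. The normalization and the sample hostnames are assumptions.
def normalize(hostname: str) -> str:
    return hostname.strip().lower().split(".")[0]   # crude: drop domain suffixes

def gap_report(cmdb: set[str], edr: set[str], scanner: set[str]) -> dict:
    everything = cmdb | edr | scanner
    return {
        "total_distinct_assets": len(everything),
        "unmanaged_endpoints": sorted(everything - edr),      # no EDR coverage
        "unscanned_assets": sorted(everything - scanner),     # no vuln visibility
        "unknown_to_cmdb": sorted((edr | scanner) - cmdb),    # seen on the wire, not on the books
    }

if __name__ == "__main__":
    cmdb = {normalize(h) for h in ["web-01.corp.example", "db-02.corp.example"]}
    edr = {normalize(h) for h in ["WEB-01", "laptop-443"]}
    scanner = {normalize(h) for h in ["web-01", "db-02", "laptop-443", "printer-9"]}
    print(gap_report(cmdb, edr, scanner))
```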
Performant
The data platform has to meet two distinct latency regimes on the same data: sub-second detection and response (operational), and petabyte-scale historical threat hunting (analytical). The single-engine assumption — one tool, one platform, accept the penalty on one side or the other — is the compromise the industry has been unbundling since 2022.
Most performance disappointment in security data isn't the vendor missing the spec. It's the spec being measured on a workload that doesn't match production. The brochure benchmark uses synthetic queries and idealized data shapes; the actual workload is dirty, skewed, and 40% of it is the same handful of queries running on different time ranges.
The evidence: reproducible benchmarks against the actual workload, not against the brochure. On a 10-million-event Zeek workload, identical hardware, ClickHouse ran 145× faster than the dominant schema-on-read SIEM with 8.2× compression — methodology in the lab; reference implementation under NDA.
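The harness matters less than the discipline: the workload's own queries, fixed data, repeated runs, medians reported. A minimal sketch, with the query runner left abstract because engines and connection details vary by environment.

```python
import statistics
import time
from typing import Callable, Sequence

# Illustrative benchmark harness: replay the workload's own queries against an
# engine and report medians, not best cases. The runner callable is abstract;
# wire it to ClickHouse, Trino, or the incumbent SIEM's API in practice.
def benchmark(run_query: Callable[[str], object],
              queries: Sequence[str], repeats: int = 5) -> dict[str, float]:
    results = {}
    for q in queries:
        timings = []
        for _ in range(repeats):
            start = time.perf_counter()
            run_query(q)                          # the engine does the work here
            timings.append(time.perf_counter() - start)
        results[q] = statistics.median(timings)   # median resists warm-cache flukes
    return results

if __name__ == "__main__":
    # Stand-in runner so the sketch executes; replace with a real client call.
    fake_runner = lambda q: sum(range(10_000))
    workload = ["top talkers, last 24h", "dns to rare domains, last 30d"]
    print(benchmark(fake_runner, workload))
```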
The method
Empirical skepticism. Evidence tiers. Update on contact with new data.
Every claim I put in a recommendation gets graded by what kind of evidence it rests on. Four tiers:
- Tier A — production-deployment evidence. Someone running the system at scale and reporting the metrics. Netflix on Iceberg, Insider's S3 cost reduction, the Zeek analytical-workload benchmark itself.
- Tier B — peer-reviewed research, analyst reports from Gartner / Forrester / IDC, or expert consensus across multiple independent sources.
- Tier C — expert opinion or framework inclusion. Useful but partial; usually a triangulation ingredient, rarely the load-bearing source.
- Tier D — vendor marketing. Excluded by default unless corroborated by Tier A or B evidence.
The update rule: positions evolve when new evidence overturns them. That update history is kept publicly on /research as a "things I changed my mind on" log — twenty-two contradictions documented and counting. The willingness to be wrong on the record is the credibility move; programs that never update their priors are programs that aren't actually testing them.
The full benchmark methodology, results, and reproducibility statement live on the lab page. Anyone can re-run it on their own workload.
If a future benchmark on a different workload reverses the result, I update. Vendor neutrality is a consequence of empirical skepticism, not the goal.
What I argue against
Four positions I reject.
Vendor benchmarks taken at face value.
Vendors publish benchmarks against the workloads where their architecture is strongest. The numbers are real; the workload is selected. The fix is independent benchmarks against the workload that actually matters to your environment, with methodology in the open so the result is contestable.
Schema-on-read everywhere.
The decade where schema-on-read inverted-index architectures (where the system parses raw events at query time rather than at ingest) were the right answer has ended for most security workloads. Data structured at ingest and stored in columnar form wins by orders of magnitude on most security queries — and the evidence has been mounting since 2022. Some workloads still warrant schema-on-read; most don't.
"Trust the platform."
The platform is a contract: you give it data, it gives you query and alert. That contract is testable. Most security data programs never test it — they take the vendor's word for completeness, freshness, schema, and performance, and they pay for that trust in analyst time and missed detections. The cost is real; it just doesn't show up as a line item on the SIEM invoice.
Staff-aug consulting.
A senior practitioner billed by the hour to extend a vendor's services arm isn't a fair-broker engagement. The work product is whatever the vendor wanted shipped; the consultant's incentive is to keep the engagement running, not to ship the most evidence-driven recommendation. Fixed-price, scoped, and evidence-gated is the alternative I run on.
Foundation + three projects
The validation program is the gate. The three projects are optional, mix-and-match.
The foundation: data health, broadly defined.
A two-stage sequence that gates everything downstream. You don't ship detection content, hunting workflows, or machine-learning models on top of a foundation you haven't validated.
Stage one is per-source data health: continuous reports establishing that what's in the lakehouse matches what each source promised — completeness, freshness, schema conformance, OCSF conformance.
Stage two is cross-tool gap analysis: when the asset database, the EDR, and the vulnerability scanner disagree about what's on the network, the delta is named, the authoritative source per attribute is determined, and the coverage holes are explicit. The cross-tool view is where assurance lives. A program that produces clean per-source reports but never reconciles across tools cannot defend the claim that its connected dataset is complete.
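One way to make "authoritative source per attribute" mechanical rather than tribal knowledge, sketched under assumed confidence weights and a 30-day freshness half-life; the sources and values are placeholders.

```python
import math
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

# Illustrative attribute resolution: pick the authoritative value for one
# attribute (e.g. asset owner) from competing sources by combining a static
# confidence weight with a freshness decay. Weights and half-life are assumptions.
@dataclass
class Claim:
    source: str
    value: str
    confidence: float          # how much this source is trusted for this attribute
    observed_at: datetime

def score(claim: Claim, now: datetime, half_life: timedelta) -> float:
    age = (now - claim.observed_at) / half_life
    return claim.confidence * math.pow(0.5, age)   # halve trust every half-life

def resolve(claims: list[Claim], now: datetime,
            half_life: timedelta = timedelta(days=30)) -> Claim:
    return max(claims, key=lambda c: score(c, now, half_life))

if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    claims = [
        Claim("cmdb", "payments-team", 0.9, now - timedelta(days=120)),
        Claim("edr",  "sre-team",      0.7, now - timedelta(days=2)),
    ]
    winner = resolve(claims, now)
    print(winner.source, winner.value)
```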
Read the foundation deep-dive →
Project 1
MOAR — Modular Open Architecture data infrastructure.
The data platform itself: an Iceberg-based lakehouse, a vendor-neutral catalog (Polaris, Nessie, or Hive Metastore), best-of-breed query engines selected against the workload (ClickHouse, Trino, Dremio, or StarRocks), vendor-neutral routing (Tenzir, Vector, or Cribl), and the assembly that lets the same data serve sub-second detection and petabyte-scale hunting.
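A small sketch of the catalog-owns-the-metadata idea, assuming the pyiceberg client against a REST-compatible catalog such as Polaris; the endpoint, warehouse, table name, and fields are placeholders, not a reference deployment.

```python
# Illustrative sketch, assuming pyiceberg and a REST-compatible Iceberg catalog
# (Polaris or similar). Every identifier below is a placeholder.
from pyiceberg.catalog import load_catalog

catalog = load_catalog(
    "lakehouse",
    **{
        "type": "rest",
        "uri": "https://catalog.example.internal",   # placeholder endpoint
        "warehouse": "s3://security-lakehouse",      # placeholder warehouse
    },
)

# The same Iceberg table the detection engine queries is readable here for
# ad-hoc hunting, because the catalog, not any one engine, owns the metadata.
table = catalog.load_table("security.network_activity")
events = table.scan(
    row_filter="severity_id >= 4",
    selected_fields=("time", "src_endpoint", "dst_endpoint"),
).to_arrow()
print(events.num_rows)
```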
When to pick it: SIEM cost spiral, retention pain, multi-region or regulated query needs, or any infrastructure decision where vendor neutrality and open formats matter more than path-of-least-resistance.
Read the MOAR project →
Project 2
DetectFlow — detection at thousands-of-rules scale, without the operational debt.
Detection-as-code, CI/CD pipelines for detection content, telemetry feedback loops, automated regression testing. Each rule is a versioned, tested, deployable artifact whose performance and false-positive rate are continuously measured. The differentiator is scale: most detection programs cap out at hundreds of rules because of operational debt; DetectFlow is the discipline that lets a SOC carry thousands without collapsing under maintenance load.
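In miniature, "a versioned, tested, deployable artifact" can be as plain as a rule function pinned by recorded samples; the rule logic and samples below are illustrative, and a real pipeline runs this in CI against a much larger corpus.

```python
# Illustrative detection-as-code regression test: the rule is code, and recorded
# true-positive / known-benign samples pin its behavior so edits can't silently
# regress it. The rule logic and the sample events are assumptions.
RULE_ID = "suspicious_office_parent_powershell"  # hypothetical rule name

def detect(event: dict) -> bool:
    return (
        event.get("process_name", "").lower() == "powershell.exe"
        and event.get("parent_process", "").lower() in {"winword.exe", "excel.exe"}
    )

TRUE_POSITIVES = [
    {"process_name": "powershell.exe", "parent_process": "WINWORD.EXE"},
]
KNOWN_BENIGN = [
    {"process_name": "powershell.exe", "parent_process": "explorer.exe"},
    {"process_name": "cmd.exe", "parent_process": "winword.exe"},
]

def test_rule_regressions():
    assert all(detect(e) for e in TRUE_POSITIVES), f"{RULE_ID}: missed a known-bad sample"
    assert not any(detect(e) for e in KNOWN_BENIGN), f"{RULE_ID}: fired on known-benign"

if __name__ == "__main__":
    test_rule_regressions()
    print("regression suite passed")
```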
When to pick it: detection backlog growing faster than the team can maintain; analyst time consumed by tuning rather than hunting; incident retros surfacing detections that should have fired and didn't.
Read the DetectFlow project →
Project 3 — research territory
MLOps-enabled model threat hunting.
Models surface anomalies and prioritize hunts; MLOps manages the model lifecycle — training, drift detection, retraining, evaluation — the same way DetectFlow manages the detection lifecycle. The point is not "AI in the SOC." It's treating models as production artifacts with the discipline that detection content has earned.
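One piece of that lifecycle, sketched: a drift score computed from the gap between training-time and live score distributions, gating retraining. The shift measure and the threshold are simplified assumptions.

```python
import statistics

# Illustrative drift check: compare the score distribution the model was trained
# on with what it sees in production, and gate retraining on the gap. The
# mean-shift measure and the threshold are simplified assumptions.
DRIFT_THRESHOLD = 0.5  # in units of training standard deviations

def drift(training_values: list[float], live_values: list[float]) -> float:
    mu, sigma = statistics.mean(training_values), statistics.stdev(training_values)
    return abs(statistics.mean(live_values) - mu) / sigma if sigma else float("inf")

def should_retrain(training_values: list[float], live_values: list[float]) -> bool:
    return drift(training_values, live_values) > DRIFT_THRESHOLD

if __name__ == "__main__":
    trained_on = [0.10, 0.20, 0.15, 0.22, 0.18, 0.12]
    seen_live = [0.45, 0.50, 0.52, 0.48]
    print("drift score:", round(drift(trained_on, seen_live), 2))
    print("retrain:", should_retrain(trained_on, seen_live))
```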
Maturity: leading-edge. No incumbent doing this well at petabyte scale. Near-term posture is thought-leadership track first; service line follows once the foundation work has produced reference clients.
Read the MLOps-hunting project →
The thesis is the program. The research is the receipts.
Active hypotheses, the contradictions log, and the method in practice — the open notebook, in public.