Run it yourself

Catching the mistake that kills a detection.

If you've read the flagship piece you already know the failure I'm chasing: a detection that compiles clean, deploys without complaint, and then quietly matches nothing for months, because a field it keys on was mapped somewhere upstream to a path that doesn't exist in the data it actually runs against. No error, no exception, just zero results that look exactly like a quiet network. What I want to do here is narrower and more useful: show you the check that catches that class of mistake, why nothing in the off-the-shelf stack catches it for you today, and how you'd run the thing yourself against your own mappings.

This is the practitioner version, so I'm going to spend less time on why it matters and more on the mechanism and the numbers, because the mechanism is simpler than it sounds once you see what's actually going wrong. Evidence tier: B, my own groundings and a single corpus, with the limits stated where they bite.

Grounding method · deterministic check

Each mapping is grounded twice and independently, so when the source field types to one D3FEND digital artifact and the OCSF target path resolves to a mutually-disjoint one, the reasoner reports the mapping as a contradiction rather than relying on a hand-written rule for that case. The check is deterministic, which is what makes it usable in regulated, air-gapped, or safety-critical pipelines where you need a verdict that holds regardless of which model produced the mapping. Evidence tier B, a first-cut measurement: on a 925-row, six-schema corpus (Splunk CIM, Google UDM, Microsoft ASIM, Elastic ECS, OpenTelemetry, Zeek mapped to OCSF), every injected type-crossing was caught with no over-disjointness false positives.

Why the stack misses it

The gap is in the reference material, not your tools.

The honest starting point, and the part that surprised me when I went looking, is that this isn't a gap in your tools so much as a gap in the reference material your tools build on. The natural place to ground a security schema is D3FEND, MITRE's map of defensive techniques and the digital artifacts they act on, because D3FEND already defines what a Process and a File and a UserAccount are as things a defense operates against. So if you want a machine to check that your OCSF field mappings line up with reality, D3FEND is where you'd anchor them. The trouble is that D3FEND, off the shelf, gives a reasoner almost nothing to object to.

Here's the specific fact the whole thing turns on. A reasoner, which is just the program that works out what follows from what you've told it, can only flag a contradiction if you've stated the facts that contradict. To catch "you mapped a user account to a process," you need a stated rule that says a user account and a process can't be the same individual thing. That kind of rule has a name, disjointness, and it's the assertion on the map that two kinds of thing are genuinely distinct even when they're related to each other. D3FEND ships only three of these disjointness pairs in its entire ontology, and none of the three sit among Process, File, UserAccount, NetworkSession, and NetworkNode, which are exactly the artifacts your core OCSF objects map to. Those five are each just declared a kind of some parent, with nothing saying any two of them are distinct. So nothing in D3FEND says a thing that's a process can't also be a user account, which means a reasoner has no basis to complain when you map one to the other, and the wrong mapping sails straight through.
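To make that concrete, here is a minimal sketch in plain Python, not a real reasoner, of why the same wrong claim passes or fails depending only on whether a disjointness assertion exists. The function names are hypothetical, and treating all five artifacts as pairwise distinct is an assumption made purely for illustration; in practice each pair has to be adjudicated on its own.

```python
from itertools import combinations

# The five core artifacts named above.
CORE = ["Process", "File", "UserAccount", "NetworkSession", "NetworkNode"]


def pairwise_disjoint(classes):
    """Build one 'these two are never the same individual' pair per combination.

    Illustrative assumption: every pair is asserted distinct.
    """
    return {frozenset(pair) for pair in combinations(classes, 2)}


def contradicts(types, disjoint_pairs):
    """True if any two types claimed for one individual are declared disjoint."""
    return any(
        frozenset(pair) in disjoint_pairs
        for pair in combinations(sorted(types), 2)
    )


claim = {"UserAccount", "Process"}   # "this one thing is a user account and a process"
no_axioms = set()                    # nothing stated about these five
layer = pairwise_disjoint(CORE)      # the added disjointness layer

print(contradicts(claim, no_axioms))  # False: nothing to object to
print(contradicts(claim, layer))      # True: the pair is declared distinct
```

The claim is identical in both calls. The only thing that changes is whether anyone wrote the rule down.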

That changes where the silent failure lives. It isn't only the integration that wrote the bad mapping, and it isn't only the LLM that generated it; the gap is built into the reference ontology itself, because the assertions a machine would need to catch the error were never written down. You can run the strictest validation you like against raw D3FEND and it will catch nothing of this kind, not because your check is weak but because the map it's checking against has nothing to say on the question.

What the gate does

Ground each mapping twice, then ask a reasoner if both can be true.

So the gate I built adds the missing assertions and then leans on a reasoner the ordinary way. The mechanism has two moves, and neither is exotic once you separate them.

The first move is to ground each mapping twice, independently. For any given crosswalk row, the gate works out the D3FEND artifact the OCSF path targets (the deepest entity in the path it maps to, so actor.process resolves to Process, actor.user to UserAccount), and separately it works out what the source field itself means by typing it on its own (so a source field that's plainly a user gets typed as a UserAccount regardless of where the mapping sends it). Now you have two claims about the same row, from two directions, and a correct mapping is one where they agree.
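A minimal sketch of that double grounding might look like the following. The two lookup tables and the source field names are hypothetical stand-ins for illustration, not the gate's actual tables, and the "deepest entity" rule is implemented here as a simple walk from the end of the path backwards.

```python
# Hypothetical table: OCSF path segment -> D3FEND artifact.
OCSF_ENTITY_TO_ARTIFACT = {
    "process": "Process",
    "user": "UserAccount",
    "file": "File",
}

# Hypothetical table: source field name -> D3FEND artifact, typed on its own
# without looking at where the mapping sends it.
SOURCE_FIELD_TO_ARTIFACT = {
    "src_user": "UserAccount",
    "process_name": "Process",
}


def ground_target(ocsf_path):
    """Return the artifact for the deepest entity in the OCSF path, or None."""
    for segment in reversed(ocsf_path.split(".")):
        if segment in OCSF_ENTITY_TO_ARTIFACT:
            return OCSF_ENTITY_TO_ARTIFACT[segment]
    return None


def ground_source(source_field):
    """Return the artifact the source field means by itself, or None."""
    return SOURCE_FIELD_TO_ARTIFACT.get(source_field)


def ground_row(source_field, ocsf_path):
    """Ground one crosswalk row from both directions, independently."""
    return ground_source(source_field), ground_target(ocsf_path)


print(ground_row("src_user", "actor.user.name"))  # ('UserAccount', 'UserAccount')
print(ground_row("src_user", "actor.process"))    # ('UserAccount', 'Process')
```

A row where either grounding comes back as None has only one claim attached to it, so there is nothing to compare and it is not testable.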

The second move is to assert a disjointness layer over the handful of artifacts those mappings touch, the thing D3FEND left out, and then ask a reasoner whether the two groundings can both be true at once. A type-preserving mapping, a user mapped to a user, agrees, and the reasoner finds no contradiction, so the class stays satisfiable, which is the formal way of saying this could be a real thing. A type-crossing mapping, a user mapped to a process, says the same individual is both a UserAccount and a Process, the disjointness layer says nothing can be both, and the two together can't both hold, so the reasoner derives that the class is unsatisfiable, meaning a class that can't possibly have any members, an impossibility. The build exits non-zero and you get told. There's no model in the loop, no scoring, no probability, just logic over definitions, which is why I'm comfortable calling the result a catch rather than a guess.
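One way those two groundings and the disjointness layer could be written down for a reasoner is sketched below, as Turtle generated from Python. This encoding is an assumption for illustration: the `map:` namespace and row identifiers are hypothetical, the D3FEND namespace IRI is the one I believe the ontology uses and should be checked against the release you load, and modelling each mapping as a class with both groundings as superclasses is one plausible reading of the mechanism, not necessarily the gate's exact axioms.

```python
D3F = "http://d3fend.mitre.org/ontologies/d3fend.owl#"  # assumed D3FEND namespace
MAP = "https://example.org/mapping#"                    # hypothetical namespace

PREFIXES = (
    "@prefix owl: <http://www.w3.org/2002/07/owl#> .\n"
    "@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .\n"
    f"@prefix d3f: <{D3F}> .\n"
    f"@prefix map: <{MAP}> .\n"
)


def disjointness_axiom(a, b):
    """State that no individual can be both an `a` and a `b`."""
    return f"d3f:{a} owl:disjointWith d3f:{b} ."


def mapping_class(row_id, source_type, target_type):
    """Declare one mapping as a class that must satisfy both groundings."""
    parents = ", ".join(f"d3f:{t}" for t in sorted({source_type, target_type}))
    return f"map:{row_id} a owl:Class ;\n    rdfs:subClassOf {parents} ."


def build_turtle(rows, disjoint_pairs):
    """Assemble prefixes, the disjointness layer, and one class per row."""
    blocks = [PREFIXES]
    blocks += [disjointness_axiom(a, b) for a, b in disjoint_pairs]
    blocks += [mapping_class(*row) for row in rows]
    return "\n".join(blocks) + "\n"


rows = [
    ("row_ok", "UserAccount", "UserAccount"),  # type-preserving
    ("row_bad", "UserAccount", "Process"),     # type-crossing
]
print(build_turtle(rows, [("UserAccount", "Process")]))
```

Under this encoding, `map:row_bad` is a subclass of two classes declared disjoint, so a reasoner should classify it as unsatisfiable, while `map:row_ok` has a single parent and stays satisfiable.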

The reasoner I used is ELK, run through ROBOT (a standard release jar) on Java 17, and the reason it's ELK specifically rather than a stricter reasoner is a D3FEND quirk worth knowing if you go to reproduce this: D3FEND reuses some names in two roles at once, which pushes it into a mode the stricter reasoners refuse to run on, and ELK is the one that tolerates it. That's a detail, but it's the kind of detail that eats an afternoon if nobody tells you, so I'm telling you.
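If you want to script the reasoning step, a sketch of the invocation might look like this. The file names and the jar location are hypothetical placeholders; `merge`, `reason`, `--input`, `--reasoner`, and `--output` are standard ROBOT commands and options, and as far as I know ROBOT's `reason` fails with a non-zero exit status when the merged ontology contains unsatisfiable classes, which is the signal a CI job would key on.

```python
import subprocess


def robot_reason_command(inputs, output_path, robot_jar="robot.jar"):
    """Build a chained ROBOT call: merge every input, then reason with ELK.

    `robot_jar` and all paths are placeholders; adjust to your checkout.
    """
    cmd = ["java", "-jar", robot_jar, "merge"]
    for path in inputs:
        cmd += ["--input", path]
    cmd += ["reason", "--reasoner", "ELK", "--output", output_path]
    return cmd


def run_gate(cmd):
    """Run the command and return its exit status (0 means no contradiction found)."""
    return subprocess.run(cmd).returncode


cmd = robot_reason_command(
    ["d3fend.owl", "disjointness-layer.ttl", "groundings.ttl"],  # hypothetical files
    "reasoned.owl",
)
print(" ".join(cmd))
# In CI you might then do: sys.exit(run_gate(cmd))
```

Building the command as a list rather than a shell string keeps paths with spaces intact and avoids shell quoting issues.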

The numbers, and what they don't say

It runs in seconds, and the boundaries are honest.

I measured this two ways, a clean first pass and then a run against a real corpus, because a mechanism that works on a toy proves nothing.

The first pass is the one-wrong-mapping demo. Take a small set of correct groundings, inject the exact silent error (map actor.user to Process, the actor-as-object confusion), merge it into the full D3FEND ontology rather than a trimmed slice, and reason over the whole thing. ELK isolates exactly the one wrong mapping as the single unsatisfiable class and does it in 3.69 seconds using about half a gigabyte of memory, on a laptop. That timing is the part that makes this practical rather than academic, because a check that runs in seconds on a laptop is a check you can drop into CI and run on every pull request, not a research artifact that needs a cluster.

The corpus run is the one that tells you whether the judgment holds at any scale. I ran the gate against a real six-schema crosswalk corpus, 925 mapping rows drawn from CIM, UDM, ASIM, ECS, OpenTelemetry, and Zeek into OCSF, of which 364 had both groundings and were therefore testable, across 231 distinct mapping classes. Against injected type-crossing corruptions, each one remapping a correctly typed field to a disjoint artifact, the gate caught all of them, 231 of 231, with zero false positives attributable to the disjointness layer. And on the way through it flagged about eight coarse mappings in the hand-built corpus that a human ought to look at (an application mapped to a destination host, a UDM principal used as a host, a file path mapped to a process name), which is the gate doing the job you'd want, catching the meaning-crossings that a shape check waves through.
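The shape of that measurement can be sketched in a few lines. This is a toy harness under stated assumptions, not the script behind the numbers above: the three rows and two disjointness pairs are made-up examples, the function names are hypothetical, and the check is a set lookup standing in for the reasoner.

```python
def flags(row, disjoint_pairs):
    """True if the row's two groundings are declared disjoint."""
    source_type, target_type = row
    return frozenset((source_type, target_type)) in disjoint_pairs


def inject(row, disjoint_pairs):
    """Return a corrupted copy whose target is disjoint from the source type.

    Returns None when no disjoint partner exists for the source type.
    """
    source_type, _ = row
    for pair in sorted(disjoint_pairs, key=sorted):  # deterministic order
        if source_type in pair:
            (other,) = pair - {source_type}
            return (source_type, other)
    return None


def measure(rows, disjoint_pairs):
    """Return (caught, injected, flags_on_clean_rows)."""
    clean_flags = sum(flags(r, disjoint_pairs) for r in rows)
    corrupted = [c for c in (inject(r, disjoint_pairs) for r in rows) if c]
    caught = sum(flags(c, disjoint_pairs) for c in corrupted)
    return caught, len(corrupted), clean_flags


# Toy data: three type-preserving rows and two adjudicated pairs.
pairs = {
    frozenset(("UserAccount", "Process")),
    frozenset(("Process", "File")),
}
rows = [("UserAccount", "UserAccount"), ("Process", "Process"), ("File", "File")]

print(measure(rows, pairs))  # (3, 3, 0)
```

Note what this harness can and cannot show: every injected corruption is disjoint by construction, so a perfect catch rate confirms the check fires on the class of error it was built for, and the count on clean rows is where over-asserted disjointness would show up as false alarms.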

Now the boundaries, because the numbers oversell themselves if I stop there. This is Tier B evidence: my own groundings, my own disjointness adjudication, a single corpus, one reasoning pass. The 100% catch is measured on injected corruptions, not on a held-out set of confirmed human errors, because the corpus doesn't ship labeled mistakes, so the honest reading is "it missed none of the injected errors in the class it's designed to catch," not "it catches every real-world mistake anyone has ever made." The eight or so organic flags are the closest thing to real catches, and they're plausible coarse mappings rather than confirmed-wrong ones. And the genuinely hard part, the part that decides whether this scales past my eight artifacts, is the disjointness adjudication itself, because over-asserting it manufactures false alarms. A credential can be stored in a file, so if I'd carelessly asserted that a credential is never a file, a perfectly valid "this credential lives in a file" mapping would fail the build for no good reason. The distinction that saves it is that disjointness is about identity, not relation: a process is executed from a file and the two are related, but the individual that is the process is not also the file, so asserting they're distinct is sound. Getting that call right for every artifact pair is bounded human work that needs real judgment, and it held across this corpus's eight artifacts, but a much larger surface could hit a wall this one didn't, and I'd rather you hear that from me than discover it after you've trusted the gate further than it's earned.

Running it yourself

The pieces are open, and the fixes belong upstream.

The reason to write this up at the practitioner level is that you can run the check, not just read about it. The open pieces all exist today: D3FEND and OCSF for the maps, Sigma and the pySigma OCSF mapping for the detection side, ROBOT and ELK for the reasoner toolchain, and the six-schema crosswalks for a corpus to point it at, and none of that is mine to gate behind anything. The single biggest snag when you go to reproduce it is Java, because ROBOT and ELK want a specific JDK (Java 17 in my runs) and a version mismatch will stop you before you ever see the gate fire, so wrap the toolchain in a container and let one command be the whole interaction. The demo is packaged to clone and run with the Java toolchain wrapped, and it's up now at the gate directory in security-data-that-works, with the six-schema corpus it runs against at ocsf-mapping-fidelity in the lab. Once it runs, the most useful thing you can do is point it at your own field mappings and see what it surfaces, because the coarse mappings it found in a public corpus are the kind of thing that's probably also sitting in yours.

And when it flags something real, the move I'd make is to take the fix upstream rather than keep it. If the gate surfaces a coarse mapping, that's often a missing disjointness pair D3FEND should ship, or a mapping OCSF or a pySigma pipeline should sharpen, and those projects have open issue trackers waiting for exactly that kind of grounded report. The whole stack underneath this is a commons, and the gate is most valuable as a way to find the specific, defensible contributions worth filing against it, so the better outcome isn't that you have a private check, it's that the reference material everyone builds on gets the assertions it was missing. The mistake that kills a detection is catchable, and the reason it mostly hasn't been caught is that nobody wrote down the rules that let a machine see it, which is work anyone reading this is equipped to help with, pair by adjudicated pair.