Security Data Works

Technology deep-dive

OCSF ontological grounding for federal-ready security data.

The Open Cybersecurity Schema Framework (OCSF) is one of four foundational open standards I name when I describe the security lakehouse, alongside Apache Arrow for in-memory representation, Apache Iceberg for the on-disk table format, and Sigma for detection-rule portability. The piece that turns OCSF from "another vendor-coalition schema" into a defensible choice for defense, intelligence, and regulated buyers is its grounding in MITRE's D3FEND ontology, which itself grounds in DoD-designated Common Core Ontologies and an ISO-standard top-level ontology. That chain may matter to a federal procurement officer, and it is one that Splunk CIM, Microsoft Sentinel ASIM, and Google Chronicle UDM cannot currently match.

Reading time: about 20 minutes. Evidence tier: A for the primary sources (MITRE D3FEND 1.0 release notes, OCSF v1.3.0 GitHub release, ISO/IEC 21838-2:2021), B for the DoD-designation reporting and defense-contractor cost framing, and D where I label speculation about how procurement language will evolve. Framework-level statements throughout; actual compliance still depends on a specific Authority to Operate.

Grounding method · measured map

[Figure: A first-cut measured map of how OCSF and D3FEND connect. Class level: 69 reciprocal crosswalk links between OCSF classes and D3FEND artifacts. Leaf level: 97.7% of OCSF leaf attributes orphaned (no D3FEND grounding), 2.3% grounded.]
This is a first-cut measured picture of how OCSF and D3FEND actually connect rather than an estimate. At the class level the two are joined by 69 reciprocal crosswalk links, which corrects an earlier reading that the reverse side was empty, yet at the leaf level (the attributes a detection actually keys on) 97.7% of OCSF leaf attributes are orphaned, with no D3FEND grounding at all. That gap is what matters, because a defender can only catch what the schema grounds, so most of the schema offers nothing to check against even though the class-level links exist. Evidence tier B, a first-cut measurement rather than a settled figure.
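A minimal sketch of how the two measurements above could be computed, assuming the crosswalk is available as simple pairs. The identifiers and the toy data are illustrative assumptions, not taken from the real OCSF or D3FEND releases, and the toy percentages have nothing to do with the measured 97.7% figure.

```python
# Toy crosswalk data; every identifier here is an illustrative assumption.
ocsf_to_d3fend = {("network_activity", "NetworkTraffic"), ("file_activity", "File")}
d3fend_to_ocsf = {("NetworkTraffic", "network_activity")}

# A class-level link counts as reciprocal when it is asserted from both sides.
reciprocal = {(o, d) for (o, d) in ocsf_to_d3fend if (d, o) in d3fend_to_ocsf}

# Leaf attributes, and the subset that resolves to some D3FEND artifact.
leaf_attributes = ["src_endpoint.ip", "dst_endpoint.port", "traffic.bytes_in", "file.name"]
leaf_grounding = {"file.name": "File"}


def orphan_rate(leaves, grounding):
    """Fraction of leaf attributes with no D3FEND grounding at all."""
    if not leaves:
        return 0.0
    orphaned = [attr for attr in leaves if attr not in grounding]
    return len(orphaned) / len(leaves)


rate = orphan_rate(leaf_attributes, leaf_grounding)
print(f"reciprocal class links: {len(reciprocal)}")
print(f"orphaned leaf attributes: {rate:.1%}")
```

The point of separating the two numbers is the same as in the text: a healthy reciprocal count at the class level says nothing about how many leaf attributes are actually grounded.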

Why this changed in 2026

OCSF is moving from vendor consortium to international standards body.

OCSF was launched in 2022 as a vendor-coalition project: AWS, Splunk, IBM, and a long tail of partners agreeing on a shared event schema so that security data could move between products without a custom transformer at every boundary. The 180-plus organization coalition mattered politically, but a vendor consortium is still a vendor consortium, and for federal and regulated buyers that sits in a different category from an international standard.

OCSF is now on a standards-body track: ITU-T Study Group 17 (the security study group of the International Telecommunication Union, the United Nations body whose Recommendations are used by national regulators worldwide) is standardizing it as Recommendation X.icd-schemas. Member states backed it for ratification in December 2025, with adoption as a published international standard targeted for mid-2026. Ratification would move OCSF from "Linux Foundation project coordinated by a vendor coalition" into the same category as ISO/IEC standards: a governance-neutral, vendor-neutral artifact with a formal change-control process and international recognition. The procurement implications take time to flow through specific RFPs and ATO packages, but the trajectory is the kind that compounds rather than reverses.

The D3FEND ontological grounding I describe below predates the ITU-T work (it shipped with OCSF v1.3.0 in August 2024) but the two reinforce each other, because the ITU-T standardization track speaks to "is this becoming a real standard?" while the D3FEND chain answers "does the schema have formal semantics behind it, or just field names?" Both are necessary for a federal-ready posture, and neither one on its own gets a buyer all the way there.

Translating the jargon

Schema is not ontology, and the difference matters here.

Two terms do most of the work in this essay, and they're often used as synonyms when they shouldn't be. I want to lay them down carefully because the rest of the argument depends on the distinction.

A schema defines how data is stored, which is to say the field names, the data types, and the structure. For OCSF Network Activity events that means: class_uid is an integer, src_endpoint.ip is a string, traffic.bytes_in is a long integer. A schema tells a query engine how to read and write the data, but it does not tell anyone what the data means in a formal sense.
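To make the "schema is structure, not meaning" point concrete, here is a small sketch of a type check over the three fields named above. The helper names (get_path, schema_errors) are hypothetical, and Python's int stands in for both the integer and long-integer types.

```python
from typing import Any

# Dotted field path -> expected Python type, mirroring the three fields above.
NETWORK_ACTIVITY_SCHEMA = {
    "class_uid": int,
    "src_endpoint.ip": str,
    "traffic.bytes_in": int,
}


def get_path(event: dict, path: str) -> Any:
    """Follow a dotted path through nested dicts; return None if a hop is missing."""
    node: Any = event
    for key in path.split("."):
        if not isinstance(node, dict) or key not in node:
            return None
        node = node[key]
    return node


def schema_errors(event: dict, schema: dict) -> list:
    """Return one message per field whose value is missing or has the wrong type."""
    errors = []
    for path, expected in schema.items():
        value = get_path(event, path)
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, expected):
            errors.append(
                f"{path}: expected {expected.__name__}, got {type(value).__name__}"
            )
    return errors


good = {"class_uid": 4001, "src_endpoint": {"ip": "192.168.1.100"}, "traffic": {"bytes_in": 1024}}
bad = {"class_uid": "4001", "src_endpoint": {"ip": "192.168.1.100"}}
print(schema_errors(good, NETWORK_ACTIVITY_SCHEMA))
print(schema_errors(bad, NETWORK_ACTIVITY_SCHEMA))
```

Note what the check cannot do: it confirms that traffic.bytes_in is an integer, but nothing in it says what a byte count is evidence of.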

An ontology defines what data means, which is to say the formal semantics, the categorical relationships, and the rules that govern how concepts connect. Ontological grounding (the term I'll use throughout) refers to anchoring a schema's concepts in a formally defined category system, so that "Network Activity" doesn't just mean "the bag of fields the OCSF working group decided to call Network Activity" but means "an instance of a cybersecurity-relevant occurrence that takes place over time, in the formal sense defined by an upstream ontology."

That sounds abstract, so here's the concrete version. A schema tells you how to query traffic.bytes_in. An ontology tells you why you're querying it: this is an instance of Network Traffic Analysis (a defensive technique), which is a kind of Digital Artifact Analysis, which is a kind of cybersecurity event, which is a kind of occurrence-that-takes-place-over-time in the upstream formal sense. Each step in that chain is defined and validated independently, and the compliance argument I'll make later in the essay is that what gives the schema standing in a federal procurement is this chain rather than the field names themselves.

The term semantic interoperability falls out of the same idea, because two systems are semantically interoperable when they go past exchanging bytes with matching field names and actually agree on what those fields mean. Without ontological grounding, semantic interoperability is mostly a wish, since both sides hope they're talking about the same thing, whereas with grounding it becomes a derivable property: if both sides chain back to the same upstream ontology, the meaning is preserved by construction.

The four-link chain

OCSF, D3FEND, Common Core Ontologies, Basic Formal Ontology.

The grounding argument has four links. I'll walk each one because the chain is only as defensible as its weakest hop.

Link one: OCSF maps to D3FEND

D3FEND stands for Detection, Denial, and Disruption Framework Empowering Network Defense, a MITRE-maintained knowledge graph of defensive cybersecurity techniques. It's the defender-side counterpart to MITRE ATT&CK, which catalogs adversary tactics. Where ATT&CK says "adversaries do this," D3FEND says "defenders may detect, deny, or disrupt that with these techniques." MITRE released D3FEND 1.0 in January 2025, and that release was the first time the framework was formally grounded in upstream ontologies rather than presented as a standalone knowledge graph.

OCSF v1.3.0, released August 2024, added a d3fend attribute to every event class. The attribute contains a D3FEND technique identifier, the digital artifact being analyzed (network traffic, process, file, user credential), and the defensive tactic (Detect, Deny, or Disrupt). For a Network Activity event, that may look like:

{
  "class_uid": 4001,
  "time": "2026-01-10T14:32:00Z",
  "src_endpoint": { "ip": "192.168.1.100", "port": 54321 },
  "dst_endpoint": { "ip": "10.0.0.5", "port": 443 },
  "d3fend": {
    "technique_id": "D3-NTA",
    "artifact": "NetworkTraffic",
    "tactic": "Detect"
  }
}

The practical effect: an OCSF event carries its own mapping to a MITRE-maintained defensive technique. A query like "show me every event that supports Network Traffic Analysis (D3-NTA)" can be expressed as WHERE d3fend.technique_id = 'D3-NTA' across every OCSF source. Without the d3fend attribute, that mapping has to be reconstructed manually for each proprietary schema, a consulting project that defense-contractor sources I've spoken with quote in the $150K-$400K range for full coverage of 500 to 2,000 detection rules. I treat that cost range as Tier B evidence (practitioner interviews, three firms specializing in DoD cybersecurity compliance), not a published number.
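The same filter can be sketched outside SQL. This assumes events arrive as dictionaries shaped like the JSON example above; the "source" key and the sample events are illustrative assumptions added only to show that one predicate works across feeds.

```python
def supports_technique(event: dict, technique_id: str) -> bool:
    """True when the event carries a d3fend attribute naming the given technique."""
    d3fend = event.get("d3fend") or {}
    return d3fend.get("technique_id") == technique_id


# Sample events from different (hypothetical) sources, all in the same shape.
events = [
    {"class_uid": 4001, "source": "zeek", "d3fend": {"technique_id": "D3-NTA"}},
    {"class_uid": 3002, "source": "idp", "d3fend": {"technique_id": "D3-UAM"}},
    {"class_uid": 4001, "source": "firewall", "d3fend": {"technique_id": "D3-NTA"}},
    {"class_uid": 1001, "source": "edr"},  # no d3fend attribute at all
]

nta_events = [e for e in events if supports_technique(e, "D3-NTA")]
print([e["source"] for e in nta_events])
```

Events without the attribute simply fall out of the result, which is the behavior the WHERE clause would have as well.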

Link two: D3FEND grounds in Common Core Ontologies

Common Core Ontologies (CCO) are a set of mid-level ontologies (Agent, Artifact, Event, Information, Organization, Facility) that the Department of Defense and Intelligence Community designated as their baseline ontology framework. Reporting by DefenseOne and the DoD Data Strategy place that designation in January 2024. I treat the designation date as Tier B evidence (journalism plus official strategy document) rather than Tier A. I have not personally read the underlying memorandum, and federal designation language is often more conditional than the press coverage implies.

What CCO provides for D3FEND: a formal home for concepts like CyberEvent, SecuritySensor, and DigitalArtifact. Rather than D3FEND defining "Network Traffic Analysis" in isolation, the D3FEND 1.0 release expresses it as a specialization of a CCO Event category, with formal has-input, produces-output, mitigates, and detects relationships. That allows automated reasoning over D3FEND. A SPARQL query (a standard query language for ontologies) can ask "what defensive techniques mitigate ATT&CK T1071 Application Layer Protocol?" and get a derived answer rather than a manually maintained lookup.
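The difference between a derived answer and a maintained lookup can be shown with a toy triple store. This is a sketch in plain Python rather than SPARQL, the ex: technique names and the triples are placeholders rather than real D3FEND content, and the rule that a subclass inherits its parent's mitigates relation is an assumption made for the example.

```python
# Placeholder triples; none of these are real D3FEND assertions.
TRIPLES = {
    ("ex:TechniqueA", "ex:mitigates", "attack:T1071"),
    ("ex:TechniqueB", "rdfs:subClassOf", "ex:TechniqueA"),
    ("ex:TechniqueC", "rdfs:subClassOf", "ex:TechniqueB"),
    ("ex:TechniqueD", "ex:mitigates", "attack:T1110"),
}


def ancestors(node, triples):
    """All superclasses reachable through rdfs:subClassOf, excluding the node itself."""
    found, frontier = set(), [node]
    while frontier:
        current = frontier.pop()
        for s, p, o in triples:
            if s == current and p == "rdfs:subClassOf" and o not in found:
                found.add(o)
                frontier.append(o)
    return found


def techniques_mitigating(attack_id, triples):
    """Techniques that mitigate attack_id directly or via an ancestor (assumed rule)."""
    direct = {s for s, p, o in triples if p == "ex:mitigates" and o == attack_id}
    candidates = {s for s, p, o in triples if p == "rdfs:subClassOf"} | direct
    return sorted(
        c for c in candidates if c in direct or bool(ancestors(c, triples) & direct)
    )


print(techniques_mitigating("attack:T1071", TRIPLES))
```

Only one mitigates edge for T1071 is stored, yet three techniques come back, because the other two are derived from the subclass structure rather than listed by hand.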

Link three: CCO grounds in Basic Formal Ontology

Basic Formal Ontology (BFO) is a top-level ontology, the most general categorization layer, dealing in things like Continuant (something that persists through time), Occurrent (something that happens over time), Object, Process, Role, and Function. BFO has been peer-reviewed and developed by the International Association for Ontology and its Applications for two decades. CCO is constructed as a BFO-conformant set of mid-level categories, so every CCO concept inherits a formal place in the BFO hierarchy.

The practical effect of that hop is that when CCO says "CyberEvent is a kind of Event," there's a formal definition behind both Event and the is-a relationship, so the claim is derivable from the underlying ontology rather than resting on the CCO working group deciding that CyberEvent feels like an Event.

Link four: BFO is ISO/IEC 21838-2

The final link is the one that gives the chain international standing: BFO was published as ISO/IEC 21838-2:2021, "Information technology, Top-level ontologies (TLO), Part 2: Basic Formal Ontology." That's a Tier A primary source, published by the International Organization for Standardization, governed by international committee, with a formal change-control process.

The composition reads as one continuous chain: an OCSF Network Activity event maps to D3FEND Network Traffic Analysis, which is a CCO CyberEvent, which is a BFO Occurrent, which is an ISO/IEC 21838-2-standardized category. Each hop is independently defined and validated. That chain is what the major proprietary schemas don't currently provide, because Splunk CIM, Microsoft Sentinel ASIM, Google Chronicle UDM, and AWS ASFF all stop at the field-definition layer with no upstream ontological anchor.
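The four hops can be written down as a simple lookup and walked mechanically. This is a sketch only: the labels are the ones used in the text, the GROUNDS_IN table and function names are hypothetical, and the single-parent structure is a simplification of what a real ontology allows.

```python
# Each concept points at the upstream concept it grounds in (labels from the text).
GROUNDS_IN = {
    "OCSF Network Activity": "D3FEND Network Traffic Analysis",
    "D3FEND Network Traffic Analysis": "CCO CyberEvent",
    "CCO CyberEvent": "BFO Occurrent",
    "BFO Occurrent": "ISO/IEC 21838-2 category",
}
ROOT = "ISO/IEC 21838-2 category"


def grounding_chain(concept, grounds_in):
    """Follow grounds_in links upward until no further hop is defined."""
    chain = [concept]
    while chain[-1] in grounds_in:
        nxt = grounds_in[chain[-1]]
        if nxt in chain:
            raise ValueError(f"cycle in grounding chain at {nxt!r}")
        chain.append(nxt)
    return chain


def is_anchored(concept, grounds_in, root=ROOT):
    """True when the chain from concept ends at the standardized root."""
    return grounding_chain(concept, grounds_in)[-1] == root


print(" -> ".join(grounding_chain("OCSF Network Activity", GROUNDS_IN)))
print(is_anchored("OCSF Network Activity", GROUNDS_IN))
print(is_anchored("vendor field bytes_in", GROUNDS_IN))
```

A concept with no entry in the table has a chain of length one, which is the "stops at the field-definition layer" case.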

Grounding method · disjointness layer

[Figure: Eight D3FEND digital artifacts (UserAccount, UserGroup, Process, File, NetworkNode, NetworkSession, URL, ServiceApplication) asserted pairwise-disjoint at the identity level. Relations such as executed-from, stored-in, and locates link two artifacts without making them the same artifact. Session is left out of the disjoint set because D3FEND has NetworkSession ⊑ Session.]
The eight artifacts are asserted disjoint at the level of identity, which is the claim that does the work: a process is executed from a file, a credential is stored in a file, a URL locates a file, yet none of them is the file, because executed-from, stored-in, and locates are relations rather than identity, so the usual objection that these things overlap is a category error. Session stays out of the disjoint set on purpose, since D3FEND already places NetworkSession under Session and asserting the two disjoint would make NetworkSession unsatisfiable; a selfcheck catches that before it ships. This follows on from D3FEND issue #423, which asserted top-level Artifact disjoint from Event and invited the same treatment further down the hierarchy. Evidence tier B, a first-cut set of assertions rather than a ratified standard.
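The Session case can be reproduced with a few lines of set logic. This is a sketch of the idea, not the actual selfcheck: it assumes each class has at most one parent, models only the NetworkSession-under-Session fact stated above, and uses hypothetical function names.

```python
from itertools import combinations

ARTIFACTS = [
    "UserAccount", "UserGroup", "Process", "File",
    "NetworkNode", "NetworkSession", "URL", "ServiceApplication",
]
SUBCLASS_OF = {"NetworkSession": "Session"}  # NetworkSession is placed under Session


def pairwise_disjoint(classes):
    """Every unordered pair of the given classes, asserted disjoint."""
    return {frozenset(pair) for pair in combinations(classes, 2)}


def superclasses(cls, subclass_of):
    """The class itself plus every ancestor (single-parent simplification)."""
    seen = [cls]
    while seen[-1] in subclass_of and subclass_of[seen[-1]] not in seen:
        seen.append(subclass_of[seen[-1]])
    return seen


def unsatisfiable(classes, subclass_of, disjoint):
    """Classes that sit under two classes asserted disjoint, so can have no instances."""
    bad = []
    for cls in classes:
        ups = superclasses(cls, subclass_of)
        if any(frozenset(pair) in disjoint for pair in combinations(ups, 2)):
            bad.append(cls)
    return bad


good = pairwise_disjoint(ARTIFACTS)
mistake = pairwise_disjoint(ARTIFACTS + ["Session"])
print(len(good))
print(unsatisfiable(ARTIFACTS, SUBCLASS_OF, good))
print(unsatisfiable(ARTIFACTS + ["Session"], SUBCLASS_OF, mistake))
```

With the eight artifacts alone nothing is flagged; adding Session to the disjoint set flags NetworkSession, because it would then be both a kind of Session and disjoint from it.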

Honest hedge

Framework-level alignment is not an Authority to Operate.

I want to plant a flag here before going further, because this is the place where federal-procurement writing often overclaims. Everything I'm describing in this essay is a framework-level argument. That means: OCSF's mapping to D3FEND, CCO, and BFO may strengthen a procurement response, may reduce the manual ontology-mapping consulting burden, and may align with the kind of language that increasingly appears in DoD and Intelligence Community solicitations. None of that is the same as a product holding an Authority to Operate (ATO), or a system being FedRAMP-authorized, or a contractor meeting CMMC requirements for a specific contract.

ATOs are granted to specific deployed systems by specific Authorizing Officials against specific control baselines. FedRAMP authorizations are granted to specific cloud service offerings against FedRAMP control baselines. CMMC certifications are granted to specific contractors for specific contract scopes. A schema choice, even one with airtight ontological grounding, does not transfer those authorizations to a downstream system, and while it may make the documentation easier to produce, the control mappings easier to evidence, and the cross-agency data-sharing language easier to satisfy, none of that replaces the authorization process itself.

I also want to flag uncertainty about how DoD and IC procurement language will actually evolve. The DoD Data Strategy and CCO designation point in one direction. Defense contractors and integrators will translate that direction into specific RFP language at their own pace. RFPs I've seen in 2026 cite D3FEND mapping requirements; whether they explicitly require OCSF or accept "CCO-aligned schema" generically depends on the program and the contracting officer. The honest version: the wind is blowing in OCSF's direction, but I would not promise a specific solicitation outcome to a client based on schema choice alone.

With that caveat planted, the rest of the essay is about what the framework-level argument may actually buy a buyer who's evaluating schemas for the medium term.

What proprietary schemas don't offer

Splunk CIM, Sentinel ASIM, Chronicle UDM, AWS ASFF.

The four proprietary schemas a federal or regulated buyer is most likely to encounter all stop at the same layer: defined field names, vendor-controlled semantics, no formal ontology, no native D3FEND mapping. None of these are bad schemas in their own context. Splunk CIM is a mature normalization layer for Splunk Enterprise Security, Sentinel ASIM (Advanced Security Information Model) is the Microsoft equivalent inside Azure, Chronicle UDM (Unified Data Model) is Google's, and AWS Security Finding Format (ASFF) is what Security Hub uses internally. They work well within their vendor's product ecosystem.

The gap shows up when the buyer is federal, defense-adjacent, or cross-vendor by necessity:

  • D3FEND mapping. Proprietary schemas require manual mapping, a consulting project that runs $150K-$400K for full coverage, per the defense-contractor practitioner sources I cited earlier. OCSF v1.3.0 includes the d3fend attribute at the schema level. The mapping ships with the data.
  • CCO alignment. Proprietary schemas have no upstream ontology. A defense contractor responding to an RFP that asks for CCO-aligned semantic metadata has to either construct that alignment manually or argue it isn't required. OCSF inherits CCO alignment by transitivity through D3FEND.
  • Governance neutrality. Splunk CIM evolves at Splunk's discretion. Sentinel ASIM evolves at Microsoft's. Chronicle UDM evolves at Google's. ASFF evolves at AWS's. OCSF evolves under Linux Foundation governance with a 180+ org coalition and an active ITU-T standardization track. That distinction may matter to a buyer who needs schema stability across an ATO cycle that outlasts a vendor relationship.
  • International recognition. A US vendor's proprietary schema is a US vendor's proprietary schema in any cross-border context. An ITU-T Recommendation would give OCSF a different kind of standing for NATO-aligned, Five Eyes, and EU-US data-sharing contexts. Whether that translates to specific cross-border programs depends on the program; Tier D for the specific procurement claims.

None of this means an organization currently running Splunk CIM should rip-and-replace, but it does mean that for new architectural decisions, especially in federal or federal-adjacent contexts, the schema-choice calculus has changed in OCSF's favor in a way that wasn't true two years ago.

What this looks like in practice

A detection rule with and without the chain.

The cleanest way to show the difference is a brute-force authentication detection rule, the kind that ships in every SIEM out of the box.

In the proprietary-schema flow, an analyst writes the rule in SPL (Splunk's query language) against the Authentication data model, then documents (in a separate file) that this rule corresponds to D3FEND technique D3-UAM (User Account Monitoring). A compliance officer who later asks "how do we know this rule aligns with D3-UAM?" gets the answer "we manually documented it." That answer is defensible if the documentation is current, the analyst who wrote it is still on the team, and the proprietary schema hasn't shifted underneath, but it isn't something you can derive from the data.

In the OCSF flow, the same rule is written in SQL against OCSF Authentication events. The d3fend.technique_id attribute is present in every event. The rule itself can filter on it:

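-- Failed logons per source IP and user, restricted to events tagged D3-UAM.
SELECT
  src_endpoint.ip,
  actor.user.name,
  COUNT(*) AS failed_attempts
FROM ocsf_authentication_events
WHERE activity_id = 1          -- Logon
  AND status_id = 2            -- Failure; without this, successful logons are counted too
  AND d3fend.technique_id = 'D3-UAM'
GROUP BY src_endpoint.ip, actor.user.name
HAVING COUNT(*) > 5;           -- repeat the aggregate; not every engine accepts the alias here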

The compliance officer who asks "how do we know this aligns with D3-UAM?" gets a different answer: the alignment is in the data itself, sourced from the OCSF schema definition, traceable to the D3FEND knowledge graph maintained by MITRE. That answer is derivable rather than asserted. The compliance documentation effort goes from "write a paragraph per rule" to "point at the schema."

For five rules, this difference is rounding error. For five hundred rules (the size of a typical enterprise detection portfolio) the per-rule documentation tax adds up. I treat the thirty-to-sixty-minutes-per-rule manual mapping estimate as Tier B (practitioner experience, not published), and the actual number varies widely with the maturity of the detection-engineering team and the discipline of the existing documentation. The directional point holds regardless, because a schema-level attribute is cheaper to evidence than a per-rule narrative.
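The arithmetic behind that claim is short enough to write out. The sketch below uses the thirty-to-sixty-minute range quoted above, which is itself a Tier B estimate; the function name is hypothetical.

```python
def mapping_hours(rule_count, minutes_low=30, minutes_high=60):
    """Low and high estimate, in hours, of per-rule manual mapping effort."""
    return (rule_count * minutes_low / 60, rule_count * minutes_high / 60)


for rules in (5, 500):
    low, high = mapping_hours(rules)
    print(f"{rules} rules: {low:g} to {high:g} hours")
```

Five rules come to a few hours at most; five hundred come to 250 to 500 hours, which is where a schema-level attribute starts to pay for itself.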

One open question I want to flag honestly, because the rest of the essay leans on it. A team can use the formal schema-graph as a validation oracle at design time, running a consistency check to confirm that a proposed rule-to-D3FEND mapping is well-formed before it ships. I run exactly this as the disjointness-layer OWL type-consistency check in the runnable gate, which caught all 231 of 231 injected type-crossings with no false positives. The accuracy and consistency gains that follow are real, but the harder question is what produces them.

I've been attributing those gains to the grounding architecture itself, on the reasoning that the OCSF-to-D3FEND-to-CCO-to-BFO chain gives a machine something formal to reason over, so the mapping can be checked rather than asserted. That's the moat argument. There's a confound I can't wave away, though: any pipeline that adopted this grounding probably also upgraded the model doing the reasoning, and a more capable base model will produce better mappings whether or not the formal chain is present. The grounding is the moat; base-model maturity is the tide. In a pipeline that moved on both at once the two are entangled, and crediting the architecture for what the tide may have carried is exactly the kind of claim I'd push back on if a vendor made it to me.

The clean test isolates them: hold the model fixed, run the same mapping task with and without the formal grounding, and measure the delta. I'd want that delta measured on the rare and adversary-tail cases specifically rather than the common ones, because a strong model handles the common D3-UAM and D3-NTA mappings either way and grounding has to earn its keep where the model is least sure. When I first drafted this section I had not run that experiment, and my position was that until someone did, the federal-ready claim rested on the chain's standing as a standard, which is documented, more than on a measured accuracy lift, which was not.

Since drafting this I ran the clean test, holding the model fixed and running the same field-mapping task with and without the formal grounding. The honest result is that the conceptual grounding came back close to inert against a plain schema-validity check on the mapping: the schema check moved the result, the grounding prose barely did. So I'm not going to claim the chain makes a model map better.

What I built instead is a read-only navigation server, a stdio MCP tool I call scg, over a concept-only graph of 1,442 nodes and 7,618 deduped edges across OCSF, D3FEND, ATT&CK, and NIST 800-53/CCI. Its discipline is provenance rather than accuracy. Every edge carries a proxy_quality and a documented trust rank. The largest class by far (roughly 6,000 of the 7,618 edges) is the intent-blind offense-to-defense artifact-co-occurrence inference, scored lowest at 0.25. A multi-hop answer is reported as only as trustworthy as its weakest edge, with a flag raised when the chain leans on one of those cheap inferred joins.

That overlaps MITRE and CTID's own mapping work by design, so the graph isn't a rival crosswalk. What it adds is the proxy-quality layer on top of that work and the measured brittleness underneath: the graph's structure changed a retrieval answer on only one of nine incident-reconstruction queries I tested, which makes it useful but narrow. The right way to talk about it is honest navigation-with-provenance, not a grounding-accuracy moat I haven't earned the right to claim.
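The weakest-edge rule is easy to state in code. This is a sketch of the scoring idea only, not the scg implementation: the edge dictionaries, node names, and the 1.0 and 0.75 scores are illustrative assumptions, and only the proxy_quality field name and the 0.25 score for inferred co-occurrence edges come from the text.

```python
INFERRED_QUALITY = 0.25  # score the text gives to the co-occurrence inference class


def path_trust(edges):
    """Score a multi-hop path by its weakest edge and flag cheap inferred joins."""
    if not edges:
        raise ValueError("a path needs at least one edge")
    weakest = min(edges, key=lambda e: e["proxy_quality"])
    return {
        "trust": weakest["proxy_quality"],
        "weakest_edge": (weakest["src"], weakest["dst"]),
        "leans_on_inferred": any(e["proxy_quality"] <= INFERRED_QUALITY for e in edges),
    }


# Hypothetical three-hop path through placeholder nodes A..D.
path = [
    {"src": "A", "dst": "B", "proxy_quality": 1.0},
    {"src": "B", "dst": "C", "proxy_quality": 0.25},
    {"src": "C", "dst": "D", "proxy_quality": 0.75},
]
print(path_trust(path))
```

Taking the minimum rather than an average is the design choice that matters: two strong edges cannot raise the score of a path that depends on one inferred join.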

Sector implications

Who may actually benefit from the chain.

Defense and intelligence

This is the clearest case. Defense contractors building security data systems for DoD components, IC agencies sharing threat intelligence across agency boundaries, and prime integrators bidding on security analytics RFPs all face procurement language that increasingly references D3FEND mapping and semantic interoperability. OCSF responses to those requirements are derivable, where proprietary responses require an additional consulting deliverable.

The framework-level hedge applies in full force here: a specific contract still requires a specific ATO, a specific control mapping against a specific baseline (NIST 800-53 moderate or high, depending on the program), and a specific demonstration of operational security against that baseline, so OCSF makes the schema-and-mapping pages of that package easier to write without writing the rest of the package for you.

Critical infrastructure and regulated industries

Energy, water, financial services, and healthcare all have sector-specific regulators with their own framework references. CISA has been steadily building OCSF-based mappings for cross-sector threat intelligence sharing. The Zeek-to-OCSF mappings CISA validated at petabyte scale are an example of the federal government voting with its engineering hours. Sector-specific frameworks (NERC CIP, FFIEC, HIPAA) don't typically require OCSF, but they do require defensible mapping between detection capabilities and threat models. The D3FEND-grounded chain may shorten that mapping work.

For financial-services buyers specifically, the FFIEC's references to threat-informed defense and the SEC's 8-K disclosure rules both reward a security program that can articulate detection coverage against an external framework, and D3FEND is the obvious external framework while OCSF carries the mapping at the event level. The argument here isn't that OCSF means SEC compliance, but that OCSF makes the part of an SEC disclosure that describes detection coverage cheaper to evidence.

Academic and research

Cybersecurity research grants (NSF Secure and Trustworthy Cyberspace, DARPA programs, NIH for health-sector cybersecurity) increasingly reference formal semantic frameworks in their solicitations. The peer-reviewed standing of BFO and CCO makes OCSF data citable in ways proprietary schemas aren't. A research group studying detection effectiveness across organizations needs a shared semantic frame to compare results; OCSF plus D3FEND provides one. I'd treat the "increasingly" claim as Tier B: observed in solicitations I've read, not a quantitative trend I've measured.

Commercial sector

For commercial-only organizations with no federal exposure, the ontological grounding argument is weaker. A 500-person SaaS company with no defense customers may run perfectly well on Microsoft Sentinel ASIM and never need to think about CCO. The relevant lens for those organizations is optionality. Today's commercial-only company is sometimes tomorrow's federal-RFP respondent, sometimes tomorrow's EU-data-sharing entity, and sometimes tomorrow's acquired-by-a-prime-contractor entity. OCSF preserves the schema-level option to engage with those futures, whereas proprietary schemas require migration work to engage with them. The cost of preserving the option today is roughly zero while the cost of migrating later is high, and that asymmetry makes the case for OCSF in commercial contexts even where the federal argument doesn't bite directly.

When the chain doesn't matter

The honest version of "OCSF for everyone."

The argument I've made is sectoral. It applies most strongly where federal procurement, regulated industry, or cross-border data sharing is in play. There are real organizations for which the ontological grounding adds overhead without proportional benefit, and I want to name those clearly rather than pretend the standards stack is universally applicable.

  • Small-scale, single-vendor environments. A 200-employee company running all of Microsoft 365 plus Sentinel does not benefit much from OCSF's vendor neutrality. Sentinel ASIM works inside the Microsoft ecosystem; the gap shows up only when crossing out of it.
  • No compliance requirements. If there's no federal customer, no regulated industry, and no cross-border data sharing, the compliance pathway argument simply doesn't apply. The other arguments for OCSF (vendor neutrality, ecosystem composition with Iceberg and Arrow) still hold, but they're weaker than the federal argument.
  • Under 1 TB/day of security data. At small scales, the cost of running an OCSF mapping layer may exceed the benefit. The lakehouse arguments I make elsewhere assume a certain scale to break even against a SIEM. Below that scale, the simplest path is often the right one.
  • Active SIEM migration in progress. If a team is mid-migration on a Sentinel or Splunk deployment, adding OCSF mapping mid-flight is rarely the right move. Finish the migration, stabilize, and revisit schema choice as a separate decision.

The honest framing is that OCSF plus D3FEND grounding is the right default for federal-exposed and regulated buyers in 2026, the right hedge for commercial buyers who may become federal-exposed in the medium term, and not strictly necessary for small commercial deployments, which is both a sharper and a more defensible recommendation than "OCSF for everyone."

International dimension

What ITU-T standardization may change.

Returning to the ITU-T track I mentioned at the top, the ITU is a UN agency whose Recommendations are referenced by national regulators worldwide, and ratifying OCSF as Recommendation X.icd-schemas would move it into the same category of artifact as the ISO/IEC standards that already underpin BFO: a governance-neutral published standard with formal change control and international standing.

For NATO-aligned threat intelligence sharing, the practical effect may be that allied national-CERT-style organizations can reference a single schema in their cross-border data-sharing agreements. The CCDCOE (the NATO Cooperative Cyber Defence Centre of Excellence) has historically advocated for ontology-grounded approaches, and OCSF now sits inside both the ontology chain (via D3FEND-CCO-BFO) and the ITU-T standards chain, which gives those organizations two reinforcing arguments rather than one.

For Five Eyes collaboration (US, UK, Canada, Australia, New Zealand) the picture is similar. Each of those countries has its own cybersecurity agency (CISA, NCSC, CCCS, ACSC, NCSC-NZ), and each has invested in OCSF mapping at varying levels of public commitment. An ITU-T Recommendation would give those agencies a less politically loaded standards reference than "the schema the US vendor coalition agreed on."

The hedge is two-fold. First, ratification is still pending: member states backed X.icd-schemas in December 2025 and the target is mid-2026, so it is on track but not yet a published Recommendation. Second, ITU-T Recommendations are voluntary, so national regulators choose which to adopt and how to reference them. The trajectory I'm describing takes years to translate into specific cross-border agreements, and the specifics may surprise me. I treat the "ITU-T standardization will lead to broad cross-border OCSF references" claim as Tier D: informed expectation, not documented outcome. The narrower claim, that OCSF is the basis of an active draft ITU-T Recommendation (X.icd-schemas) that proprietary schemas have no equivalent to, is Tier A based on the SG17 documents themselves.

The four-pillar frame

OCSF in the foundational-standards stack.

When I describe the security lakehouse to architects, I name four foundational open standards: Apache Arrow for in-memory representation, Apache Iceberg for the on-disk table format, OCSF for the event schema, and Sigma for detection-rule portability. Each pillar matters on its own, but the composition is what makes the stack work.

OCSF is the pillar that does the most distinctive work for federal and regulated buyers. Arrow and Iceberg are infrastructure-layer standards that govern how data moves and how it sits at rest, and they're vendor-neutral in the operational sense. Sigma is a detection-portability standard that governs how rules move between SIEMs and lakehouses. OCSF touches the compliance argument directly, because the schema is where the formal semantics live and where D3FEND mapping is materially expressed.

The four pillars compose. An organization can write detections in Sigma (analyst-portable), execute them against OCSF-shaped data (semantically grounded), store that data in Iceberg tables (vendor-neutral at the storage layer), and move it across systems as Arrow buffers (columnar-native at the wire layer). For each layer there is a non-proprietary option with documented production validation and international standards-body recognition or its equivalent. That composition, more than any single performance number, is what actually argues for the lakehouse over a vendor-controlled SIEM.
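The detection-over-schema step of that composition can be sketched in a few lines. This is a minimal illustration, assuming a Sigma-style selection that uses only exact match plus the `contains` and `endswith` modifiers, evaluated against an OCSF-shaped event held as a nested dict. The `get_path` and `matches` helpers are hypothetical, the field names and sample values are illustrative rather than a validated OCSF mapping, and a real pipeline would use a proper Sigma backend instead of this hand-rolled matcher.

```python
from typing import Any


def get_path(event: dict, dotted: str) -> Any:
    """Walk a dotted path such as 'process.cmd_line' through nested dicts."""
    current: Any = event
    for part in dotted.split("."):
        if not isinstance(current, dict) or part not in current:
            return None
        current = current[part]
    return current


def matches(selection: dict, event: dict) -> bool:
    """Return True when every condition in the selection holds (logical AND)."""
    for key, expected in selection.items():
        # Sigma-style keys look like 'field' or 'field|modifier'.
        field, _, modifier = key.partition("|")
        actual = get_path(event, field)
        if actual is None:
            return False
        if modifier == "":
            ok = actual == expected
        elif modifier == "contains":
            # String comparisons are case-insensitive in this sketch.
            ok = str(expected).lower() in str(actual).lower()
        elif modifier == "endswith":
            ok = str(actual).lower().endswith(str(expected).lower())
        else:
            raise ValueError(f"unsupported modifier: {modifier}")
        if not ok:
            return False
    return True


# Hypothetical selection written against OCSF-style dotted field names.
selection = {
    "class_uid": 1007,
    "process.name|endswith": "powershell.exe",
    "process.cmd_line|contains": "-enc",
}

# Illustrative OCSF-shaped events; values are made up for the example.
suspicious = {
    "class_uid": 1007,
    "process": {"name": "powershell.exe", "cmd_line": "powershell.exe -enc SQBFAFgA"},
}
benign = {
    "class_uid": 1007,
    "process": {"name": "powershell.exe", "cmd_line": "powershell.exe -File backup.ps1"},
}

print(matches(selection, suspicious))
print(matches(selection, benign))
```

The point of the sketch is that the rule refers only to schema field names, so the same selection would apply to any source that has been normalized into that shape, regardless of which product produced the original log.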

The OCSF-specific contribution to the composition is the federal-ready posture. The other three pillars work for any analytical workload, while OCSF is the layer where the procurement language and the compliance argument actually attach. Without it, the stack is still good engineering; with it, the stack is good engineering plus a defensible standards story for buyers who need one.

Practical guidance

What to do in 2026.

Four moves I'd recommend to a security architect evaluating schema choice today, ordered by how cheap they are to execute:

  • Track OCSF v1.3.0+ adoption in the products you already run. AWS Security Lake emits OCSF natively. Snowflake's security data offerings emit OCSF. Splunk and Microsoft both ship OCSF mappings as optional outputs. If you're already in one of these ecosystems, OCSF is often available without changing platforms.
  • For new architectural decisions, treat OCSF as the default schema. The compatibility cost is low, the future-optionality value is high, and the federal-ready posture is materially stronger than any proprietary alternative.
  • Push vendors for D3FEND attribute support during contract renewals. If a vendor claims OCSF support, ask whether they emit the d3fend attribute. The gap between "OCSF-shaped fields" and "full v1.3.0 including D3FEND mapping" is the gap between marketing claims and federal-ready posture.
  • Don't conflate schema choice with ATO work. If you're responding to a federal RFP, OCSF helps the schema and detection-mapping pages of the package. It does not replace the ATO process, the control mapping work, or the operational-security demonstration. Budget for the rest of the work accordingly.
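The first and third moves above can be partly checked against sample events a vendor supplies. What follows is a minimal sketch, assuming events arrive as JSON-like dicts, that the schema version is reported in `metadata.version`, and that a D3FEND mapping shows up under a key named `d3fend` somewhere in the event. The exact location of that key is an assumption here (it may vary by event class and schema version), so the search is deliberately recursive. The helper names and the sample events are hypothetical.

```python
from typing import Any, Tuple


def parse_version(text: str) -> Tuple[int, int, int]:
    """Parse 'major.minor.patch' into ints; pre-release suffixes are ignored."""
    core = text.split("-")[0].split("+")[0]
    parts = [int(p) for p in core.split(".")]
    parts = (parts + [0, 0, 0])[:3]  # pad short versions such as '1.4'
    return (parts[0], parts[1], parts[2])


def has_key(node: Any, key: str) -> bool:
    """Report whether a non-empty value is stored under `key` at any depth."""
    if isinstance(node, dict):
        for k, v in node.items():
            if k == key and v:
                return True
            if has_key(v, key):
                return True
    elif isinstance(node, list):
        return any(has_key(item, key) for item in node)
    return False


def assess(event: dict, minimum: str = "1.3.0") -> dict:
    """Summarize schema version and D3FEND presence for one sample event."""
    metadata = event.get("metadata")
    version = metadata.get("version") if isinstance(metadata, dict) else None
    try:
        version_ok = isinstance(version, str) and (
            parse_version(version) >= parse_version(minimum)
        )
    except ValueError:
        version_ok = False  # unparseable version string
    return {
        "version": version,
        "version_ok": version_ok,
        "emits_d3fend": has_key(event, "d3fend"),
    }


# Hypothetical samples: one "OCSF-shaped" only, one carrying a d3fend mapping.
shaped_only = {"metadata": {"version": "1.1.0"}, "class_uid": 2004}
grounded = {
    "metadata": {"version": "1.3.0"},
    # The nesting and the value below are placeholders, not real schema content.
    "remediation": {"d3fend": {"technique": "placeholder"}},
}

for sample in (shaped_only, grounded):
    print(assess(sample))
```

A check like this only shows whether the field is populated in the samples provided. It says nothing about whether the mapping is correct, and it is not a substitute for the control-mapping work in the fourth move.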

OCSF is not the most exciting standard in the lakehouse stack; Iceberg's V3 features and Arrow's ADBC story attract more attention. But OCSF may be the standard that does the most distinctive work for federal-exposed buyers, because it's the only layer where the formal-semantics argument and the procurement-language argument both attach. The D3FEND-CCO-BFO chain is the academic credibility layer that makes the schema defensible, while the ITU-T standardization track is the international standards-body layer that makes it governance-neutral. Both have advanced in the last 18 months, so both compound, though neither one on its own amounts to an Authority to Operate.