Security Data Works


MCP beyond chat. If it works.

Most people think MCP — the Model Context Protocol Anthropic open-sourced in November 2024 — is about making chatbots smarter by giving them access to your calendar, your email, your Slack. That's one use case. In November 2025, two infrastructure vendors signaled a more interesting one: MCP as the orchestration layer for AI-generated security data pipelines. The claims are big. The validation isn't here yet. This page is where I'm tracking it.

What MCP actually is

One protocol for AI-to-tool integration. The same problem, solved once.

Every AI application needs to connect to external systems — databases, APIs, file stores, SaaS platforms. Before MCP, every vendor built proprietary connectors. Anthropic wrote a Google Calendar connector for Claude. OpenAI wrote a different Google Calendar connector for ChatGPT. Same data source, two integrations, and the count multiplies with every new AI client and every new data source: the classic N×M integration problem. MCP standardizes the interface. Build one MCP server for your data; any MCP-compatible AI client can use it.
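The build-once shape of the idea fits in a few lines. This is a toy stand-in, not the real MCP wire protocol or SDK: one server owns a tool registry with schemas, any number of clients discover and call the same tools through one interface. All names here are illustrative.

```python
class ToolServer:
    """Toy stand-in for an MCP server: one registry of tools, each
    described by a name and an input schema, callable by any client
    that speaks the same interface."""

    def __init__(self, name):
        self.name = name
        self.tools = {}

    def tool(self, name, schema):
        """Decorator that registers a function as a callable tool."""
        def register(fn):
            self.tools[name] = {"schema": schema, "fn": fn}
            return fn
        return register

    def list_tools(self):
        # Clients discover capabilities instead of hard-coding them.
        return [{"name": n, "schema": t["schema"]} for n, t in self.tools.items()]

    def call(self, name, arguments):
        # One dispatch path, shared by every AI client.
        return self.tools[name]["fn"](**arguments)

# One server for the data source...
calendar = ToolServer("calendar")

@calendar.tool("list_events", {"type": "object",
                               "properties": {"day": {"type": "string"}}})
def list_events(day):
    return [{"title": "standup", "day": day}]

# ...and any number of AI clients reuse it unchanged.
events = calendar.call("list_events", {"day": "2025-11-11"})
```

The point of the sketch is the asymmetry: the connector is written once on the data-source side, and every additional client is free.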

The original use case is the obvious one — Claude reading your Slack, ChatGPT querying your Postgres database, an AI agent searching your Google Drive. Useful, table stakes for enterprise AI agents, the protocol's reason for existing. The interesting move in late 2025 was watching infrastructure vendors — not GenAI application vendors — start using MCP for something different.

The expansion

Two announcements, two weeks apart, from vendors that don't usually compete.

Tenzir: AI-generated OCSF parsers.

Tenzir announced an MCP server on November 11, 2025. The pitch: paste a single log sample — EDR, cloud service, identity provider, anything — and the MCP server generates a complete parser, an OCSF mapping (OCSF being the Open Cybersecurity Schema Framework, the multi-vendor schema standard), a test suite, and a deployable package. Tenzir's framing: "100% hands-off keyboard." Production-ready integration in one conversation. Their CEO Matthias Vallentin's stronger framing: "the power dynamic just flipped" — vendors no longer control the integration catalog; customers do.
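To make "an OCSF mapping" concrete, here is a hand-written sketch of the kind of artifact such a generator would emit for an authentication log. The input field names are made up, and the OCSF IDs follow my reading of the public schema (Authentication is class_uid 3002); none of this is Tenzir's actual output.

```python
def map_login_event(raw: dict) -> dict:
    """Map a vendor login record to an OCSF-shaped Authentication event.
    Input field names are illustrative; class_uid 3002 is the OCSF
    Authentication class per the public schema."""
    return {
        "class_uid": 3002,                    # OCSF Authentication class
        "activity_id": 1,                     # 1 = Logon
        "time": raw["timestamp"],
        "user": {"name": raw["username"]},
        "src_endpoint": {"ip": raw["client_ip"]},
        "status_id": 1 if raw["outcome"] == "SUCCESS" else 2,  # 1=Success, 2=Failure
    }

event = map_login_event({
    "timestamp": "2025-11-11T09:30:00Z",
    "username": "alice",
    "client_ip": "203.0.113.7",
    "outcome": "SUCCESS",
})
```

Even this trivial case shows where the judgment lives: choosing the class, the activity, and the status semantics, not the field renames.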

The old workflow for OCSF integration ran 2–5 days for an experienced engineer per vendor source: read vendor documentation, map fields manually, write parser code, write tests, deploy, debug. Plus the ongoing maintenance burden when the vendor changes their schema. The new workflow Tenzir is claiming runs in hours, with regeneration replacing manual updates when schemas drift. If accurate, that's roughly a 10× shift in the economics of security data integration. If the accuracy is closer to 80%, the gain is closer to 2–3× — still real, but less transformative.
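The 10× versus 2–3× arithmetic can be made explicit with a toy cost model: the expected cost of the AI workflow is generation time plus the probability of manual rework times its cost. All the numbers below are assumptions for illustration, not measurements.

```python
def effective_speedup(manual_hours, gen_hours, rework_fraction, rework_hours):
    """Toy model: expected AI-workflow cost is generation time plus
    the chance of needing manual rework times the rework cost."""
    ai_hours = gen_hours + rework_fraction * rework_hours
    return manual_hours / ai_hours

# Assumed baseline: 3 working days (24 hours) of manual integration per source.
best_case = effective_speedup(24, 2, 0.0, 12)   # everything works
messy_case = effective_speedup(24, 2, 0.5, 12)  # half the parsers need rework
```

Under these assumed numbers the clean case lands at 12× and the messy case at 3×, which is the shape of the gap the paragraph above describes: the rework fraction, not the generation speed, dominates the economics.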

Databricks: governance for AI agents.

Databricks announced MCP Catalog on November 7, 2025 — 576 LinkedIn reactions, 19 comments. The product: a central registry for MCP servers, integrated with Unity Catalog governance, including a marketplace of trusted MCP servers from S&P Global, Moody's, Dun & Bradstreet, FactSet, Glean, and Nasdaq. Plus "Agent Bricks," a multi-agent orchestration layer that uses MCP Catalog servers as building blocks. The framing isn't "chat with your data better." The framing is MCP as the control plane for AI agents operating on enterprise data platforms with proper governance. A security analyst can ask an agent to investigate unusual wire transfers; the agent uses MCP servers to pull entity risk profiles, query governed transactional data through Unity Catalog, cross-reference sanctions lists, and generate an investigation report — with Unity Catalog enforcing access control across the whole flow. That's not a chatbot use case.

Why this would matter for security data engineering

The integration bottleneck is most of the work.

A typical enterprise security stack runs 100+ vendors with proprietary log formats, APIs, and schemas. Every SIEM-or-data-lake integration requires a custom parser per source. Vendor roadmaps decide when your specific log source gets first-class support. The maintenance burden compounds because vendor schemas change. Roughly 80% of security data engineering time is integration grunt work — vendor API connectors, schema mapping, parser maintenance, test case generation. Real architecture work is the other 20%.

Two consequences if AI-generated parsers actually work at production quality. The first: vendor integration timelines collapse from "wait for the SIEM roadmap" to "paste log sample, generate parser, deploy in hours." Vendor roadmaps lose leverage; customers control the integration timeline. The second: OCSF adoption economics change. The current barrier to OCSF is the manual mapping cost (days per vendor) and the scarce expertise required to do it correctly. If AI handles the mapping at high accuracy, the OCSF barrier comes down and adoption accelerates.

Both consequences are conditional on the same thing: production-grade accuracy. That's the part that hasn't been validated.

The skepticism I can't shake yet

Three failure modes that would matter, even if the demo works.

The 80% solution problem.

AI might generate parsers that work for 80% of common log samples and fail on edge cases — nested JSON structures, vendor-specific fields with unstable semantics (Okta's debugContext field is a classic example), multi-format logs from the same vendor across different products with incompatible schemas, and the enrichment logic that goes beyond field mapping into joining context from other sources. If AI-generated parsers need 20% manual correction in production, the productivity gain shrinks from 10× to 2–3×. Still useful, not transformative.
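A concrete version of the nested-JSON failure: a parser generated from flat samples keeps top-level scalar fields and silently drops everything nested. The field names below mimic Okta's shape but are illustrative, not Okta's actual log format.

```python
import json

def naive_parse(line: str) -> dict:
    """The kind of parser a flat sample teaches: keep top-level
    scalar fields, discard anything nested."""
    record = json.loads(line)
    return {k: v for k, v in record.items()
            if not isinstance(v, (dict, list))}

sample = json.dumps({
    "eventType": "user.session.start",
    "actor": "alice",
    # The signal an investigation actually needs lives one level down.
    "debugContext": {"debugData": {"threatSuspected": "true"}},
})

parsed = naive_parse(sample)
# 'debugContext' is gone: the parser "works" and still loses the signal.
```

This is the insidious variant of the 80% problem: nothing crashes, the pipeline stays green, and the dropped field only matters during the investigation that needed it.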

The production reliability problem.

Generating a parser in a demo is easy. Running one reliably in production is the hard problem. Error handling for malformed logs and unexpected fields. Performance characterization across the 10 GB/day to 10 TB/day spectrum. Schema drift detection — when the vendor changes format, does the parser break loudly or silently? Observability for debugging AI-generated code that no human wrote. If the generated parser fails in production and requires manual debugging by an engineer who didn't write it, the maintenance burden returns at higher cognitive cost than the manual baseline.
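The loud-versus-silent distinction can be made mechanical: snapshot the field set a parser was generated against, and report every deviation instead of emitting half-mapped events. A minimal sketch, with assumed field names:

```python
def check_drift(expected_keys: set, record: dict) -> list:
    """Compare a record's fields against the schema snapshot the
    parser was generated from; return findings instead of silently
    emitting half-mapped events."""
    seen = set(record)
    findings = []
    for key in sorted(expected_keys - seen):
        findings.append(f"missing expected field: {key}")
    for key in sorted(seen - expected_keys):
        findings.append(f"unexpected new field: {key}")
    return findings

baseline = {"timestamp", "username", "client_ip"}
# Vendor renamed 'username' to 'user' and added 'geo': both should be loud.
findings = check_drift(baseline, {"timestamp": "2025-11-11T09:30:00Z",
                                  "user": "alice",
                                  "client_ip": "203.0.113.7",
                                  "geo": "US"})
```

Whether an AI-generated parser ships with anything like this baked in is exactly the kind of production detail the demos don't show.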

The OCSF complexity problem.

OCSF has 40+ event classes, 200+ attributes, and complex inheritance hierarchies. Mapping vendor logs to OCSF requires understanding which event class actually applies (Authentication versus Account Change versus User Access Management — these are distinct, and vendors mix them in their own logs), how to handle vendor-specific fields that OCSF doesn't have a slot for, and the enrichment logic that goes beyond field mapping. If AI consistently picks the wrong OCSF class, downstream analytics break in subtle ways — detection rules don't match because the events are filed under the wrong category, and the failure mode is "we missed the alert" rather than "the parser crashed." Garbage-in, garbage-out, but the garbage is OCSF-shaped.
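The "we missed the alert" failure mode is easy to demonstrate: a detection rule keyed on the Authentication class never sees an event the mapper filed under Account Change. The class IDs follow my reading of the OCSF schema (3002 Authentication, 3001 Account Change); the rule itself is a toy.

```python
def failed_login_rule(event: dict) -> bool:
    """Toy detection rule: flag failed authentications.
    Keyed on class_uid, as real OCSF detection content tends to be."""
    return event.get("class_uid") == 3002 and event.get("status_id") == 2

# The same underlying failed login, filed under two different classes.
correct = {"class_uid": 3002, "status_id": 2}   # Authentication
miscast = {"class_uid": 3001, "status_id": 2}   # Account Change

alert_fires = failed_login_rule(correct)    # the rule matches
alert_missed = failed_login_rule(miscast)   # nothing crashes, nothing fires
```

The miscast event is valid OCSF, passes schema validation, and lands in the lake. Only the detection outcome is wrong, and nobody gets paged about a rule that quietly stopped matching.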

What I'm testing, and what would change the answer

The hypothesis is testable. The lab roadmap covers it.

The hypothesis I'm tracking: the Model Context Protocol is being adopted beyond GenAI chat interfaces to data pipeline automation and governance, enabling AI-generated infrastructure code for security data engineering. Current confidence: 3 out of 5 — interesting pattern, vendor-only evidence, no production validation.

What's planned for Q1 2026: install the Tenzir MCP server, run it against five real security log sources (Okta, CrowdStrike EDR, AWS CloudTrail, plus two more), measure OCSF mapping accuracy with the goal of characterizing it above and below the 90% threshold, and document failure modes specifically. The comparison case is the manual OCSF mapping effort for the same sources. Results — positive, negative, or mixed — will land on the lab page when the work is complete. No predetermined conclusion; the value of publishing is in being on the record either way.
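The 90% threshold implies a measurement procedure. A minimal field-level version scores a generated mapping against a hand-built golden mapping for the same source; everything here, including the scoring granularity, is an assumption about how the lab work might be set up, not a description of it.

```python
def mapping_accuracy(golden: dict, generated: dict) -> float:
    """Fraction of golden mapping entries the generated mapping got
    exactly right. Field-level scoring; class-level errors could be
    weighted more heavily in a real harness."""
    if not golden:
        return 0.0
    correct = sum(1 for k, v in golden.items() if generated.get(k) == v)
    return correct / len(golden)

# Hypothetical golden mapping (OCSF attribute -> vendor source field).
golden = {"class_uid": 3002, "activity_id": 1, "user.name": "username",
          "src_endpoint.ip": "client_ip", "status_id": "outcome"}
# Hypothetical AI output: one field sourced from the wrong place.
generated = {"class_uid": 3002, "activity_id": 1, "user.name": "username",
             "src_endpoint.ip": "client_ip", "status_id": "result"}

score = mapping_accuracy(golden, generated)
passes = score >= 0.90
```

Here four of five entries match, so the score is 0.8 and the source fails the bar. The harder design question the real harness has to answer is weighting: a wrong class_uid should probably count for more than a wrong field rename.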

What would move the confidence up. Production case studies showing OCSF mapping accuracy above 90% on diverse sources. A second wave of vendors beyond Tenzir and Databricks adopting MCP for data engineering rather than just chat. Independent benchmarks on parser accuracy across difficult log formats.

What would move the confidence down. Testing showing accuracy below 70%, requiring extensive manual correction. No production adoption beyond the original announcements by Q4 2026. Vendor positioning that quietly retreats from "100% hands-off" to "starting point requiring manual completion" — which would be honest but would also reframe the value proposition.

For now: this is a promising pattern worth watching, not a validated architectural shift. Plan accordingly. Don't bet your integration strategy on unvalidated claims; don't ignore the pattern either.

Promising pattern. Validation pending. Updates planned.

The lab roadmap on the lab page covers what's being tested next. The AI-architecture hypothesis on the research page sits in the same neighborhood as this one, with its own tracked confidence.