H-ARCH-12 · Tier C · 3/5
AI-native vs. AI-augmented.
Every security vendor is suddenly "AI-powered." Every data platform is "AI-ready." Beneath the AI-washing, a smaller cohort is doing something architecturally different — rebuilding data infrastructure from scratch for AI-first workflows, rather than adding chatbots to the BI surface that's already there. This page tracks the hypothesis, the convergent evidence, and the validation gaps.
The market signal
"The analytics world is moving from BI to AI."
That's Jay Kreps — the creator of Apache Kafka and the CEO of Confluent — announcing in November 2025 that Confluent and Databricks were entering a formal partnership to connect real-time streaming with lakehouse analytics specifically for AI workloads. The post pulled 802 reactions on LinkedIn, the highest engagement captured across hundreds of data engineering posts I've tracked through the year. Kreps doesn't run a startup; he defined the streaming infrastructure category. When his framing for the next decade is "BI to AI," that's a category-level statement, not a marketing line.
Kreps' wording: "this isn't just about enabling insights, it's about powering actions. To make this real, analytical data needs to work at the speed of operational applications." Insights to actions. Batch to real-time. Human dashboards to agent workflows. That's not adding an AI feature; that's the data infrastructure layer admitting that its legacy patterns have reached the end of their useful life.
Within two months of that announcement, three more independent vendors made convergent moves. The consistency of the signal across vendors that compete with each other is what makes this worth tracking as a hypothesis rather than dismissing as one vendor's marketing pivot.
Convergent moves
Two competitors. Same diagnosis. Different rebuilds.
Cribl: agentic telemetry.
Clint Sharp, Cribl CEO, at CriblCon25 in October 2025: "The agentic AI era is here, and it's pushing traditional telemetry infrastructure past its breaking point. Legacy systems were built for humans reading logs. They will collapse under the load of AI agents issuing thousands of queries a minute." Cribl's framing for the rebuild: "next generation architecture" unifying human, machine, and AI-generated context into one intelligent data layer, with a performance claim of "10× the queries at half the cost."
The performance claim is unverified — no production customer data has been published. The framing is the meaningful signal: Cribl is saying its own customers' query patterns are about to grow by an order of magnitude in volume, and the existing telemetry pipeline can't carry that.
Tenzir: AI-orchestrated streaming fabric.
Matthias Vallentin, Tenzir CEO, in late 2025: "Your security vendor just told you they'll add support for that critical data source 'in Q3 next year.' Cool. Cool cool cool. This is the game we've all been playing. Vendors control the integration catalog. Customers wait. Everyone pretends this is fine. It's not fine." Tenzir's rebuild is an MCP server (the Model Context Protocol — a standard interface for AI agents to call tools and access data) that AI-generates parsers, OCSF mappings, and test suites from a single log sample. Their positioning: "100% hands-off keyboard." Performance claim: 100+ Gbps ingest with zero-copy Apache Arrow.
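To make the "parser, mapping, and tests from a single log sample" workflow concrete, here is a minimal sketch of its shape. This is not Tenzir's implementation: the sample log line, the key=value heuristic standing in for an LLM-proposed grammar, and the tiny OCSF-like field subset are all illustrative assumptions.

```python
import json
import re

# Hypothetical sketch: one log sample in, parser + OCSF-style mapping +
# regression test out. A real system would have a model propose the grammar
# and the full OCSF schema mapping; this stub only detects key=value pairs.

SAMPLE = '2025-11-04T09:12:33Z host=fw-01 action=deny src=10.0.0.5 dst=8.8.8.8'

def draft_parser(sample: str) -> re.Pattern:
    """Derive a naive key=value parser from the sample's structure."""
    return re.compile(r'(\w+)=(\S+)')

def to_ocsf(fields: dict) -> dict:
    """Map extracted fields onto a tiny, illustrative OCSF-like subset."""
    return {
        "class_name": "Network Activity",           # illustrative OCSF class
        "src_endpoint": {"ip": fields.get("src")},
        "dst_endpoint": {"ip": fields.get("dst")},
        "disposition": fields.get("action"),
    }

def generated_test(sample: str) -> bool:
    """The auto-generated regression test: re-parse the sample it came from."""
    fields = dict(draft_parser(sample).findall(sample))
    return to_ocsf(fields)["src_endpoint"]["ip"] == "10.0.0.5"

parsed = dict(draft_parser(SAMPLE).findall(SAMPLE))
print(json.dumps(to_ocsf(parsed), indent=2))
assert generated_test(SAMPLE)
```

The point of the shape is the last function: because the generator emits its own test alongside the parser, the "100% hands-off keyboard" claim reduces to whether those generated tests actually catch mapping errors on real-world log variance.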
Cribl is solving the AI-consuming-data problem (query optimization for agent workloads). Tenzir is solving the AI-generating-integrations problem (pipeline code automation). Same diagnosis — legacy infrastructure inadequate, performance breakthroughs required, new architectural patterns rather than bolt-on features — different parts of the same rebuild.
Databricks MCP Catalog: the fourth data point.
Two security-data vendors making bold claims is interesting; the lakehouse leader making the same kind of move is the data point that promotes the pattern. Databricks announced MCP Catalog in November 2025 (512 LinkedIn reactions): MCP servers integrated with Unity Catalog governance, AI agents accessing data with centralized governance, "enterprises can now securely connect AI agents to external tools and data." When Databricks — 70,000+ customers, $43B valuation as of 2024 — rebuilds their catalog layer for MCP-protocol agents, the AI-native rebuild thesis stops looking like security-vendor marketing and starts looking like an industry-level pattern.
The distinction
Adding features versus rebuilding the foundation.
AI-augmented = adding features.
Dashboard with an LLM chat interface. "Ask questions in natural language." AI-suggested detection rules. Automated anomaly detection. These are AI features bolted onto BI infrastructure. The data still moves through batch ETL (extract-transform-load — the traditional pattern of moving data on a schedule rather than continuously). The dashboards still refresh on a 15-minute cadence. The analysts still stare at the same Kibana or Splunk surface. The AI just adds a chatbot layer on top.
Every SIEM vendor's product announcements over the last 18 months have been variations on this pattern. It is not without value — natural-language queries against BI data are a useful affordance — but it doesn't change what the infrastructure can carry.
AI-native = rebuilding the foundation.
If AI agents will issue thousands of queries per minute (not hundreds of human analysts per day), the infrastructure has to look different. Query engines optimized for machine access patterns rather than human dashboards. Sub-second response times that hold under high concurrency, not 30-second dashboard loads under light load. Context unified across human, machine, and AI-generated data. Cost structure that supports 100× query volume.
If AI can generate parsers and schema mappings from log samples, the integration architecture has to look different. Pipeline-as-code generation rather than click-based UI configuration. MCP-protocol integration rather than human-oriented REST APIs. Test automation for AI-generated code rather than manual QA. Customer ownership of the integration surface rather than vendor gatekeeping.
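The MCP-versus-REST distinction above is a wire-level one. A rough sketch, assuming the JSON-RPC 2.0 framing the Model Context Protocol specifies (the tool name and arguments here are hypothetical, not any vendor's real catalog):

```python
import json

# An MCP client invokes any server-exposed tool through one generic method,
# "tools/call"; the available tools are machine-discoverable via "tools/list".
# Tool name and arguments below are made up for illustration.

def mcp_tool_call(request_id: int, tool: str, arguments: dict) -> str:
    """Build an MCP tools/call request (JSON-RPC 2.0 envelope)."""
    return json.dumps({
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    })

# An agent asking a hypothetical pipeline server to generate a parser:
req = mcp_tool_call(1, "generate_parser", {"log_sample": "host=fw-01 action=deny"})
print(req)

# The human-oriented REST equivalent is a bespoke endpoint per feature
# (e.g. a POST to some /parsers route), discoverable only by reading docs.
```

The difference matters for agents: a single self-describing calling convention means an agent can enumerate and invoke a new vendor's capabilities without anyone writing integration code for that vendor first.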
If analytics is moving from BI to AI, the streaming-to-analytical bridge has to look different. Real-time streams flowing into the analytical lakehouse rather than batch ETL feeding warehouses. Operational speed for AI actions rather than insight discovery latency for humans. Unified governance across streaming and lakehouse rather than siloed tools.
These aren't features. They're architectural prerequisites for AI-first workflows. The vendors making the rebuild commitments are signaling that they think the prerequisite gap is going to start mattering inside the 2026–2027 procurement cycle.
Why now
The 2024–2025 inflection.
The shift from "AI-powered" marketing to "AI-native" rebuilding happened fast, and it's traceable in the vendor positioning archive.
Pre-2024. Vendors add LLM chatbots to dashboards. Cool. Incremental. The chatbot is a feature; the infrastructure underneath is unchanged.
Mid-2024. GenAI agent frameworks (LangChain, CrewAI, AutoGPT) demonstrate agents issuing hundreds of tool calls per task. The shift from "chatbots" to "agents" becomes real. Production engineers start noticing what happens to a SIEM under an agent that can't sleep.
Late-2024 to early-2025. Vendors realize traditional infrastructure can't carry agent query patterns. The "BI to AI" recognition surfaces as internal R&D direction.
October–December 2025. Infrastructure vendors — Cribl, Tenzir, Confluent, Databricks — announce AI-native rebuilds within an eight-week window. These aren't GenAI application vendors (OpenAI, Anthropic). These are the data infrastructure vendors saying their own platforms need fundamental redesign. When the infrastructure layer admits legacy patterns are broken, the shift is real.
What this means in procurement
Stop evaluating AI features. Evaluate AI readiness.
The wrong questions are the ones every vendor RFP response is already optimized to answer: Does your platform have an AI-powered dashboard? Can I ask questions in natural language? Do you have automated anomaly detection? These evaluate whether the AI-augmented feature set is present, not whether the infrastructure underneath can carry agent-scale workloads.
The right questions are sharper. Can your infrastructure handle a 100× increase in query volume from AI agents? What's your query latency on machine access patterns specifically — not human dashboard latency? Can AI generate integrations end-to-end, or do humans still configure them? Do you support the MCP protocol, or only REST APIs designed for human consumption? What's your roadmap for streaming into the analytical data platform, given that real-time action requires it?
Performance benchmarks need to update too. Time-to-insight for human analysts, dashboard load time, and concurrency for 100 human users — those are BI benchmarks. The AI-native equivalents are query latency under 10,000 agent requests per minute, ingest throughput for real-time action workloads, integration generation time for new data sources, and end-to-end streaming-to-action latency. A vendor making AI-native claims who can only show the human-dashboard benchmarks is making AI-augmented claims with a relabeled headline.
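The agent-scale latency benchmark can be sketched as a harness: drive many concurrent synthetic agents against a query endpoint and report tail latency. The backend here is a stub with assumed timings; in a real evaluation `run_query` would be a client call against the vendor's query engine.

```python
import asyncio
import random
import time

# Illustrative harness for "query latency under agent-scale concurrency".
# run_query is a stand-in stub; replace it with a real client call.

async def run_query(_: str) -> None:
    # Stub backend simulating a fast engine (1-5 ms per query, assumed).
    await asyncio.sleep(random.uniform(0.001, 0.005))

async def agent(n_queries: int, latencies: list) -> None:
    """One synthetic agent issuing queries back-to-back, timing each."""
    for i in range(n_queries):
        t0 = time.perf_counter()
        await run_query(f"q{i}")
        latencies.append(time.perf_counter() - t0)

async def benchmark(n_agents: int = 50, queries_per_agent: int = 20) -> float:
    """Run all agents concurrently; return the p95 query latency in seconds."""
    latencies: list[float] = []
    await asyncio.gather(*(agent(queries_per_agent, latencies)
                           for _ in range(n_agents)))
    latencies.sort()
    return latencies[int(0.95 * len(latencies))]

p95 = asyncio.run(benchmark())
print(f"p95 latency under concurrent agent load: {p95 * 1000:.1f} ms")
```

The useful property of this shape is that the same harness runs against an AI-augmented and an AI-native platform, so the comparison is on the benchmark the hypothesis actually cares about rather than on the vendor's own dashboard numbers.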
On timing: shops deploying in 2025 should stick with proven AI-augmented patterns; the AI-native infrastructure isn't production-hardened yet. Shops designing for 2026–2027 should plan around the AI-native prerequisites — real-time streams, agent query optimization, MCP integration. Greenfield deployments in 2025 are the right place to pilot the new patterns in non-production settings while the category matures.
Hypothesis status
H-ARCH-12 · what's known, what isn't, what would change the answer.
The full claim: security data infrastructure is undergoing a shift from human-centric (BI dashboards) to AI-native (agentic systems), requiring fundamental architectural changes for performance, integration automation, and unified context management. Current confidence is 3 out of 5 — meaningful, but not load-bearing yet.
What supports it. Four independent vendors making convergent moves inside an eight-week window. CEO-level positioning, not just product marketing. Convergent diagnosis (legacy infrastructure inadequate) with divergent treatments (query optimization, integration automation, streaming-to-lakehouse). The category-defining inflection quote ("BI to AI") from Jay Kreps, who has structural credibility in the streaming infrastructure space.
What's missing. Production deployments with published metrics. Independent validation of the headline performance claims (Cribl's 10×, Tenzir's 100+ Gbps, Tenzir's "100% hands-off OCSF generation" accuracy). Practitioner adoption beyond early adopters. The current evidence base is Tier C-D — vendor announcements at the CEO level, with the engagement signal as a corroborating proxy for resonance, but no independent reproduction of the performance numbers yet.
What would change the answer. Production deployments of "agentic" telemetry systems with published query-volume and latency metrics. Independent benchmarks of AI-generated OCSF mapping accuracy on real EDR or cloud logs. Practitioner adoption data showing whether AI-native architectures are actually being preferred at the procurement layer, or whether AI-augmented BI tools continue to dominate. Active testing of the Tenzir MCP server is on the lab roadmap; results will land here when the work is complete.
The hypothesis updates as the evidence does.
The other anchor hypotheses — including the existing AI-asymmetry hypothesis this one extends — and the contradictions log are on the research page. The thesis page connects them to the program POV.