Splunk federated integration with the MOAr stack

Methodology

Integration pattern for organizations with deep SPL investment that need lakehouse economics on 30+ day data. Keep Splunk as the analyst UI; route high-value signals to Splunk indexers (30-day hot) and all data to Iceberg (1–3 year cold); bridge them with DB Connect (production, on-prem + cloud) or Federated Search (Splunk Cloud only, beta). Analysts keep SPL; storage cost goes to lakehouse rates.

3.6–10.1×

Speedup over the schema-on-read SIEM foil (OpenSearch 2.18.0, 2.854 s) on the canonical sdw-lab zeek-flagship-rerun (10M Zeek events, single host, Tier B): Trino 3.6× (0.795 s), StarRocks 8.3× (0.343 s), ClickHouse-on-Iceberg 10.1× (0.282 s). The old "145×" headline, which was ClickHouse native's single-query best case, is superseded by the CV-gated OpenSearch re-run: ClickHouse native now measures 46.8× on the five-query average (21–62× on the hunting-shaped queries). Methodology and code are public. The schema-on-read tier degrades roughly 8× from 1M to 10M events. The win is the tier move, so pick the cold-tier engine on catalog maturity, concurrency, and operating cost rather than on the spread between those multipliers.
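
As a quick check on the arithmetic, each multiplier is the OpenSearch latency divided by the engine's latency. A minimal Python sketch using only the latencies quoted above; the variable and function names are illustrative and do not come from the benchmark repository:

```python
# Latencies in seconds, copied from the figures quoted in the text.
BASELINE_S = 2.854  # OpenSearch 2.18.0, the schema-on-read foil

ENGINE_LATENCY_S = {
    "Trino": 0.795,
    "StarRocks": 0.343,
    "ClickHouse-on-Iceberg": 0.282,
}


def speedup(baseline_s: float, engine_s: float) -> float:
    """Return how many times faster the engine is than the baseline, to one decimal."""
    return round(baseline_s / engine_s, 1)


speedups = {name: speedup(BASELINE_S, s) for name, s in ENGINE_LATENCY_S.items()}

for name, factor in speedups.items():
    print(f"{name}: {factor}x")
```

The three results reproduce the 3.6×, 8.3×, and 10.1× figures, which is where the 3.6–10.1× headline range comes from.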

The pipeline

Route. Cribl / Tenzir / Vector. Dual-write: high-value to Splunk, all data to Iceberg.
→ Hot tier. Splunk indexers (30 days). Real-time alerts; native SPL speed for recent data.
→ Cold tier. Iceberg on S3 (1–3 years). OCSF-normalized; columnar Parquet; partition pruning.
→ Bridge. DB Connect (JDBC) or Federated Search. SPL query plane spans both tiers.
→ Query. Trino / Dremio / StarRocks / ClickHouse. Standard SQL on the cold tier; detection content portable.
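
The dual-write rule at the head of the pipeline (everything to the cold tier, only high-value signals to the hot tier) can be sketched in a few lines of Python. This is an illustration of the routing logic only, not a Cribl, Tenzir, or Vector configuration. The `DualWriteRouter` class, the `severity` field, and the threshold of 7 are all hypothetical, and the two lists stand in for the real Splunk and Iceberg writers:

```python
from dataclasses import dataclass, field
from typing import Callable

Event = dict  # a parsed log event; the field names used below are assumptions


@dataclass
class DualWriteRouter:
    """Send every event to the cold tier and only high-value events to the hot tier."""

    is_high_value: Callable[[Event], bool]
    hot: list = field(default_factory=list)   # stand-in for Splunk indexers
    cold: list = field(default_factory=list)  # stand-in for an Iceberg writer

    def route(self, event: Event) -> None:
        self.cold.append(event)          # all data lands in Iceberg
        if self.is_high_value(event):    # only selected signals reach Splunk
            self.hot.append(event)


def example_is_high_value(event: Event) -> bool:
    # Hypothetical rule: treat severity >= 7 as "high-value".
    return event.get("severity", 0) >= 7


router = DualWriteRouter(is_high_value=example_is_high_value)
for ev in [
    {"src": "10.0.0.5", "severity": 2},
    {"src": "10.0.0.9", "severity": 9},
    {"src": "10.0.0.7"},
]:
    router.route(ev)

print(len(router.hot), len(router.cold))  # prints: 1 3
```

Keeping the "high-value" predicate as a pluggable function reflects that the selection rule is a per-organization decision and is not defined by the pattern itself.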

What composes, what’s brittle

Why this works. Splunk's schema-on-read parsing tax shows up on repeated dashboards; pre-normalize to OCSF and pay once.
DB Connect. Production-grade, on-prem + cloud; 2–5× slower than the schema-on-read SIEM's native search (single host, Tier B), 10–50× cheaper storage.
Federated Search. Splunk Cloud only, beta; direct Iceberg; DSU "use-it-or-lose-it" meter — high-frequency queries can cost more than ingestion.
SPL feature gaps. transaction, datamodel, inputlookup, and real-time alerting are not supported on federated sources.
Best fit. Heavy SPL investment + long-retention compliance + analyst retraining cost too high to swallow at once.
What's brittle. Federated Search beta SLA; SPL → SQL translation gaps; multi-cloud catalog independence (Glue vs. Polaris).
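
To make the tier split and the SPL gaps concrete, here is a rough Python sketch of the decision an analyst (or a wrapper tool) makes before running a search: which tiers a time range touches, and whether the search uses a command listed above as unsupported on federated sources. The function names are hypothetical and nothing here is a Splunk API. The 30-day boundary and the command list are taken from the text, and the pipe-splitting parser is deliberately naive (it does not handle pipes inside quoted strings or subsearches):

```python
from datetime import datetime, timedelta, timezone

HOT_RETENTION = timedelta(days=30)  # hot-tier window stated in the pattern
# Commands the text lists as unsupported on federated sources.
UNSUPPORTED_ON_FEDERATED = {"transaction", "datamodel", "inputlookup"}


def plan_tiers(earliest: datetime, latest: datetime, now: datetime) -> list:
    """Return the tiers a search over [earliest, latest] has to read."""
    hot_floor = now - HOT_RETENTION
    tiers = []
    if earliest < hot_floor:
        tiers.append("cold")  # part of the range is older than 30 days
    if latest >= hot_floor:
        tiers.append("hot")   # part of the range is within 30 days
    return tiers


def unsupported_commands(spl: str) -> set:
    """Naively collect the first token of each pipeline stage and flag known gaps."""
    commands = set()
    for stage in spl.split("|"):
        tokens = stage.strip().split()
        if tokens:
            commands.add(tokens[0].lower())
    return commands & UNSUPPORTED_ON_FEDERATED


now = datetime(2026, 1, 1, tzinfo=timezone.utc)
print(plan_tiers(now - timedelta(days=90), now, now))
print(unsupported_commands("search index=zeek | transaction src_ip | stats count"))
```

A search that touches the cold tier and uses a flagged command is a candidate for rewriting in SQL against the cold-tier engine, which is where the SPL → SQL translation gaps noted above come into play.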

Sources: SDW Splunk DB Connect benchmark, December 2025 (github.com/flying-coyote/splunk-db-connect-benchmark) · Splunk Federated Search for S3 documentation · Cisco Data Fabric announcement (Sep 2025) · Splunk .conf25 federation evolution

