Security Data Works

Component reference

Cribl Search — query in place, never rehydrate

An engine-anchored architecture that queries telemetry directly in low-cost object storage. Cribl Stream writes Parquet files partitioned by date and source, and Cribl Search dispatches ephemeral operators to where the data sits. Historical analysis requires no ingestion into the hot SIEM tier.
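To make the partition pruning idea concrete, here is a minimal sketch in plain Python. It assumes a Hive-style `date=.../source=...` prefix layout, which is an illustration only; the actual path template depends on how the Cribl Stream destination is configured, and `partition_prefix` and `prune` are hypothetical helpers, not Cribl APIs.

```python
from datetime import date, timedelta


def partition_prefix(day: date, source: str) -> str:
    """Build a hypothetical Hive-style object-store prefix for one partition."""
    return f"date={day.isoformat()}/source={source}/"


def prune(prefixes, start: date, end: date, sources: set) -> list:
    """Keep only the prefixes whose date and source match the query filter."""
    kept = []
    for prefix in prefixes:
        # "date=2024-01-10/source=dns/" -> {"date": "2024-01-10", "source": "dns"}
        parts = dict(seg.split("=", 1) for seg in prefix.strip("/").split("/"))
        day = date.fromisoformat(parts["date"])
        if start <= day <= end and parts["source"] in sources:
            kept.append(prefix)
    return kept


# 30 days x 3 sources = 90 partitions in the store.
first_day = date(2024, 1, 1)
all_prefixes = [
    partition_prefix(first_day + timedelta(days=offset), source)
    for offset in range(30)
    for source in ("firewall", "edr", "dns")
]

# A hunt over two days of DNS logs only needs to read 2 of the 90 partitions.
hits = prune(all_prefixes, date(2024, 1, 10), date(2024, 1, 11), {"dns"})
print(len(all_prefixes), len(hits))  # 90 2
```

The filter is resolved against path names alone, so the other 88 partitions are never opened. That is the property that keeps queries over cold data cheap when the filter lines up with the partition keys.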

40%

Log volume reduction at Yale New Haven Health across 30,000 endpoints during its migration to Microsoft Sentinel. A separate Fortune 1000 IT services deployment achieved a 99.99% reduction in Virtual SOC traffic using the same pattern.

The pipeline

  1. Edge

    Cribl Edge / Stream

    Route, reshape, reduce; mask PII; normalize to OCSF

  2. Store

    Object storage (Parquet)

    S3 / Azure Blob / Cribl Lake; partitioned by date and source

  3. Engine

    Cribl Search operators

    Dispatched to data; partition pruning; no egress

  4. Serve

    SIEM + SOC

    Only high-fidelity alerts forwarded to Sentinel / Splunk
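The Edge and Serve steps above can be sketched in a few lines of plain Python. This is an illustration of the reduce, mask, and forward logic only, not Cribl pipeline configuration: the field names, the severity threshold, and the helper functions are all hypothetical, OCSF normalization is omitted, and an unsalted hash is a stand-in for whatever masking policy is actually required.

```python
import hashlib
import json

DROP_FIELDS = {"debug", "raw_headers"}  # hypothetical low-value fields
PII_FIELDS = {"user_email", "src_ip"}   # hypothetical PII fields
ALERT_SEVERITY = 4                      # hypothetical forwarding threshold


def mask(value: str) -> str:
    """Replace a PII value with a short, stable SHA-256 digest."""
    return hashlib.sha256(value.encode("utf-8")).hexdigest()[:12]


def reshape(event: dict) -> dict:
    """Drop low-value fields and mask PII fields; keep everything else."""
    out = {}
    for key, value in event.items():
        if key in DROP_FIELDS:
            continue
        out[key] = mask(str(value)) if key in PII_FIELDS else value
    return out


def route(events: list) -> tuple:
    """Return (to_store, to_siem): all reshaped events, and high-severity ones."""
    to_store = [reshape(e) for e in events]
    to_siem = [e for e in to_store if e.get("severity", 0) >= ALERT_SEVERITY]
    return to_store, to_siem


events = [
    {"user_email": "a@example.com", "severity": 1, "debug": "x" * 200, "msg": "login ok"},
    {"user_email": "b@example.com", "severity": 5, "raw_headers": "y" * 200, "msg": "priv esc"},
    {"src_ip": "10.0.0.7", "severity": 2, "msg": "dns query"},
]

to_store, to_siem = route(events)
bytes_in = sum(len(json.dumps(e)) for e in events)
bytes_out = sum(len(json.dumps(e)) for e in to_store)
print(f"stored {len(to_store)} events, forwarded {len(to_siem)}")
print(f"bytes before: {bytes_in}, after: {bytes_out}")
```

Every reshaped event goes to object storage, while only the events at or above the threshold reach the SIEM, which is the split the pipeline relies on.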

What composes, what’s brittle

  • Three-plane model. Management, Control, Data planes; operators are ephemeral.
  • Regional / Geo Split. Co-locating Search with the storage bucket avoids cloud egress fees.
  • Route, reshape, reduce. G-Cloud TCO: 35% savings at 100 GB/day, 64% at 1 TB/day.
  • Why this matters. Historical telemetry stays queryable without SIEM hot-tier rates.
  • Best fit. 7-year compliance retention, forensics, ad-hoc hunts on cold data.
  • What's brittle. Partition discipline; cardinality choices materially affect query performance.
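The cardinality point in the list above comes down to simple arithmetic. The sketch below compares two partition keys over a 7-year retention window at 100 GB/day. The count of 20 sources is a made-up figure, the per-hostname case borrows the 30,000-endpoint number purely for scale, and data is assumed to spread evenly across key values.

```python
RETENTION_DAYS = 7 * 365  # 7-year retention, ignoring leap days


def partition_count(days: int, distinct_key_values: int) -> int:
    """Total partitions when partitioning by date plus one other key."""
    return days * distinct_key_values


def avg_partition_mb(daily_gb: float, distinct_key_values: int) -> float:
    """Average data per partition per day, assuming an even spread (1 GB = 1000 MB)."""
    return daily_gb * 1000 / distinct_key_values


DAILY_GB = 100
for label, cardinality in (("source", 20), ("hostname", 30_000)):
    total = partition_count(RETENTION_DAYS, cardinality)
    size = avg_partition_mb(DAILY_GB, cardinality)
    print(f"{label}: {total:,} partitions, about {size:,.2f} MB each")
# source: 51,100 partitions, about 5,000.00 MB each
# hostname: 76,650,000 partitions, about 3.33 MB each
```

Under these assumptions, a high-cardinality key produces tens of millions of very small partitions, so a query spends most of its effort listing and opening objects rather than scanning data. A low-cardinality key keeps partitions large enough to scan efficiently while still pruning by date.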

Sources: Cribl engineering blog · Yale New Haven Health published case · G-Cloud TCO figures (35% at 100 GB/day, 64% at 1 TB/day)

See how the pattern lands on your workload.

The matrix scoring that justified each reference architecture's tool choices is the paid deliverable. The benchmark behind it is public — reproduce it on your own workload, then book a call to scope the work.