Public production architecture teardowns
Real, named, validated.
The strongest evidence in the catalog: production deployments at named organizations, reconstructed from the public record (conference talks and engineering blogs with measured outcomes). These are not SDW engagements; they are trustworthy because the facts are independently verifiable, citations and all. The validators wall below shows the broader teardown evidence base.
Production validators
Proven at scale, in regulated industries.
These are the production teardowns the catalog is built on — nine named deployments across three tiers (internet/cloud scale, regulated industries, security-specific), running on the engines the catalog details. Real, validated, publicly attributable; citations link out.
Internet & cloud scale
Cloudflare
Quadrillion-row scale: 1.61 quadrillion (Q) events queried in < 2 s. DNS analytics, bot management, security logging.
ClickHouse
Comcast
10+ PB security data fabric. Hot retention > 1 year. 50K IOCs swept across 10 PB in < 30 min.
Snowflake
Pinterest
Zero Trust FGAC at the Trino query layer. Credential Vending Service issues per-user temporary STS tokens for security analysts querying massive S3 telemetry.
Trino / Presto
Regulated industries
Standard Chartered
Global bank replaced a traditional SIEM with a self-managed, distributed multi-cloud lakehouse. 80% faster time-to-detect, 92% faster investigation (bank-reported, Data + AI Summit 2025).
Databricks
Bank Hapoalim
Federated data-lakehouse on Trino/Starburst. Cross-border access governance plus near-real-time AML monitoring on the federated data; analytics-led, not a SOC-telemetry build.
Trino (via Starburst)
DNB
Norway's largest financial group. Its Cyber Defense Center moved off Databricks onto an in-house DuckDB architecture.
DuckDB (via Ibis)
Security-specific deployments
Palo Alto Networks
Cortex XSIAM real-time security monitoring via stream processing. Mitigates threats with minimal delay at extreme event volumes.
RisingWave
RunReveal
Security data platform built natively on ClickHouse for HTTP analytics and massive log aggregations.
ClickHouse
Ziggiz.ai
Cyber Lakehouse-as-a-Service. 30-50% cost reduction vs. three leading SIEMs (Ziggiz-published); onboarding cut from 9 months to 5 days.
Databricks
Sources: Cloudflare, Snowflake, and Databricks customer case studies; published case material per company.
Public production architecture teardown
$5K/mo
Huntress on ClickHouse
MDR/EDR business operating at fleet scale. Replaced Elasticsearch with ClickHouse Cloud on the same workload — driven by economics, not vendor advocacy…
Read the breakdown →
250 GB/min
Okta on DuckDB-in-Lambda
Security data platform built around serverless OLAP. DuckDB runs inside AWS Lambda for normalization and operational metadata harvesting, eliminating the…
Read the breakdown →
$2.30/GB
Microsoft Sentinel on Azure
Azure-native managed SIEM built on Log Analytics Workspace (columnar Kusto storage) with KQL as the query language. Schema-on-read at the storage layer…
Read the breakdown →
$0.24/GB
Google Chronicle on BigQuery
Managed SIEM layered on BigQuery — separation of storage and compute, columnar Capacitor format on Colossus, schema-on-write normalization to UDM at ingest…
Read the breakdown →
1 PB/day
Falcon LogScale — brute-force scan architecture
CrowdStrike-owned (acquired Humio 2021) log platform built on the inverse of conventional indexing: ~1 MB time-series index per day, compressed segments on…
Read the breakdown →
1.61 Q events
Cloudflare — ClickHouse + DataFusion on R2
Edge-network analytics at quadrillion-row scale, run on ClickHouse for nearly a decade — and now paired with R2 SQL, a distributed query engine built on Apache…
Read the breakdown →
10+ PB
Comcast — Security data fabric on Snowflake
Cybersecurity-at-Comcast moved off siloed, single-tool analytics onto a unified Snowflake-backed security data fabric. Schema normalization across endpoint…
Read the breakdown →
Off Databricks
DNB — DuckDB + Ibis + marimo, off Databricks
Norway's largest financial services group moved its Cyber Defense Center off Databricks notebooks onto an in-house platform built from composable open-source…
Read the breakdown →
17k+ nodes
Pinterest — Zero Trust FGAC at Trino + Gravitino
Pinterest's Monarch big-data platform — 30+ Hadoop YARN clusters, 17k+ nodes on AWS EC2, petabytes processed daily — runs Trino as one of several engines on…
Read the breakdown →
9 mo → 5 days
Ziggiz — Cyber Lakehouse-as-a-Service on Databricks
First public production reference to ship the Databricks-native Cyber Lakehouse pattern as a service — Delta Lake for storage, Unity Catalog for governance…
Read the breakdown →
80%
Standard Chartered — self-managed SIEM on Databricks
A global systemically important bank replaced its traditional SIEM with a self-managed security lakehouse on Databricks: a distributed, multi-cloud Delta Lake…
Read the breakdown →
Federated
Bank Hapoalim — federated lakehouse on Trino/Starburst
Israel's largest bank migrated off Hive onto Starburst (Trino) over a Hadoop-based data lake — federated SQL access that leaves data where it lives. The…
Read the breakdown →
40%
Yale New Haven Health — SIEM modernization with Cribl + Sentinel
A major US health system hit its Splunk license ceiling when a Palo Alto software update added 63 fields to every firewall log, pushing daily ingest from 400…
Read the breakdown →
See how the pattern lands on your workload.
The matrix scoring that justified each reference architecture's tool choices is the paid deliverable. The benchmark behind it is public — reproduce it on your own workload, then book a call to scope the work.