Security Data Works

Vendor blueprint · ELT pattern

Fivetran + dbt — ELT for the security data lake

Managed extraction (Fivetran) plus in-warehouse transformation (dbt) forms the ELT spine of a security data lake. Fivetran lands cloud, identity, and SaaS logs in Snowflake or BigQuery with privacy controls applied; dbt normalizes the raw logs to OCSF inside the warehouse, with no separate transform compute. Both are GA products. The security-specific public evidence is real, but thinner than their general data-engineering reputation suggests.

1

What ships today

Fivetran: managed connectors (AWS, GitHub, Jira, identity providers) into security data lakes; column blocking, hashing, and RBAC before data lands; Hybrid Deployment keeps pipelines inside the network perimeter; webhook push of Fivetran's own logs to Google Security Operations. dbt: version-controlled ELT mapping disparate logs to OCSF inside Snowflake.
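To make "column blocking and hashing before data lands" concrete, here is a minimal Python sketch of the idea. It is not Fivetran's API or configuration format: the column names, the `BLOCKED_COLUMNS` / `HASHED_COLUMNS` sets, and the salted SHA-256 scheme are all assumptions chosen for illustration, and the hashing method Fivetran actually uses should be checked in its documentation.

```python
import hashlib

# Hypothetical policy: these names are illustrative, not Fivetran configuration keys.
BLOCKED_COLUMNS = {"password_hint"}
HASHED_COLUMNS = {"email", "source_ip"}


def hash_value(value: str, salt: str) -> str:
    """Return a salted SHA-256 hex digest; equal inputs still match on equality."""
    return hashlib.sha256((salt + value).encode("utf-8")).hexdigest()


def apply_privacy_controls(row: dict, salt: str) -> dict:
    """Drop blocked columns and hash sensitive ones before the row is loaded."""
    cleaned = {}
    for column, value in row.items():
        if column in BLOCKED_COLUMNS:
            continue  # blocked: the value never reaches the destination
        if column in HASHED_COLUMNS and value is not None:
            cleaned[column] = hash_value(str(value), salt)
        else:
            cleaned[column] = value
    return cleaned
```

The design point is that both controls run before load, so the warehouse never holds the blocked or clear-text values, while hashed columns remain usable as join keys across sources that share the same salt.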

2

Where the public evidence is

Concrete and security-specific: Heritage Environmental Services (Fivetran Hybrid Deployment for regulated data); the documented Fivetran → Google Security Operations webhook integration; the published dbt → OCSF normalization pattern in Snowflake. The pattern is verifiable.

3

Where it isn't

The claim that "Brex, Coinbase, Rippling use Fivetran/dbt for security" rests on reputation, not on a named public security case study. These companies are cited for advanced data-engineering practice generally; treat the security-specific attribution as unproven until a public case names it.

4

What it changes for architects

ELT-to-OCSF moves normalization out of the SIEM and into the warehouse the org already runs. The honest critique: managed ingestion is a recurring per-connector cost and a data-egress decision. Weigh it against Cribl / Vector routing, and validate on your own sources before assuming the level of polish attributed to teams like Brex.
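As a rough illustration of what "normalizing to OCSF" involves, the sketch below maps one raw sign-in record to OCSF-style Authentication fields. In the pattern described here this mapping would live in a dbt SQL model inside the warehouse; Python is used only to show the shape of the transformation. The raw field names (`timestamp`, `outcome`, `actor`, `client_ip`) are hypothetical, and the OCSF identifiers reflect the Authentication class as commonly published, so verify them against the schema version you target.

```python
from datetime import datetime, timezone

# OCSF Authentication class identifiers (verify against your target schema version).
CATEGORY_UID = 3      # Identity & Access Management
CLASS_UID = 3002      # Authentication
ACTIVITY_LOGON = 1
STATUS_SUCCESS, STATUS_FAILURE = 1, 2


def to_ocsf_authentication(raw: dict) -> dict:
    """Map one hypothetical raw sign-in record to OCSF-style fields."""
    ts = datetime.fromisoformat(raw["timestamp"])
    if ts.tzinfo is None:
        ts = ts.replace(tzinfo=timezone.utc)  # assumption: naive timestamps are UTC
    succeeded = raw["outcome"] == "SUCCESS"
    return {
        "category_uid": CATEGORY_UID,
        "class_uid": CLASS_UID,
        "activity_id": ACTIVITY_LOGON,
        "type_uid": CLASS_UID * 100 + ACTIVITY_LOGON,
        "time": int(ts.timestamp() * 1000),  # epoch milliseconds
        "status_id": STATUS_SUCCESS if succeeded else STATUS_FAILURE,
        "user": {"name": raw["actor"]},
        "src_endpoint": {"ip": raw["client_ip"]},
    }
```

Writing one such mapping per source is the part to validate on your own logs: the target fields are fixed by the schema, but each source's field names, timestamp formats, and outcome vocabularies differ.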

Sources: Fivetran documentation (Hybrid Deployment; Google Security Operations webhook integration; column blocking / hashing / RBAC); dbt + OCSF normalization public references (dbt Labs / Snowflake security data lake guides).

See how the pattern lands on your workload.

The matrix scoring that justified each reference architecture's tool choices is the paid deliverable. The benchmark behind it is public — reproduce it on your own workload, then book a call to scope the work.