Methodology meets architecture

PEAK and the lakehouse: how modern data stacks enable threat hunting.

Threat hunting without a framework is just ad-hoc investigation: analysts following hunches with no systematic approach. The PEAK framework from Splunk's SURGe research team (David Bianco, Ryan Fetterman, Sydney Marrone) gives that work structure: Prepare, Execute, Act with Knowledge. PEAK was written for the SIEM era, and the operational workflow still holds, but the data architecture underneath it has changed.

Reading time: about 16 minutes. Evidence tier: B overall. PEAK is a published Splunk SURGe framework (Tier A for the framework itself), the lakehouse mapping draws on practitioner deployments and reference architectures (Tier B), and the cost comparisons are directional estimates based on vendor pricing and my own TCO models. I flag the speculative pieces inline.

What PEAK is

A framework, not a tool.

PEAK (Prepare, Execute, Act with Knowledge) was published by Splunk's SURGe (Security Research) team. The authors most closely associated with the framework are David Bianco (creator of the Pyramid of Pain), Ryan Fetterman, and Sydney Marrone. The framework's stated goal is to replace ad-hoc hunting with a repeatable process that produces durable artifacts: automated detections, baselines, and institutional knowledge, rather than one-off investigations that live in an analyst's head.

PEAK defines three hunt types. Hypothesis-driven hunts start with a specific adversary behavior claim ("credential dumping followed by lateral movement within ten minutes") and test it against data. Baseline hunts establish normal so that abnormal becomes visible; they're the prerequisite for most behavioral detection. Model-Assisted Threat Hunting (M-ATH) uses machine learning to narrow the search space before human review.

The four phases are deliberately cyclical, not linear. Knowledge produced at the end of one hunt feeds the next round of preparation. That detail matters more than it sounds; most SOC hunting programs I see stall because they treat each hunt as a discrete project rather than as a contribution to a cumulative knowledge base.

The framework is methodology rather than technology, so it does not require Splunk, and it does not require any particular table format, query engine, or storage layer. What it does require is that the underlying data platform make each phase tractable, which is the seam the lakehouse argument runs through.

Prepare phase

What changes when retention is no longer a budget line.

The Prepare phase covers threat-intelligence review, hypothesis selection, scope definition, and verification that the required data sources are actually retained for the window the hunt needs. Traditional SIEM architectures squeeze this phase from one specific direction: retention cost.

Schema-on-read SIEMs typically charge $2–4/GB/month for indexed, searchable storage. At 10 TB/day ingestion, 90-day retention costs roughly $2–3 million per month at list pricing. Extending to a full year approaches $10–12 million per month. Most organizations don't pay that; they archive to cold storage and accept that data older than 90 days is effectively unsearchable without rehydration. That economic reality dictates what hunts are even possible to prepare for.

Mandiant's M-Trends reports have consistently shown median adversary dwell times in the range of 10–20 days, but the 75th-percentile tail extends well past 50 days, and named APT campaigns routinely maintain persistence for 6–18 months before discovery. A 90-day retention window can investigate most commodity intrusions, but it cannot prepare a hunt against a slow-moving adversary that established persistence eight months ago.

A lakehouse architecture changes the storage economics by roughly two orders of magnitude. Iceberg or Delta tables on infrequent-access object storage (S3 Standard-IA, or the ADLS cool tier) run $0.01–0.02/GB/month for the underlying storage. The same 10 TB/day at three-year retention is roughly 11 PB, which costs about $110K–220K/month in object storage rather than about $32M/month in an indexed SIEM tier. That isn't a tuning gain so much as a structural change in what Prepare can consider on day one.
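The storage arithmetic in this section can be reproduced in a few lines. This is a minimal sketch under stated assumptions: decimal terabytes (1 TB = 1,000 GB), a steady state in which the full retention window is already stored, and per-GB rates taken from the ranges quoted above. It ignores compression, request and egress charges, compute, and negotiated discounts, so treat the output as directional. The function and scenario names are illustrative.

```python
GB_PER_TB = 1000  # assumption: decimal terabytes; billing units vary by vendor


def monthly_storage_cost(tb_per_day, retention_days, usd_per_gb_month):
    """Steady-state monthly storage bill once a full retention window is held."""
    stored_gb = tb_per_day * retention_days * GB_PER_TB
    return stored_gb * usd_per_gb_month


# (label, retention in days, low rate, high rate) using the per-GB ranges in the text
SCENARIOS = [
    ("Indexed SIEM, 90 days", 90, 2.00, 4.00),
    ("Indexed SIEM, 1 year", 365, 2.00, 4.00),
    ("Indexed SIEM, 3 years", 1095, 2.00, 4.00),
    ("Object storage, 3 years", 1095, 0.01, 0.02),
]

for label, days, low_rate, high_rate in SCENARIOS:
    low = monthly_storage_cost(10, days, low_rate)
    high = monthly_storage_cost(10, days, high_rate)
    print(f"{label:<24} ${low:>13,.0f} to ${high:>13,.0f} per month")
```

Changing the ingest rate or the retention window in `SCENARIOS` is the quickest way to see which hunts a given budget can actually prepare for.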

The Prepare phase also depends on metadata: knowing what data sources exist, what schema they're on, when coverage started and stopped. Wiki pages and Confluence documents are the default in most SOC environments. They're also chronically out of date. A catalog-backed lakehouse (Polaris, Unity Catalog, Glue) makes that metadata queryable. Before starting a hunt, I can run a SQL query against the catalog to confirm which tables contain the relevant fields, what their earliest timestamp is, and whether schema changes during the proposed hunt window will affect query construction.
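A minimal sketch of that pre-hunt check follows. It uses Python's built-in `sqlite3` as a stand-in so it runs anywhere; real catalogs expose this metadata through their own interfaces. The tables `catalog_columns` and `catalog_coverage`, their columns, and all sample rows are hypothetical and do not reflect the schema of Polaris, Unity Catalog, or Glue.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE catalog_columns (   -- hypothetical: one row per column per table
    table_name TEXT, column_name TEXT, added_on TEXT
);
CREATE TABLE catalog_coverage (  -- hypothetical: one row per table
    table_name TEXT, earliest_event TEXT, latest_event TEXT
);
""")
conn.executemany("INSERT INTO catalog_columns VALUES (?, ?, ?)", [
    ("edr_process", "host", "2023-01-10"),
    ("edr_process", "process_name", "2023-01-10"),
    ("edr_process", "parent_process", "2024-06-01"),  # added later: a schema change
    ("auth_events", "host", "2022-05-01"),
    ("auth_events", "user", "2022-05-01"),
])
conn.executemany("INSERT INTO catalog_coverage VALUES (?, ?, ?)", [
    ("edr_process", "2023-01-10", "2025-01-31"),
    ("auth_events", "2022-05-01", "2025-01-31"),
])


def sources_for_hunt(conn, required_fields, window_start):
    """Tables holding every required field, with coverage and schema-change flags."""
    placeholders = ",".join("?" for _ in required_fields)
    sql = f"""
        SELECT c.table_name,
               v.earliest_event,
               v.earliest_event <= ? AS covers_window,
               MAX(c.added_on > ?)   AS field_added_in_window
        FROM catalog_columns AS c
        JOIN catalog_coverage AS v ON v.table_name = c.table_name
        WHERE c.column_name IN ({placeholders})
        GROUP BY c.table_name, v.earliest_event
        HAVING COUNT(DISTINCT c.column_name) = ?
        ORDER BY c.table_name
    """
    params = [window_start, window_start, *required_fields, len(required_fields)]
    return conn.execute(sql, params).fetchall()


rows = sources_for_hunt(conn, ["host", "process_name", "parent_process"], "2024-01-01")
for table, earliest, covers, changed in rows:
    print(table, earliest, "covers window" if covers else "starts too late",
          "| field added mid-window" if changed else "")
```

ISO-8601 date strings compare correctly as text, which is why the sketch stores dates that way. A flag on a field added mid-window is a cue to split the hunt query at that date rather than assume the field exists throughout.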

I want to flag what doesn't change. Threat-intelligence review, MITRE ATT&CK familiarity, and hypothesis quality are analyst-skill questions. The architecture can make more data available, but it cannot make the analyst better at picking hunts. That is something PEAK addresses and the lakehouse has nothing to say about.

Execute phase

Query latency is an analyst-attention problem.

The Execute phase is where the hunt actually happens: exploratory queries, multi-table joins, least-frequency analysis, statistical outlier review. The bottleneck most often isn't query correctness but query latency, and what that latency does to analyst flow.
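Least-frequency analysis (sometimes called stack counting) is the simplest of these techniques to illustrate. The sketch below counts how many distinct hosts each value appears on and returns the rarest. The event shape, field names, and sample data are assumptions for illustration; at real volume the same grouping would run as SQL in the query engine rather than in Python.

```python
def least_frequent(events, key, max_hosts=1):
    """Return (value, host_count) pairs seen on at most `max_hosts` hosts, rarest first."""
    hosts_per_value = {}
    for event in events:
        hosts_per_value.setdefault(event[key], set()).add(event["host"])
    rare = [
        (value, len(hosts))
        for value, hosts in hosts_per_value.items()
        if len(hosts) <= max_hosts
    ]
    # Sort by prevalence first, then by name so the output is stable
    return sorted(rare, key=lambda item: (item[1], item[0]))


# Hypothetical process events: common software runs everywhere, one binary does not
events = [
    {"host": "ws-01", "process": "chrome.exe"},
    {"host": "ws-02", "process": "chrome.exe"},
    {"host": "ws-03", "process": "chrome.exe"},
    {"host": "ws-01", "process": "outlook.exe"},
    {"host": "ws-02", "process": "outlook.exe"},
    {"host": "ws-03", "process": "rundll32.exe"},
    {"host": "ws-03", "process": "rundll32.exe"},
]

print(least_frequent(events, "process"))
```

Counting distinct hosts rather than raw events is a deliberate choice here: a binary that runs a thousand times on one machine is still rare across the fleet.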

Schema-on-read SIEM query engines were designed for keyword search across log lines, not for multi-table joins across columnar data. Splunk SPL, Elastic KQL, and Microsoft Sentinel KQL all handle simple searches well. Complex hunt queries (three-table joins between EDR process events, network connections, and authentication logs across a 90-day window) frequently take 5–20 minutes to return. I've watched experienced analysts lose the thread of an investigation in that gap. They alt-tab to email, the iteration cycle breaks, and the hunt either stops or restarts with degraded context.
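To make the query shape concrete, here is a sketch of a three-table hunt join. It runs on Python's built-in `sqlite3` purely so the example is self-contained; SQLite is a stand-in, not one of the engines discussed, and says nothing about their performance. Table names, columns, the use of `procdump.exe` as a credential-dumping marker, and every sample row are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE process_events      (host TEXT, ts INTEGER, process_name TEXT);
CREATE TABLE network_connections (host TEXT, ts INTEGER, dest_host TEXT, dest_port INTEGER);
CREATE TABLE auth_events         (host TEXT, ts INTEGER, user TEXT, source_host TEXT);
""")
conn.executemany("INSERT INTO process_events VALUES (?, ?, ?)", [
    ("ws-07", 500, "chrome.exe"),
    ("ws-07", 1000, "procdump.exe"),
    ("ws-09", 1000, "procdump.exe"),
])
conn.executemany("INSERT INTO network_connections VALUES (?, ?, ?, ?)", [
    ("ws-07", 1100, "web-01", 443),
    ("ws-07", 1240, "fs-01", 445),   # SMB connection 240 s after the dump
    ("ws-09", 5000, "fs-01", 445),   # outside the ten-minute window
])
conn.executemany("INSERT INTO auth_events VALUES (?, ?, ?, ?)", [
    ("fs-01", 1245, "svc_backup", "ws-07"),
    ("fs-01", 5002, "svc_backup", "ws-09"),
])

HUNT_SQL = """
SELECT p.host, n.dest_host, a.user, n.ts - p.ts AS seconds_to_connect
FROM process_events AS p
JOIN network_connections AS n
  ON n.host = p.host
 AND n.ts BETWEEN p.ts AND p.ts + :window        -- connection follows the dump
JOIN auth_events AS a
  ON a.host = n.dest_host
 AND a.source_host = p.host
 AND a.ts BETWEEN n.ts AND n.ts + 60             -- logon lands on the destination
WHERE p.process_name = 'procdump.exe'
  AND n.dest_port = 445
ORDER BY p.ts
"""

findings = conn.execute(HUNT_SQL, {"window": 600}).fetchall()
print(findings)
```

The time-bounded join conditions are what make this class of query expensive on engines built for keyword search: every candidate process event has to be matched against a range of rows in two other tables.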

Columnar OLAP engines run the same queries in a different cost class. StarRocks, ClickHouse, and DuckDB use vectorized execution, cost-based optimizers, and column-pruning that lets them touch only the fields the query references. Practitioner benchmarks I've seen (and my own testing against synthetic OCSF-shaped data) put three-way joins on the order of 5–30 seconds at security data scale, depending on the engine and the query shape. I'd resist reading that as a "10× faster" marketing line, because what it really buys is an analyst who stays in flow instead of one who loses the hunt while the query runs.

The Execute phase also benefits from a multi-engine strategy that PEAK doesn't prescribe but the lakehouse enables. DuckDB sits inside a Jupyter notebook for exploratory analysis: the analyst can run a query, get an Arrow result back into pandas, train a quick model, and iterate without leaving the notebook. StarRocks handles production-shaped multi-table joins at scale. ClickHouse powers real-time dashboards and the alerting layer. It's the same underlying data (Iceberg or Delta tables) with three engines optimized for three different jobs, so the hunt moves through them without data migration.

One honest caveat: the 5–30 second latency claim depends on partition strategy, file-size distribution, and compaction discipline. A neglected Iceberg table with a million tiny files runs slower than a tuned one, and the maintenance that prevents that carries its own cost. The lakehouse doesn't eliminate engineering work; it shifts it from query-time waiting to ingest-time tuning, which is a better trade for hunting workflows but not a free lunch.

Act phase

From scheduled queries to pipeline detection.

PEAK's Act phase is the bridge between hunting and detection engineering. A successful hunt (one that produces a confirmed adversary pattern) turns into a deployed detection. The deployment shape used to be: write a SIEM correlation rule, schedule it to run every 5–15 minutes, document the rule in a wiki page, and call it done. That cadence was tolerable when the relevant adversary timeline was measured in hours or days.

The 2026 numbers don't support that cadence anymore. CrowdStrike's 2026 Global Threat Report measured a 27-second fastest recorded adversary breakout time, the interval from initial access to lateral movement. Mandiant's M-Trends 2026 reported negative mean time to exploit (MTTE) for some campaigns, meaning exploits landing days before patches were available. A detection cadence of 5–15 minutes misses the response window for a 27-second breakout by more than an order of magnitude, and that gap is structural rather than a matter of tuning the schedule tighter.
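The order-of-magnitude comparison is simple arithmetic, sketched here. It considers only the schedule interval, which is the longest an event can wait for the next run. Query runtime and ingestion lag would add to the delay and are left out. The function name is illustrative.

```python
BREAKOUT_SECONDS = 27  # fastest breakout figure cited in the text


def cadence_ratio(interval_minutes, breakout_seconds=BREAKOUT_SECONDS):
    """How many breakout windows fit inside one scheduling interval."""
    return (interval_minutes * 60) / breakout_seconds


for minutes in (5, 15):
    print(f"{minutes:>2}-minute schedule: {cadence_ratio(minutes):.1f}x the breakout time")
```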

Pipeline-based detection (running detection logic in-stream against the event flow before data lands in storage) is the architectural response. Tools in this space include Tenzir, Cribl Stream, Apache Flink, and the streaming-detection layer inside platforms like Lakewatch. The pattern is consistent: events flow through a processing pipeline that applies detection logic, emits high-confidence findings as a separate signal stream, and writes the underlying telemetry to the lakehouse for hunting and forensics. Detection latency drops from minutes to seconds or sub-second.

For the Act phase specifically, this changes two things. First, the analyst's hunt query becomes deployable as a pipeline operator, not only as a scheduled SIEM rule; many platforms accept Sigma rules or SQL fragments that translate directly into stream operators. Second, the documentation artifact changes shape. Rather than a wiki page that drifts, the detection's metadata (hypothesis, data sources, precision target, validation dataset, MITRE ATT&CK mappings) lives in the catalog as structured data, queryable alongside the hunts that produced it.

I'd put a caveat on the latency claim. Pipeline detection is sub-second to seconds for in-stream rules that evaluate one event at a time. For detections that require correlation across a longer window (say, "credential dumping followed within ten minutes by lateral movement") the latency depends on the windowing model and the state store, and is closer to seconds-to-tens-of-seconds. That's still an order of magnitude better than 5–15 minute SIEM scheduling, but it's not zero. See pipeline detection and streaming for the deeper technical picture.
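The windowed case can be sketched as a small stateful operator. This is plain Python that shows the logic only; it is not Tenzir, Flink, or any other product's API. It assumes events arrive in timestamp order, and the `Event` fields and the `kind` labels are hypothetical.

```python
from dataclasses import dataclass

WINDOW_SECONDS = 600  # the ten-minute window from the example hypothesis


@dataclass(frozen=True)
class Event:
    ts: int      # epoch seconds
    host: str
    kind: str    # hypothetical labels: "credential_dump", "lateral_movement"


def correlate(events, window=WINDOW_SECONDS):
    """Yield (host, dump_ts, move_ts) when lateral movement follows a dump in time."""
    last_dump = {}  # state store: host -> timestamp of the most recent dump
    for event in events:
        # Expire stale state so the store stays bounded
        for host in [h for h, ts in last_dump.items() if event.ts - ts > window]:
            del last_dump[host]
        if event.kind == "credential_dump":
            last_dump[event.host] = event.ts
        elif event.kind == "lateral_movement" and event.host in last_dump:
            yield event.host, last_dump[event.host], event.ts


stream = [
    Event(0, "ws-07", "credential_dump"),
    Event(240, "ws-07", "lateral_movement"),   # 240 s later: inside the window
    Event(300, "ws-09", "credential_dump"),
    Event(1000, "ws-09", "lateral_movement"),  # 700 s later: outside the window
    Event(1100, "ws-03", "lateral_movement"),  # no preceding dump
]

print(list(correlate(stream)))
```

The finding is emitted as soon as the second event arrives, so the added latency comes from state lookup and from however long the pipeline waits for late or out-of-order events, which this sketch does not handle.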

Knowledge phase

Institutional memory you can actually query.

The Knowledge phase is the one most SOC hunting programs skip, and it's the one that compounds. PEAK's premise is that each hunt should leave behind durable artifacts: a documented hypothesis, a validated detection, a baseline measurement, a list of techniques covered or not covered. Without that capture step, the next hunter re-discovers what the last hunter already learned.

The default capture mechanism is unstructured documentation: Confluence pages, Notion docs, Word files. The failure mode is predictable. The wiki page drifts from the deployed detection, gets forgotten when the original author leaves, and becomes unfindable within six months. Two years later, an analyst writes a hunt query that duplicates the original work without knowing it.

A catalog-backed lakehouse offers an alternative: store hunt metadata as structured rows in catalog tables, with foreign keys to the detection rules they produced, the MITRE ATT&CK techniques they covered, the threat actors they targeted, and the validation datasets that prove they worked. The schema isn't complicated: hunt_log, detection_rules, validation_results, with relationships between them. The catalog tables sit alongside the security event tables and can be queried with the same SQL the analyst is already writing.

That structure opens up queries the wiki model can't answer. "What detections cover T1003.001 (LSASS memory dumping)?" becomes a SQL query against the catalog rather than a search across PDFs. "Which hunts targeted APT28 and what did they produce?" is the same kind of query. "What's our detection coverage of the Pyramid of Pain at the TTP layer?" becomes a join between hunt_log and a MITRE technique reference table. None of these queries are new ideas; Palantir's Alerting and Detection Strategy framework and the broader detection-as-code movement (MITRE, Expel, Sigma rule maintainers) have advocated this pattern for years. The lakehouse is the layer that makes it cheap to implement.
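A minimal sketch of that schema and the first two questions follows. The table names `hunt_log`, `detection_rules`, and `validation_results` come from the text; every column, key, and sample row is an assumption made for illustration. Python's built-in `sqlite3` stands in for the lakehouse engine so the example runs on its own.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE hunt_log (
    hunt_id      INTEGER PRIMARY KEY,
    hypothesis   TEXT NOT NULL,
    hunt_type    TEXT NOT NULL,      -- hypothesis, baseline, or m-ath
    threat_actor TEXT,
    completed_on TEXT
);
CREATE TABLE detection_rules (
    rule_id          INTEGER PRIMARY KEY,
    hunt_id          INTEGER NOT NULL REFERENCES hunt_log(hunt_id),
    name             TEXT NOT NULL,
    attack_technique TEXT NOT NULL,  -- MITRE ATT&CK technique ID
    precision_target REAL
);
CREATE TABLE validation_results (
    result_id          INTEGER PRIMARY KEY,
    rule_id            INTEGER NOT NULL REFERENCES detection_rules(rule_id),
    dataset_ref        TEXT NOT NULL,
    measured_precision REAL
);
""")
# Sample rows are invented for the example
conn.executemany("INSERT INTO hunt_log VALUES (?, ?, ?, ?, ?)", [
    (1, "LSASS memory access by non-system processes", "hypothesis", "APT28", "2025-02-10"),
    (2, "Baseline of service-account logon hours", "baseline", None, "2025-03-03"),
])
conn.executemany("INSERT INTO detection_rules VALUES (?, ?, ?, ?, ?)", [
    (10, 1, "lsass_handle_open", "T1003.001", 0.90),
    (11, 2, "off_hours_service_logon", "T1078", 0.80),
])
conn.execute("INSERT INTO validation_results VALUES (100, 10, 'validation/lsass_2025_02', 0.93)")


def detections_covering(conn, technique):
    """Which rules cover a technique, which hunt produced them, and how they validated."""
    return conn.execute("""
        SELECT r.name, h.hypothesis, v.measured_precision
        FROM detection_rules AS r
        JOIN hunt_log AS h ON h.hunt_id = r.hunt_id
        LEFT JOIN validation_results AS v ON v.rule_id = r.rule_id
        WHERE r.attack_technique = ?
        ORDER BY r.name
    """, (technique,)).fetchall()


def hunts_targeting(conn, actor):
    """Which hunts targeted an actor and which rules, if any, they produced."""
    return conn.execute("""
        SELECT h.hypothesis, r.name
        FROM hunt_log AS h
        LEFT JOIN detection_rules AS r ON r.hunt_id = h.hunt_id
        WHERE h.threat_actor = ?
        ORDER BY h.hunt_id
    """, (actor,)).fetchall()


print(detections_covering(conn, "T1003.001"))
print(hunts_targeting(conn, "APT28"))
```

The `LEFT JOIN` in both queries is intentional: a rule with no validation result, or a hunt that produced no rule, still appears in the output, and those gaps are exactly what a coverage review needs to see.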

The lakehouse is not the only way to do this. A Postgres database with the same schema would work for the catalog itself. The advantage of putting hunt metadata into the same lakehouse as the security telemetry is that joins between metadata and telemetry become trivial: "show me all the events that fired detection X in the last month" is one query, not a federation problem. For an SRE or detection engineer doing efficacy review, that's the workflow that compounds.

Model-Assisted Threat Hunting

M-ATH narrows the search space, not the analyst's role.

M-ATH is PEAK's name for the model-assisted hunt type, and the phrase that captures it best is "human hypothesis, machine breadth." The analyst generates the question, the model assists with narrowing a million-row result set to a few hundred candidates worth human review, and the analyst then validates, confirms true positives, rejects false positives, and uses the validated set to either deploy a detection or refine the model.

The technique used most often in published M-ATH work is unsupervised anomaly detection, where Isolation Forest (Liu et al., 2008) is the canonical choice for tabular security data because it handles high-dimensional feature spaces well and produces an interpretable per-row anomaly score. DBSCAN clustering and autoencoder-based approaches also appear. The common shape: the model surfaces the statistically unusual subset, and the analyst applies domain knowledge to separate "unusual and bad" from "unusual but legitimate."
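To show why isolation-based scoring suits this job, here is a compact teaching version of the idea in pure Python: rows that can be separated from the rest in few random splits score close to 1. It follows the scoring formula from the Isolation Forest paper but is a simplified sketch, not scikit-learn's implementation, and a real hunt would use a maintained library. The three features (logins per day, distinct source hosts, failed-login ratio) and all data are synthetic assumptions.

```python
import math
import random

EULER_GAMMA = 0.5772156649


def average_path_length(n):
    """Expected depth of an unsuccessful binary-search-tree lookup over n points."""
    if n <= 1:
        return 0.0
    if n == 2:
        return 1.0
    return 2.0 * (math.log(n - 1) + EULER_GAMMA) - 2.0 * (n - 1) / n


def build_tree(rows, depth, max_depth, rng):
    """Split on a random feature at a random value until rows are isolated."""
    if depth >= max_depth or len(rows) <= 1:
        return ("leaf", len(rows))
    feature = rng.randrange(len(rows[0]))
    low = min(row[feature] for row in rows)
    high = max(row[feature] for row in rows)
    if low == high:
        return ("leaf", len(rows))
    split = rng.uniform(low, high)
    left = [row for row in rows if row[feature] < split]
    right = [row for row in rows if row[feature] >= split]
    return ("node", feature, split,
            build_tree(left, depth + 1, max_depth, rng),
            build_tree(right, depth + 1, max_depth, rng))


def path_length(tree, row, depth=0):
    """Depth at which `row` lands, adjusted for unsplit rows left in the leaf."""
    if tree[0] == "leaf":
        return depth + average_path_length(tree[1])
    _, feature, split, left, right = tree
    return path_length(left if row[feature] < split else right, row, depth + 1)


def anomaly_scores(rows, n_trees=100, sample_size=256, seed=7):
    """Score each row in (0, 1); values near 1 mean it isolates in few splits."""
    if len(rows) < 2:
        raise ValueError("need at least two rows")
    rng = random.Random(seed)
    size = min(sample_size, len(rows))
    max_depth = math.ceil(math.log2(size))
    trees = [build_tree(rng.sample(rows, size), 0, max_depth, rng) for _ in range(n_trees)]
    norm = average_path_length(size)
    return [2.0 ** (-sum(path_length(t, row) for t in trees) / (n_trees * norm))
            for row in rows]


# Synthetic per-account features: (logins per day, distinct source hosts, failed ratio)
data_rng = random.Random(1)
rows = [(data_rng.gauss(20, 3), data_rng.gauss(2, 0.5), max(0.0, data_rng.gauss(0.05, 0.02)))
        for _ in range(60)]
rows.append((400.0, 45.0, 0.60))  # one injected outlier at index 60

scores = anomaly_scores(rows)
top = max(range(len(rows)), key=scores.__getitem__)
print("highest-scoring row:", top, rows[top])
```

The score ranks rows for review; it does not label them. Deciding whether the top candidates are "unusual and bad" or "unusual but legitimate" stays with the analyst.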

The lakehouse is well-suited to M-ATH for a specific reason: the data the model trains on is the same data the production detection scores against. There's no train-test-deploy gap where the model is built on a sampled extract and then deployed against a different schema. The DuckDB-in-notebook pattern is the typical prototyping path: load Arrow buffers from Iceberg, fit a scikit-learn or PyTorch model, validate against held-out data, then either deploy the model into the pipeline (Tenzir, Flink) or translate the model's high-confidence threshold into a SQL detection rule for StarRocks.

I want to be careful about overclaiming what M-ATH delivers. Published precision numbers for Isolation Forest on authentication anomalies vary widely; I've seen anything from 30% to 85% depending on the feature set and the labeling rigor. The technique works best when the analyst has already reduced the search space via Prepare-phase hypothesis selection, and when there's a validated baseline to compare against. Dropped onto raw event data with no priors, anomaly detection produces mostly noise. That's not an M-ATH limitation; it's a property of unsupervised learning on high-cardinality real-world data.

For the operational side of running M-ATH at scale, see MLOps for threat hunting in the reference architectures section, which covers model registry, validation datasets, and the feedback-loop discipline that separates a working M-ATH program from a notebook full of one-off experiments.

Mapping the phases

PEAK on a lakehouse: a phase-by-phase view.

The table below brings the phases together. The mapping isn't a one-to-one component diagram; PEAK is methodology and the lakehouse is architecture, so the relationship is "this phase becomes more tractable when this layer of the stack is in place."

PEAK phase	What it needs	Lakehouse contribution
Prepare	Long retention, queryable metadata, hypothesis scope	Iceberg/Delta on S3 at $0.01-0.02/GB/mo; catalog (`Polaris`, `Unity`, `Glue`) for source lineage
Execute	Fast multi-table joins, exploratory iteration, ML prototyping	Columnar engines (`StarRocks`, `ClickHouse`, `DuckDB`) reading shared tables; 5-30s joins vs. 5-20min SIEM
Act	Low-latency detection deployment, structured documentation	Stream processors (`Tenzir`, `Flink`) for sub-second; catalog rows for detection metadata
Knowledge	Queryable institutional memory, hunt-to-detection lineage	Catalog tables (`hunt_log`, `detection_rules`) joined to security telemetry; same SQL

The cycle closes when Knowledge feeds back into Prepare. The next hunt starts from "what have we already covered, what techniques are gaps, what baselines do we have available." That feedback loop is what turns a hunting program into compounding capability instead of a series of disconnected investigations.

Maturity context

Where PEAK fits on the detection-maturity ladder.

PEAK assumes a certain operational maturity level. The framework's value grows with the organization's position on the Hunting Maturity Model (HMM), an earlier David Bianco contribution that predates PEAK. HMM levels run from 0 (no hunting capability) through 4 (a leading program with automated, feedback-driven hunting).

For HMM0–1 organizations, PEAK is aspirational. The framework can guide which hunts to attempt first, but the data infrastructure and analyst skill aren't there yet to fully execute. The lakehouse work in this phase is mostly foundational: get telemetry into Iceberg, validate that the catalog reflects reality, run a few baseline queries to confirm the data is usable.

For HMM2–3 organizations, PEAK is operational. Hypothesis-driven and baseline hunts are running on a regular cadence. The lakehouse contribution is removing latency from the Execute phase and giving the Knowledge phase a structured home. Most organizations I see in this band have the analyst skill already; what they don't have is the data platform that lets the analysts work without waiting 10 minutes per query.

For HMM4 organizations, PEAK is just the methodology; the work shifts to M-ATH, automated detection deployment, and feedback loops between hunting and detection engineering. The lakehouse is essential rather than optional; the same data needs to serve exploratory hunts, model training, production detection, and forensic investigation, without architectural duplication.

For the deeper HMM walkthrough and how the maturity progression maps to specific architecture decisions, see detection engineering maturity in this series.

Implementation sketch

A 12-month path from HMM2 to PEAK on a lakehouse.

This is the rough shape I sketch when a SOC team asks where to start. Treat it as a hypothesis to test against your own constraints, not a prescription.

Months 1–3: foundation

Land two or three high-value data sources (firewall logs, CloudTrail, EDR) into Iceberg or Delta tables on S3 or ADLS. Stand up a catalog (Polaris, Unity, Glue) with the tables registered. Add DuckDB inside an analyst notebook environment for exploratory queries. Verify the data is usable (fields are present, timestamps are reliable, joins work) before investing in hunt cadence. Storage cost in this phase is modest, typically under $5K/month for 1-year retention on a few TB/day.

Months 4–6: exploratory hunting

Run two to four PEAK hypothesis-driven hunts per month against the lakehouse. Document each hunt in structured form (hypothesis, data sources, query, result, decision) and capture the metadata in catalog tables, not wiki pages. Add a production query engine (StarRocks or ClickHouse) for multi-table joins at scale. The goal in this phase is to validate that the lakehouse architecture actually accelerates the analyst's iteration cycle, and to produce 5–10 validated hunt hypotheses worth converting into detections. Investment increases by roughly $1.5–3K/month for the query engine.

Months 7–12: detection deployment

Convert the validated hunts into deployed detections. The deployment target depends on the detection's latency requirement: pipeline detection (Tenzir, Flink) for sub-second use cases, materialized views or scheduled queries against the lakehouse for slower-tempo detections. Document each deployed detection in the catalog with hypothesis, validation dataset reference, and precision target. Begin the feedback loop: measure detection efficacy, retire low-precision rules, refine high-value ones. Pipeline-detection costs vary widely ($50–300K/year depending on tool choice and scale); budget conservatively.

The 12-month version is intentionally cautious. Faster timelines are possible for teams with stronger starting infrastructure, but most of the work that compounds is in the discipline of structured documentation and the Knowledge-phase feedback loop, not in adding components to the stack.

Honest caveats

Where the lakehouse argument over-promises.

A few places where the "lakehouse enables PEAK" claim deserves more friction than the marketing version of this argument tends to apply:

Operational maintenance is real cost. Iceberg and Delta tables need compaction, snapshot expiration, and schema-evolution discipline. A neglected lakehouse runs slower than a tuned one and accumulates technical debt that compounds. SOC teams without data-engineering support tend to underestimate this.
Schema drift kills hunts silently. When an upstream source changes a field name or a vendor renames a sourcetype, downstream hunt queries break in ways that may not surface until an analyst notices a hunt is suspiciously quiet. OCSF helps; LLM-assisted OCSF mapping helps more; neither eliminates the problem.
The cost comparison is generous to the lakehouse. SIEM list pricing is a high bar. Most organizations negotiate substantial discounts (30–60% is typical). The "$2–3M/month at 10 TB/day" number is list, not actual. The lakehouse advantage is real but probably closer to 5–10× than the 100× the raw math suggests.
Pipeline-detection skill is scarce. Most SOC analysts can write SPL or KQL. Far fewer can write Tenzir pipelines, Flink jobs, or streaming SQL. The capability change matters; the hiring market is the binding constraint for most teams.
PEAK works without a lakehouse. The framework is methodology, not architecture. Splunk-shop SOCs run PEAK successfully today. The lakehouse changes the economics and the latency profile; it does not change whether the framework applies.

None of these are reasons to dismiss the architectural shift. They're reasons to size expectations correctly and treat lakehouse adoption as an engineering investment with operational consequences, not a free win.

The argument

PEAK is the right methodology. The lakehouse changes what's affordable to do with it.

The Splunk SURGe team built PEAK to give threat hunting a repeatable structure: Prepare, Execute, Act with Knowledge, with hypothesis-driven, baseline, and model-assisted hunt types running through a cyclical workflow that produces durable artifacts. That methodology is sound on its own terms. It predates the lakehouse and doesn't require one.

What the lakehouse changes is the budget envelope for each phase. Prepare gets long retention at roughly two orders of magnitude lower storage cost, which makes APT-scale hypotheses tractable. Execute gets columnar query engines that turn 10-minute waits into 10-second iterations, which keeps the analyst in flow. Act gets a pipeline-detection deployment path that closes the gap between detection latency and adversary breakout time. Knowledge gets a queryable catalog instead of a wiki that drifts.

The honest version of the claim is narrower than "the lakehouse makes PEAK better." The lakehouse makes PEAK affordable at a maturity level (HMM3–4) that SIEM-only economics push out of reach for most organizations. That's a structural argument, not a marketing one, and it's the kind of argument worth testing against a SOC's actual constraints rather than accepting at face value.

If you're evaluating where to start, the practical recommendation is small. Pick one PEAK hypothesis-driven hunt your current SIEM struggles to execute at retention scale. Land the same data in an Iceberg table on S3. Run the hunt in DuckDB. Measure the difference in iteration cycle and analyst flow. Then decide whether the broader architectural shift is worth the engineering investment for your team. The answer may be yes; the answer may be "not yet." Either way the methodology stays put, and the data layer underneath it is the part that's actually up for decision.