Security Data Works

Methodology

MLOps for threat hunting

How to operationalize Model-Assisted Threat Hunting (M-ATH), the algorithmic hunt type in the PEAK framework. A notebook model starts to degrade as soon as telemetry shifts or an adversary adapts. MLOps is the engineering discipline (feature store, model registry, orchestrated retraining, drift monitoring) that keeps an algorithmic hunt viable in production.

Level 2

CI/CD-automated retraining is the threshold where M-ATH stops being a one-off notebook hunt. Google Cloud's MLOps maturity model runs from Level 0 (manual notebook process, no drift defense) through Level 1 (an automated training pipeline with continuous training) to Level 2 (CI/CD-automated pipelines that retrain and redeploy when drift is detected). At Level 0, the model can be stale before the hunt ends.

The pipeline

  1. Prepare

    PEAK + feature store

    Frame the hypothesis; engineer features once in Feast — consistent offline training and online inference.

  2. Train

    M-ATH model

    Supervised classification, clustering, time-series analysis, and NLP over petabyte-scale telemetry; experiments tracked in MLflow.

  3. Deploy

    Orchestrated retraining

    Kubeflow / ClearML pipelines retrain and redeploy on schedule or on drift — continuous delivery, not point-in-time.

  4. Monitor

    Drift + MLSecOps

    Data- and concept-drift detection (W&B); poisoning / evasion guardrails feed back to the retrain loop.
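
The "retrain on schedule or on drift" decision in steps 3 and 4 can be sketched in a few lines. This is a minimal illustration, not the pipeline's actual implementation: it uses the Population Stability Index (PSI) as one common data-drift score, and the function names, the 0.2 PSI threshold (a widely used rule of thumb), and the 7-day maximum model age are all assumptions chosen for the example. PSI only sees the input distribution, so it covers data drift; concept drift needs label feedback, such as analyst dispositions on alerts.

```python
import math
from typing import Sequence


def psi(baseline: Sequence[float], current: Sequence[float], bins: int = 10) -> float:
    """Population Stability Index between two samples of one numeric feature."""
    lo, hi = min(baseline), max(baseline)
    width = (hi - lo) / bins or 1.0  # fall back to 1.0 if the baseline is constant

    def fractions(values: Sequence[float]) -> list:
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            # Values outside the baseline range are clamped into the edge bins.
            counts[min(max(idx, 0), bins - 1)] += 1
        # A small floor keeps log() defined when a bin is empty.
        return [max(c / len(values), 1e-6) for c in counts]

    b, c = fractions(baseline), fractions(current)
    return sum((ci - bi) * math.log(ci / bi) for bi, ci in zip(b, c))


def should_retrain(psi_value: float, days_since_training: int,
                   psi_threshold: float = 0.2, max_age_days: int = 7) -> bool:
    """Hypothetical policy: retrain on drift OR when the model is too old."""
    return psi_value > psi_threshold or days_since_training >= max_age_days


if __name__ == "__main__":
    baseline = [float(x) for x in range(100)]   # feature values at training time
    shifted = [x + 50.0 for x in baseline]      # same feature after telemetry shifts
    print("PSI, unchanged:", psi(baseline, baseline))
    print("PSI, shifted:  ", psi(baseline, shifted))
    print("retrain?", should_retrain(psi(baseline, shifted), days_since_training=1))
```

In an orchestrated setup, a check like this would run per feature on each monitoring window, and a positive result would trigger the retraining pipeline rather than print to the console.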

What composes, what’s brittle

  • PEAK framework. Bianco, Fetterman, Marrone (Splunk SURGe). M-ATH is the algorithmic hunt type alongside hypothesis-driven and baseline.
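  • Data vs concept drift. Data drift is a shift in the input distribution; concept drift is a change in what "malicious" looks like, often because the adversary adapts. Both degrade the model silently, producing either a false-positive avalanche or a false-negative blind spot.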
  • Why Level 0 fails. A notebook model trained offline and discarded cannot counter non-stationary, adversarial telemetry.
  • Tooling. Feast (features), MLflow (registry / Detection-as-Code), Kubeflow & ClearML (orchestration), W&B (drift + reasoning observability).
  • MLSecOps. The hunt infra is itself an attack surface — retrain-loop poisoning, evasion inputs, MLflow CVE-2026-2635, the Kubeflow Doki incident. MITRE ATLAS.
  • What's hard. Continuous-retraining cost; poisoned retraining loops; the org gap between data science and detection engineering.
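
One guardrail against the poisoned retraining loop named above is a promotion gate: a retrained candidate replaces the production model only if it does not regress on a trusted holdout set that is kept out of the retraining data. The sketch below is an illustration under stated assumptions, not a tool-specific recipe. The `Metrics` type, the `evaluate` and `safe_to_promote` helpers, and the tolerance values are hypothetical, and the models are stand-in callables rather than anything loaded from a registry.

```python
from dataclasses import dataclass
from typing import Callable, Sequence, Tuple

# (feature vector, label), where label 1 = malicious and 0 = benign.
Example = Tuple[Sequence[float], int]


@dataclass(frozen=True)
class Metrics:
    recall: float
    false_positive_rate: float


def evaluate(predict: Callable[[Sequence[float]], int],
             holdout: Sequence[Example]) -> Metrics:
    """Score a model on an analyst-vetted holdout that never enters retraining."""
    tp = fp = tn = fn = 0
    for features, label in holdout:
        pred = predict(features)
        if label == 1:
            tp += pred == 1
            fn += pred == 0
        else:
            fp += pred == 1
            tn += pred == 0
    recall = tp / (tp + fn) if tp + fn else 0.0
    fpr = fp / (fp + tn) if fp + tn else 0.0
    return Metrics(recall, fpr)


def safe_to_promote(current: Metrics, candidate: Metrics,
                    max_recall_drop: float = 0.02,
                    max_fpr_rise: float = 0.01) -> bool:
    """Assumed tolerances: block candidates that miss more or alert more."""
    return (current.recall - candidate.recall <= max_recall_drop
            and candidate.false_positive_rate - current.false_positive_rate <= max_fpr_rise)


if __name__ == "__main__":
    holdout = [([0.9], 1), ([0.8], 1), ([0.1], 0), ([0.2], 0)]
    production = lambda f: int(f[0] > 0.5)
    # A candidate whose decision boundary was pushed by poisoned training data.
    candidate = lambda f: int(f[0] > 0.85)
    print(safe_to_promote(evaluate(production, holdout), evaluate(candidate, holdout)))
```

The gate is only as good as the holdout: if the vetted set is small or stale, a poisoned candidate can still pass, so the holdout needs its own refresh process outside the automated loop.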

Sources: Splunk PEAK Threat Hunting Framework (Bianco, Fetterman, Marrone — SURGe); "The Threat Hunter's Cookbook" (Fetterman & Marrone); Google Cloud MLOps maturity model (CI/CD for ML); MITRE ATLAS; CVE-2026-2635 (MLflow authentication bypass); PROID compromise-assessment framework (peer-reviewed, PMC).

See how the pattern lands on your workload.

The matrix scoring that justified each reference architecture's tool choices is the paid deliverable. The benchmark behind it is public — reproduce it on your own workload, then book a call to scope the work.