Marimo + LLM Pivot: Notebook NHI Reached the Bastion

The Incident

On 2026-05-10, the Sysdig Threat Research Team observed an end-to-end intrusion in which a large language model agent, not a human, drove the post-exploitation phase. Initial access was CVE-2026-39987 — a pre-auth remote code execution in Marimo’s /terminal/ws endpoint, fixed in marimo 0.23.0 (affected: <= 0.20.4) and added to CISA’s Known Exploited Vulnerabilities catalog on 2026-04-23 with a federal remediation deadline of 2026-05-07. After landing on the notebook host, the agent harvested two cloud credentials, replayed them through a fanned-out egress pool to call AWS Secrets Manager, and retrieved an SSH private key for a downstream bastion. It then opened eight short SSH sessions to the bastion and dumped the schema and full contents of an internal PostgreSQL database in under two minutes. End-to-end, the chain ran in just over an hour across four pivots.

Sysdig identified the operator as an LLM by behavioural signatures: commands engineered for machine parsing (--- separators, bounded output, suppressed errors, paging disabled), a HEREDOC bundling six SELECT statements into a single psql invocation, an improvised schema dump that targeted a credential table the application did not actually contain, and a Chinese-language planning comment that leaked into the command stream. Four such signatures stacked inside one 113-second window, distributed across six IPs at sub-second pace. MITRE ATT&CK coverage: T1190 Exploit Public-Facing Application, T1555.006 Cloud Secrets Management Stores, T1552.001 Credentials In Files, T1021.004 Remote Services: SSH.
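
The co-occurrence idea above can be sketched in a few lines of Python. This is a minimal illustration, not Sysdig's detection logic: the signature names, the regular expressions, the `CommandEvent` type, the 120-second window, and the threshold of four distinct signatures are all assumptions chosen to mirror the behaviours described in this section, and they would need tuning against real command-execution telemetry.

```python
import re
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class CommandEvent:
    """Hypothetical telemetry record: when a shell command ran and its text."""
    timestamp: datetime
    command: str


def signatures_in(command: str) -> set[str]:
    """Return the names of the machine-operator signatures seen in one command."""
    found = set()
    if re.search(r"echo\s+['\"]?---", command):
        found.add("separator")
    if re.search(r"\|\s*head\s+-(?:n\s*)?\d+", command):
        found.add("bounded_output")
    if re.search(r"2>\s*/dev/null", command):
        found.add("suppressed_errors")
    if re.search(r"PAGER=cat|--no-pager", command):
        found.add("paging_disabled")
    # A heredoc that carries two or more SELECT statements in one invocation.
    if "<<" in command and len(re.findall(r"\bselect\b", command, re.IGNORECASE)) >= 2:
        found.add("heredoc_multi_select")
    return found


def windows_with_stacked_signatures(events, window=timedelta(seconds=120), min_distinct=4):
    """Slide a time window over the events and report windows in which at
    least `min_distinct` different signatures co-occur."""
    ordered = sorted(events, key=lambda e: e.timestamp)
    hits = []
    start = 0
    for end, event in enumerate(ordered):
        # Drop events that have fallen out of the window ending at `event`.
        while event.timestamp - ordered[start].timestamp > window:
            start += 1
        seen = set()
        for e in ordered[start:end + 1]:
            seen |= signatures_in(e.command)
        if len(seen) >= min_distinct:
            hits.append((ordered[start].timestamp, event.timestamp, seen))
    return hits
```

Counting distinct signatures rather than raw matches keeps a single noisy habit (for example, an administrator who always appends `2>/dev/null`) from triggering an alert on its own.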

The Authority Path That Failed

The identity that carried execution authority at the load-bearing moment was not the LLM agent — it was the IAM principal attached to the compromised marimo host. That principal held read access to a specific AWS Secrets Manager entry containing an SSH private key for a downstream bastion. The scope the deploying operator intended it to exercise was “run a notebook process.” The scope it actually exercised, once any code at all was running on the host, was “enumerate and read arbitrary secrets, retrieve a long-lived bastion SSH key, and broker lateral movement into the database tier.”

The first trust anchor that failed was the missing authentication check on /terminal/ws — that’s the CVE, and per the marimo advisory the fix added the missing @requires("edit") guard. The load-bearing failure for our thesis is the secondary one: a long-lived SSH private key for a bastion was stored in Secrets Manager and reachable by a notebook workload’s machine identity. There was no scoping, no just-in-time issuance, no separation between “compute used to run notebooks” and “compute trusted to broker lateral movement to a database.” Either an IAM scope review of the notebook role, or an inventory query for “which non-human identities can read bastion SSH material,” would have flagged this chain before any attacker — human or model — got the wheel.
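
As a rough illustration of the IAM scope review mentioned above, the sketch below asks the IAM policy simulator whether each workload role may call `secretsmanager:GetSecretValue` on a list of sensitive secrets. It assumes a boto3 IAM client is passed in (for example `boto3.client("iam")`); the function name and any ARNs used with it are hypothetical. The simulation here covers the role's identity-based policies only, so resource policies on the secret are not evaluated, and pagination of large result sets is ignored for brevity.

```python
from typing import Iterable


def roles_that_can_read(iam_client, role_arns: Iterable[str],
                        secret_arns: list[str]) -> list[tuple[str, str]]:
    """Return (role_arn, secret_arn) pairs for which the IAM policy
    simulator reports that GetSecretValue would be allowed."""
    findings = []
    for role_arn in role_arns:
        response = iam_client.simulate_principal_policy(
            PolicySourceArn=role_arn,
            ActionNames=["secretsmanager:GetSecretValue"],
            ResourceArns=secret_arns,
        )
        for result in response["EvaluationResults"]:
            # EvalDecision is one of: allowed, explicitDeny, implicitDeny.
            if result["EvalDecision"] == "allowed":
                findings.append((role_arn, result["EvalResourceName"]))
    return findings
```

Called with the notebook host's role and the ARN of the bastion key secret, a non-empty result would be exactly the finding this section describes: a notebook workload identity that can read bastion SSH material.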

SecurityV0 Perspective

This is nhi_compromise (ASI06). The LLM agent is the operator that drove the abuse, but the authority every action rode on was non-human: the notebook host’s IAM read on Secrets Manager, then the bastion SSH key itself — a second NHI whose blast radius extended from the notebook tier all the way to the internal Postgres tier. The novelty in coverage is “AI ran the attack.” The novelty in the failure is older and more familiar: a machine identity could reach a secret it should never have been able to reach.

The evidence pack SecurityV0 would have produced before the intrusion enumerates, for every workload’s NHI, the secret-reachability graph: which keys, tokens, and certificates each identity can read, and what those credentials in turn unlock. The pre-exfiltration question the pack answers is “is there any path from a publicly-reachable compute tier to a credential that brokers database access?” The post-exfiltration forensic question is the same graph, queried at the moment of the alert: “which NHIs touched the bastion SSH key in the last hour, and what did each of them do with the credentials it obtained downstream?” When the operator is an LLM running at sub-second pace across six IPs, “rotate after detection” is too slow. The durable control is shrinking the held scope of the NHI before the agent ever gets the wheel.
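
The first half of the forensic question, which identities touched the bastion SSH key in the last hour, can be approximated from audit logs. The sketch below is an assumption-laden illustration rather than a description of SecurityV0's implementation: it takes CloudTrail-style records as plain dictionaries (using the `eventName`, `eventTime`, `userIdentity.arn`, `requestParameters.secretId`, and `sourceIPAddress` fields), assumes the caller has already loaded them, and assumes `secretId` is recorded in the same form (name or ARN) that is passed in. The function name is hypothetical.

```python
from collections import defaultdict
from datetime import datetime, timedelta, timezone


def principals_that_read_secret(records, secret_id, alert_time,
                                lookback=timedelta(hours=1)):
    """Map each principal ARN to the set of source IPs from which it called
    GetSecretValue on `secret_id` during the lookback window before the alert."""
    readers = defaultdict(set)
    for record in records:
        if record.get("eventName") != "GetSecretValue":
            continue
        params = record.get("requestParameters") or {}
        if params.get("secretId") != secret_id:
            continue
        # CloudTrail timestamps are UTC in ISO 8601 form, e.g. 2026-05-10T12:00:00Z.
        when = datetime.strptime(
            record["eventTime"], "%Y-%m-%dT%H:%M:%SZ"
        ).replace(tzinfo=timezone.utc)
        if alert_time - lookback <= when <= alert_time:
            arn = record.get("userIdentity", {}).get("arn", "unknown")
            readers[arn].add(record.get("sourceIPAddress", "unknown"))
    return dict(readers)
```

Grouping by principal and collecting source IPs makes the fanned-out pattern visible: one machine identity reading the same secret from several addresses inside the window is a strong hint that the credential is being replayed off-host.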

What To Do

Patch marimo to 0.23.0 and confirm internet-exposure status. Versions <= 0.20.4 expose /terminal/ws without @requires("edit"). CISA’s KEV deadline for federal civilian agencies was 2026-05-07; assume opportunistic exploitation in any environment where the notebook port is reachable.
Audit every compute-tier IAM principal for secretsmanager:GetSecretValue against bastion or database credentials. Run aws iam simulate-principal-policy (or your provider’s equivalent) for every workload role against the ARNs of long-lived SSH keys and database passwords. Anything that resolves should be scoped down or removed.
Replace long-lived bastion SSH keys with short-TTL credentials brokered through an approval step. AWS Systems Manager Session Manager, Teleport, or Vault-issued SSH certificates with a 15-minute TTL all remove the reachable-secret from the chain. A static key in Secrets Manager is a load-bearing failure waiting to happen.
Alert on LLM-tool-call shell signatures in command-execution telemetry. HEREDOCs bundling multiple SELECTs into one psql, --- separators, cmd 2>/dev/null | head -N patterns, and PAGER=cat or --no-pager flags all show up when a model is the operator. None is conclusive alone, but their co-occurrence inside a two-minute window is high-confidence.
Map the secret-reachability graph and treat the longest paths as priority risks. For every NHI in the inventory, enumerate which secrets it can read and what those secrets in turn authorize. A notebook compute role → Secrets Manager → bastion SSH key → internal Postgres chain should be impossible by policy, not just by hope.
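
The secret-reachability graph in the last item can be sketched as a plain adjacency map with a breadth-first path search. Everything here is illustrative: the node naming scheme (`role/...`, `secret/...`, `host/...`, `db/...`), the function name, and the idea of hand-building the graph are assumptions; in practice the edges would come from IAM policy analysis and an inventory of what each credential unlocks.

```python
from collections import deque


def find_paths(graph, sources, targets):
    """Enumerate simple paths from any source node to any target node.

    `graph` maps a node to the nodes it can reach directly, for example an
    identity to the secrets it can read, or a secret to what it unlocks.
    Paths are returned longest first, matching the prioritisation above.
    """
    paths = []
    for source in sources:
        queue = deque([[source]])
        while queue:
            path = queue.popleft()
            node = path[-1]
            if node in targets and len(path) > 1:
                paths.append(path)
                continue
            for nxt in graph.get(node, ()):
                if nxt not in path:  # avoid revisiting nodes (no cycles)
                    queue.append(path + [nxt])
    return sorted(paths, key=len, reverse=True)


# Hypothetical graph mirroring the chain described in this article.
graph = {
    "role/notebook": ["secret/bastion-ssh-key"],
    "secret/bastion-ssh-key": ["host/bastion"],
    "host/bastion": ["db/internal-postgres"],
    "role/batch": ["secret/api-token"],
}

risky = find_paths(graph, sources={"role/notebook"}, targets={"db/internal-postgres"})
for path in risky:
    print(" -> ".join(path))
```

Any path returned for a publicly reachable source is a policy violation to fix, and an empty result for every exposed workload role is the state the last item asks for.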
