Composio Breach: When Tool Registration Became Execution

The Incident

On May 21, 2026, Composio (an agentic tool-calling platform that brokers AI-agent access to external services such as GitHub, Gmail, Jira, Slack, and Notion) disclosed unauthorized access to internal systems and customer connection data. No CVE was assigned; this was a compromise of Composio’s infrastructure, not a flaw in a shipped package. During the breach window, an auxiliary cache service held 5,241 API keys, all of which are at elevated risk of compromise. Composio also listed 5,001 GitHub connections among the affected connections, about 0.3% of active connections overall. The company revoked user GitHub tokens as a precaution and deleted customer API keys created before its May 22 cutoff, with deletion beginning on May 23.

According to Composio’s incident report, the intrusion began with compromised employee Gmail OAuth tokens, which enabled magic-link sign-in to auxiliary systems in the staging environment. From there, the attacker gained a foothold in an internal agentic tool used to monitor infrastructure and report connector failures. They then abused that tool to reach automated remediation systems, the components trusted to fix broken connectors. Inside the tool-execution sandbox, the attacker registered malicious tool definitions and chained them until they could execute arbitrary code.

MITRE ATT&CK coverage: T1528 (Steal Application Access Token), T1078.004 (Valid Accounts: Cloud Accounts), and T1059 (Command and Scripting Interpreter).

The Authority Path That Failed

The identity carrying execution authority at the moment of failure was an internal agentic tool — a non-human identity whose intended job was to observe connector health and report failures. The scope it was meant to hold was read-and-report: watch infrastructure, surface problems. The scope it exercised, once the attacker had a foothold, was write-and-execute: it reached the automated remediation subsystem, accepted attacker-supplied tool definitions, and ran arbitrary code. Each newly registered tool widened the effective authority boundary because the sandbox treated registration as authorization.
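
The gap between held and exercised scope can be expressed as a set difference. The following is a minimal sketch, assuming scopes are recorded as simple string labels; the labels and the function name are hypothetical and do not reflect Composio's permission model.

```python
# Hypothetical scope labels for illustration; not Composio's permission model.
HELD_SCOPE = frozenset({"infra:read", "connector:report"})


def scope_excess(exercised_actions):
    """Return the actions an identity exercised beyond the scope it holds."""
    return sorted(set(exercised_actions) - HELD_SCOPE)


# Actions of the kind described above, as they might appear in an audit log.
observed = ["infra:read", "remediation:invoke", "tool:register", "code:execute"]
print(scope_excess(observed))
# → ['code:execute', 'remediation:invoke', 'tool:register']
```

Any non-empty result is a finding: the identity did something its intended read-and-report role never covered.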

The trust anchor that failed first was the missing authorization gate between tool registration and tool execution. The agent’s execution scope was defined by runtime-registered tools, not by a pre-approved manifest. An attacker who could register a tool could effectively redefine what the agent was permitted to do. A second anchor failed upstream: employee Gmail OAuth tokens, long-lived delegated credentials, enabled magic-link sign-in into auxiliary systems without a stronger barrier. Both gaps were inspectable before the incident: a monitoring tool acquiring code-execution tools is a held-versus-exercised mismatch, and OAuth access into privileged internal systems is a credential-scope finding.
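
One way to picture the missing gate is a registry in which registration and authorization are separate steps. This is a sketch under stated assumptions, not a description of Composio's runtime: the class names, the capability labels, and the shape of the approved manifest are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ToolDefinition:
    name: str
    capabilities: frozenset  # hypothetical labels such as "read" or "code_execution"


@dataclass
class GatedRegistry:
    # Pre-approved manifest: tool name -> capabilities an operator signed off on.
    approved: dict
    registered: dict = field(default_factory=dict)

    def register(self, tool):
        # Registration only records the definition; it grants no authority.
        self.registered[tool.name] = tool

    def is_authorized(self, name):
        tool = self.registered.get(name)
        allowed = self.approved.get(name)
        if tool is None or allowed is None:
            return False  # deny by default
        # Reject a definition that claims more than was approved.
        return tool.capabilities <= allowed

    def invoke(self, name, handler, *args):
        if not self.is_authorized(name):
            raise PermissionError(f"tool {name!r} is not approved for execution")
        return handler(*args)


registry = GatedRegistry(approved={"report_failure": frozenset({"read"})})
registry.register(ToolDefinition("report_failure", frozenset({"read"})))
registry.register(ToolDefinition("run_shell", frozenset({"code_execution"})))
print(registry.is_authorized("report_failure"))  # True
print(registry.is_authorized("run_shell"))       # False
```

In this sketch, an attacker who can call register still cannot widen what the agent may run, because the approved manifest is supplied separately and is never modified by registration.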

SecurityV0 Perspective

The incident fits unproven_execution / ASI05. The defining failure is not initial access; it is that an agentic system executed code through tool definitions its operators never explicitly authorized. When an execution surface is assembled at runtime from registered tools, and registration is enough to make a tool invocable, “monitoring” and “arbitrary code execution” are separated by one unproven-execution gap. The 5,241 API keys in the adjacent cache show the blast radius of that gap.

The evidence pack SecurityV0 would produce enumerates, per agentic tool: every runtime tool definition, who or what registered it, when it became invocable, the approval artifact for each code-execution capability, the identity the tool runs under, and the credential stores reachable from its sandbox. Before exfiltration, that pack answers a deployment question: which tools can this system invoke right now, and did anyone approve the dangerous ones? After the fact, it answers the forensic question: which tool definitions appeared during the exposure window, which identity registered them, and which API keys and connected accounts were reachable when they ran?
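
The fields listed above map naturally onto one record per tool definition. The sketch below is illustrative only: the field names, helper functions, and sample values are assumptions made for this example, not SecurityV0's actual schema or data from the incident.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional, Tuple


@dataclass(frozen=True)
class ToolEvidence:
    tool_name: str
    registered_by: str                  # who or what registered the definition
    invocable_at: datetime              # when it became invocable
    executes_code: bool
    approval_artifact: Optional[str]    # None means no approval on record
    runs_as: str                        # identity the tool runs under
    reachable_credential_stores: Tuple[str, ...]


def unapproved_code_execution(pack):
    """Deployment question: which dangerous tools lack an approval artifact?"""
    return [e for e in pack if e.executes_code and e.approval_artifact is None]


def appeared_during(pack, start, end):
    """Forensic question: which definitions became invocable in the window?"""
    return [e for e in pack if start <= e.invocable_at <= end]


# Invented sample records; every value here is a placeholder.
pack = [
    ToolEvidence("report_failure", "deploy-pipeline", datetime(2030, 1, 1),
                 False, "APPR-001", "svc-monitor", ()),
    ToolEvidence("run_patch", "svc-monitor", datetime(2030, 1, 5),
                 True, None, "svc-monitor", ("api-key-cache",)),
]
for e in unapproved_code_execution(pack):
    print(e.tool_name, e.registered_by, e.reachable_credential_stores)
```

Both questions reduce to simple filters once registration time, registering identity, and approval artifact are captured for every tool definition.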

What To Do

Gate tool registration behind explicit operator approval. Treat a newly registered tool definition as untrusted until an operator signs off. In agent runtimes that permit dynamic tool registration, deny-by-default any tool that can execute code, spawn processes, or reach credentials, and require an approval artifact before it becomes invocable.
Rotate Composio API keys and revoke connected-account tokens now. If you integrate Composio, rotate every API key, revoke the OAuth grants behind affected connectors — GitHub first, then Gmail, Jira, Slack, Notion, and the rest — and re-issue with the narrowest scope each agent actually needs. Treat keys created before the May 22 cutoff as burned.
Diff each agent’s runtime tool manifest against an approved baseline. Snapshot the set of tools every production agent exposes and alert on any tool that appears without a corresponding approval record — especially code-execution, shell, filesystem, or remediation tools. A new tool in the manifest should trigger review, not silent acceptance.
Kill magic-link paths into privileged planes. Staff access to staging, internal admin, and remediation systems should require phishing-resistant MFA and device binding, not a magic link backed by a reusable OAuth token. Treat long-lived employee OAuth tokens as non-human identities subject to short TTLs and rotation.
Isolate the credential store from the agent execution plane. API-key caches and secret stores must not be reachable from a tool-execution sandbox. Keep encryption keys in your own custody through a KMS, issue per-call scoped credentials instead of long-lived keys, and apply inbound IP allowlists so that a single compromised agent cannot drain the vault.
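
The manifest-diff recommendation above can be sketched in a few lines. This is a minimal example, assuming the live manifest is available as a mapping from tool name to capability labels and that approvals are keyed by tool name; the labels, the function name, and the record IDs are hypothetical.

```python
# Hypothetical capability labels that should always trigger urgent review.
HIGH_RISK = frozenset({"code_execution", "shell", "filesystem", "remediation"})


def diff_against_baseline(live_manifest, approvals):
    """Return (severity, tool_name) alerts for tools with no approval record."""
    alerts = []
    for name in sorted(live_manifest):
        if name in approvals:
            continue  # an approval record exists for this tool
        capabilities = set(live_manifest[name])
        severity = "high" if capabilities & HIGH_RISK else "review"
        alerts.append((severity, name))
    return alerts


live = {
    "report_failure": {"read"},
    "list_docs": {"read"},
    "run_patch": {"shell"},
}
approved = {"report_failure": "APPR-001"}
print(diff_against_baseline(live, approved))
# → [('review', 'list_docs'), ('high', 'run_patch')]
```

Running a check like this on every snapshot turns a newly appearing tool into an alert instead of a silent change to what the agent can do.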
