Supply Chain Compromise

codexui-android: A Refresh Token That Never Expires

Aikido disclosed an npm package that exfiltrated OpenAI Codex OAuth refresh tokens from ~/.codex/auth.json — and those tokens never expire

SecurityV0 Intelligence Team | OWASP: ASI06 | sv0 finding: nhi_compromise
openai codex ai-coding-agent npm-supply-chain refresh-token nhi-compromise

The Incident

On May 27, 2026, Aikido Security researcher Charlie Eriksen disclosed that codexui-android — an npm package advertised as a remote web UI for OpenAI Codex, with roughly 27,000 to 29,000 weekly downloads — had been silently exfiltrating Codex authentication artifacts from ~/.codex/auth.json on every developer machine that installed it. The first published version, 0.1.72, landed on or around April 10, 2026 and behaved exactly as advertised. The pivot occurred at version 0.1.82, roughly one month later: the same publisher pushed a tarball containing exfiltration logic into the npm registry while leaving the public GitHub repository (friuns2/codex-mobile) clean. Any developer auditing the source on GitHub would have seen a legitimate-looking project; any developer who actually ran npm install codexui-android shipped their Codex credentials to an attacker-controlled server.

The payload read access_token, refresh_token, id_token, and account_id from ~/.codex/auth.json, XOR-encrypted the contents with the key anyclaw2026, base64-encoded the result, and POSTed it to https://sentry.anyclaw.store/startlog. The anyclaw.store domain was registered on April 12, 2026, two days after the first benign npm version went live. That timing is strong WHOIS evidence that the exfiltration was planned from the outset and was not the result of a later account compromise.

The same operator extended the campaign onto Google Play with two Android apps that bundled the malicious npm build inside a PRoot Linux sandbox: OpenClaw Codex Claude AI Agent (package gptos.intelligence.assistant, 50,000+ installs) and a second Codex app published as codex.app by “BrutalStrike” (10,000+ installs). Both apps exfiltrated the same Codex auth artifacts to the same C2 endpoint. The npm publisher account, friuns (Igor Levochkin), is also the BrutalStrike Play Store identity.

Aikido’s disclosure named the package, the C2 domain, the exfiltration endpoint, the XOR key, and the Android packages. No CVE was assigned because no software vulnerability was exploited; the package’s behavior was the attack. MITRE ATT&CK coverage: T1195.002 (Supply Chain Compromise: Compromise Software Supply Chain), T1552.001 (Unsecured Credentials: Credentials In Files), and T1550.001 (Use Alternate Authentication Material: Application Access Token).
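
For responders who recover a request body from proxy or packet logs, the XOR-plus-base64 encoding described above is straightforward to reverse. The following is a minimal sketch, not the payload's actual code: it assumes the key is applied as a repeating-key XOR over the raw bytes (the disclosure names the key but this post does not specify the exact scheme), and the function names are hypothetical.

```python
import base64
from itertools import cycle

# Key named in the disclosure. Repeating-key XOR is an ASSUMPTION about
# how the payload applied it.
XOR_KEY = b"anyclaw2026"


def xor_with_key(data: bytes, key: bytes = XOR_KEY) -> bytes:
    """XOR data against a repeating key. The operation is its own inverse."""
    return bytes(b ^ k for b, k in zip(data, cycle(key)))


def decode_capture(blob_b64: str) -> str:
    """Reverse the described encoding: base64-decode, then XOR with the key."""
    raw = base64.b64decode(blob_b64)
    return xor_with_key(raw).decode("utf-8", errors="replace")
```

If a decoded capture yields JSON containing fields such as refresh_token or account_id, that is direct evidence of which credential left the machine and should drive which account gets rotated first.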

The Authority Path That Failed

The identity that carried execution authority at the moment of failure was not the npm package and not the human developer — it was the OpenAI Codex OAuth refresh token sitting in plaintext at ~/.codex/auth.json on every Codex user’s workstation. That token’s held scope is everything the developer’s linked Codex account can do: read source attached to a Codex workspace, invoke model calls against the developer’s billing account, and, most importantly, mint new short-lived access tokens indefinitely. The refresh token does not expire, is not device-bound, has no proof-of-possession requirement, and is a plain bearer credential. A one-time read of auth.json by any process the user runs is enough to grant the reader persistent, silent, indefinite ability to impersonate that developer to OpenAI services — without ever touching the victim’s machine again.

Two trust anchors failed in sequence. The first was the assumption that the npm tarball is equivalent to the source visible on GitHub. npm publishes are uploads, not builds: the publisher decides what bytes get shipped, and by default nothing in the registry binds the published artifact to the visible repository. The second, and the one that turned a weeks-long exposure window into a persistent breach, was a credential design choice inside OpenAI Codex’s own client: a long-lived, bearer-only refresh token written to disk with no rotation, no token binding, and no consent gate on third-party reads. The gap between what codexui-android was scoped to do (render a UI) and what its identity-on-the-host was authorized to do (everything the Codex account could) was total, and that gap was visible to anyone enumerating which non-Codex processes had file-read access to ~/.codex/auth.json.
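
The first failed assumption, that the tarball matches the repository, can be tested mechanically. The sketch below assumes you already have the published tarball on disk and a local checkout of the declared source; it hashes every file in both and reports what was shipped but is absent from, or different in, the source tree. The function names are hypothetical, and the "package/" prefix reflects the conventional npm tarball layout.

```python
import hashlib
import tarfile
from pathlib import Path


def _sha256(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def tarball_hashes(tarball_path: str) -> dict[str, str]:
    """Map each regular file in a .tgz to its SHA-256 digest."""
    hashes = {}
    with tarfile.open(tarball_path, "r:gz") as tar:
        for member in tar.getmembers():
            if not member.isfile():
                continue
            # npm tarballs conventionally nest everything under "package/".
            name = member.name.removeprefix("package/")
            hashes[name] = _sha256(tar.extractfile(member).read())
    return hashes


def source_hashes(source_dir: str) -> dict[str, str]:
    """Map each file in a checkout (excluding .git) to its SHA-256 digest."""
    root = Path(source_dir)
    return {
        p.relative_to(root).as_posix(): _sha256(p.read_bytes())
        for p in root.rglob("*")
        if p.is_file() and ".git" not in p.relative_to(root).parts
    }


def diff_tarball_against_source(tarball_path: str, source_dir: str) -> dict:
    """Report files that shipped but are missing from, or differ in, the source."""
    shipped = tarball_hashes(tarball_path)
    source = source_hashes(source_dir)
    return {
        "only_in_tarball": sorted(set(shipped) - set(source)),
        "content_differs": sorted(
            name for name in shipped.keys() & source.keys()
            if shipped[name] != source[name]
        ),
    }
```

Legitimate packages often ship build output that never appears in the repository, so a non-empty result is a prompt for review and not proof of tampering. The signal worth alerting on is unexplained executable code that exists only in the tarball.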

SecurityV0 Perspective

This is nhi_compromise. The exploitable asset was a non-human identity — the Codex OAuth refresh token — and the durable harm is not the npm tarball but the indefinite shadow access the stolen token now confers. The supply-chain mechanic was the delivery vehicle; the load-bearing failure is the credential design. The finding applies because the token’s held authority (account-scoped, mint-on-demand, non-expiring) was orders of magnitude larger than what any third-party tool needed to render a UI, and no system between the package and the OpenAI auth backend enforced that ceiling.

The evidence pack SecurityV0 would produce for this finding: a per-developer inventory of every long-lived AI-tool credential present on disk — ~/.codex/auth.json, ~/.config/anthropic/, ~/.gemini/, the equivalents for Cursor, Copilot CLI, and aider — with token type, scope, last-rotation timestamp, and expiry; the set of running processes (and their package provenance) that can read each artifact; and the delta between source-resolved SBOM entries (what GitHub shows for an installed npm package) and registry-resolved SBOM entries (what the actual tarball contains). Pre-incident, that pack answers two questions: which non-Codex processes have read auth.json in the last 30 days, and which installed npm dependencies do not hash-match their declared GitHub source. Post-incident, the same pack scopes the blast radius: which developer accounts had codexui-android installed at any point between April 10 and May 27, 2026, and which Codex API calls have been made from IPs or device contexts inconsistent with those developers since.
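
The file-level part of that inventory can be approximated with a short script. This is an illustrative sketch only, not SecurityV0 tooling: it covers the three paths named above (locations for Cursor, Copilot CLI, and aider are not given in this post and are left out), the function name is hypothetical, and it assumes a POSIX filesystem. Token type, scope, and expiry would require parsing each tool's own file format, which is not attempted here.

```python
import stat
import time
from pathlib import Path

# Paths relative to the home directory, taken from the text above.
CREDENTIAL_PATHS = [".codex/auth.json", ".config/anthropic", ".gemini"]


def inventory_credentials(home: Path) -> list[dict]:
    """List credential files under `home` with permissions and last-modified time."""
    rows = []
    for rel in CREDENTIAL_PATHS:
        target = home / rel
        if not target.exists():
            continue
        if target.is_file():
            files = [target]
        else:
            files = [p for p in target.rglob("*") if p.is_file()]
        for f in files:
            st = f.stat()
            rows.append({
                "path": str(f),
                "mode": stat.filemode(st.st_mode),
                "readable_by_group_or_other": bool(
                    st.st_mode & (stat.S_IRGRP | stat.S_IROTH)
                ),
                # mtime is only a rough proxy for the last rotation.
                "modified_utc": time.strftime(
                    "%Y-%m-%dT%H:%M:%SZ", time.gmtime(st.st_mtime)
                ),
            })
    return rows
```

Calling inventory_credentials(Path.home()) on a workstation gives one row per credential file. Tight file permissions do not close the gap described in this post, because the malicious package ran as the same user who owns the file; the inventory tells you what is exposed, not who has read it.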

What To Do

  • Rotate every Codex credential that touched a machine where codexui-android was installed. Sign out and re-authenticate Codex on each affected workstation to invalidate the captured refresh token, then audit Codex API activity for the full window (from the first publish on or around April 10, 2026 through your rotation timestamp) for calls from unfamiliar IP ranges or unusual model-usage patterns. Treat the refresh token, not just the access token, as compromised: a stolen refresh token stays valid no matter where else the account signs in, until it is explicitly revoked.
  • Inventory and uninstall the named indicators of compromise. Remove the npm package codexui-android from every developer workstation and CI runner. On Android, remove gptos.intelligence.assistant (“OpenClaw Codex Claude AI Agent”) and codex.app (BrutalStrike “Codex”). Block the C2 host sentry.anyclaw.store and the parent domain anyclaw.store at your egress proxy and EDR. The XOR key anyclaw2026 is a useful detection string for memory and on-disk forensics.
  • Treat AI-tool credential files as a first-class detection surface. Add file-access monitoring for ~/.codex/auth.json, ~/.config/anthropic/*, ~/.gemini/*, Cursor’s local credential cache, and the Copilot CLI token store. Alert when a process that is not the corresponding first-party CLI reads any of these files. The same control would have caught this attack at the first read of auth.json, without depending on registry telemetry.
  • Build SBOMs from registry artifacts, not from source. A source-resolved SBOM of codexui-android would have shown a clean GitHub repository; only a registry-resolved SBOM that hashed the actual npm tarball and diffed it against the declared source would have caught the divergence. For every npm and PyPI dependency in production, store the tarball hash, the published-from commit SHA, and the diff size between them — and alert on tarballs that do not correspond to a commit in the declared source repository.
  • Push your AI-tooling vendors toward short-lived, bound credentials. A non-expiring bearer refresh token sitting in plaintext on disk is the failure shape behind this incident; the same shape exists across most current AI developer tools. Ask your AI vendors (OpenAI, Anthropic, Google, Cursor, GitHub Copilot) for sender-constrained tokens via token binding (RFC 8471) or DPoP (RFC 9449), per-device proof-of-possession, short refresh-token lifetimes with reauthentication, and a documented revocation path. Until those land, every developer workstation is one stolen auth.json read away from an indefinite account takeover.
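
The indicator sweep in the second item above can be started with a simple string scan. The sketch below is a rough first pass under stated assumptions: it searches file contents under a directory for the C2 domain, the XOR key, and the package name given in this post. The function name and the size cap are hypothetical choices, and a plain substring match will also flag benign mentions such as incident notes or blocklists.

```python
import os

# Indicators taken from the text above. "anyclaw.store" also matches the
# full C2 host sentry.anyclaw.store.
IOC_STRINGS = [b"anyclaw.store", b"anyclaw2026", b"codexui-android"]


def scan_tree(root: str, max_bytes: int = 5_000_000) -> list[tuple[str, str]]:
    """Return (path, indicator) pairs for files under `root` containing an IOC."""
    findings = []
    # os.walk does not descend into symlinked directories by default.
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                if os.path.getsize(path) > max_bytes:
                    continue  # skip very large files to keep the sweep fast
                with open(path, "rb") as fh:
                    data = fh.read()
            except OSError:
                continue  # unreadable or vanished file
            for ioc in IOC_STRINGS:
                if ioc in data:
                    findings.append((path, ioc.decode("ascii")))
    return findings
```

Pointing scan_tree at a project root or a global node_modules directory will surface lockfile references to the package as well as any bundled copy of the payload. A hit on the package name alone means the dependency was declared; a hit on the domain or key inside installed code means the malicious build was present.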
