2026 OpenClaw Gateway connection troubleshooting: RPC probes · gateway closed(1000) · scope, tokens & sessions

OpenClaw has a common kind of false green: openclaw gateway status shows RPC checks as healthy, yet the client throws gateway closed (1000) or, after an upgrade, tools suddenly stop working. This article walks symptom → likely root cause → verification commands → fix. It covers four frequent failure classes (tokens, scopes, workspace paths, and model backends, including CLI-only routes that disable tools) and gives a six-step recovery runbook plus the log lines that matter under systemd and Docker. Alongside the on-site production observability, security hardening, and cross-platform install articles, this one focuses on connection and session consistency: restore the path first, then hand long-term monitoring and change control to the observability guide.

01

Name the symptom: RPC OK does not mean session, scope, and model backend all match

Gateway control-plane probes often answer only whether the process is up and the port responds. Client errors like gateway closed (1000) usually mean something else: the server closed the WebSocket or session, or auth and policy have drifted. Use the seven checks below on the front line. The more of them apply, the less you should rely on refreshing the UI; run the ordered restart and config validation in section 3 instead.

  1. Treating probe green as end-to-end green: RPC OK in status is a narrow check; device-class commands and tool execution channels can still fail when scopes are missing or the session expired.

  2. Token drift: environment variables, config files, and the token loaded by the Gateway process are not the same copy; rotating secrets on only one side yields intermittent success and bulk failure.

  3. Workspace path mismatch: when agents.defaults.workspace points at an old directory or container bind mounts are wrong, the tool layer may refuse work or disconnect quickly.

  4. CLI-only model backends: some *-cli/... routes intentionally disable file-class tools, which looks like “Gateway online but tools unavailable” and is easy to confuse with closed(1000).

  5. Dual process after upgrade: the package updated but an old Gateway still holds the port or PID files were not cleaned; the new process is half-started and probes hit the old listener.

  6. Tightened security policy: after enabling dmPolicy / networkPolicy, the handshake may succeed and the first payload is dropped by policy; compare allowlists in the security hardening article.

  7. No minimal repro bundle: tickets with half a line of error and no CLI version, config snippet, or recent change force tier-two guesswork and stretch recovery time.
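
The token drift check can be done without printing a secret: hash each copy and compare short fingerprints. The following is a minimal sketch, not an OpenClaw feature. The function names are illustrative, and the environment variable and file path in the usage comment are hypothetical placeholders for wherever your deployment keeps its token copies.

```bash
#!/usr/bin/env bash
# Sketch: compare token copies by fingerprint so the secret never reaches a terminal or ticket.
# token_fingerprint and tokens_match are illustrative helpers, not part of OpenClaw.

token_fingerprint() {
  # Print the first 12 hex characters of the SHA-256 of the first argument.
  if command -v sha256sum >/dev/null 2>&1; then
    printf '%s' "$1" | sha256sum | cut -c1-12
  else
    # macOS ships shasum instead of sha256sum.
    printf '%s' "$1" | shasum -a 256 | cut -c1-12
  fi
}

tokens_match() {
  # Succeed only when both values are non-empty and share a fingerprint.
  [ -n "$1" ] && [ -n "$2" ] &&
    [ "$(token_fingerprint "$1")" = "$(token_fingerprint "$2")" ]
}

# Usage (hypothetical variable name and path):
#   tokens_match "$MY_GATEWAY_TOKEN" "$(cat /path/to/token-file)" \
#     && echo "same token" || echo "token drift"
```

Running the same comparison in every terminal that talks to the Gateway quickly shows whether one shell still exports a pre-rotation value.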

The shared root cause is compressing multi-layer health in a distributed system into a single boolean. Next, a table maps what you see to the commands you should run first so you are not swimming blindly in logs.

02

Symptom map: what you see → likely cause → checks to run first

Pin this table at the top of the on-call runbook: align on the exact string you see, then pick the shortest verification path. Exact subcommands depend on your OpenClaw build; the names below are illustrative of intent.

| What you see | Likely root cause | Checks to run first |
| --- | --- | --- |
| RPC OK, but device or channel ops report closed(1000) | Session scope does not match the action, or token differs from the Gateway runtime | openclaw status --all; trace token sources; review allowlists in security config |
| After upgrade, “all tools grayed out” | Model routing on a CLI-only backend, or Gateway not restarted to load new config | openclaw models list; switch off CLI-only routes, then openclaw gateway restart |
| Intermittent success, bulk failure | Multiple terminals with different tokens, or a reverse proxy caching stale connections | Unify env exports from one shell; clear client sessions; check proxy idle timeout |
| Path-class tools refuse to run | Workspace config does not match the real repo path | Diff openclaw config get agents.defaults.workspace against disk |
| Disconnects right after policy change | dmPolicy / networkPolicy tightened; first packet rejected | Re-read the security hardening section; temporarily relax for a known session to validate |

Probe green only proves the control plane is alive; to prove you can work reliably you must align token, workspace, model backend, and policy.

For fuller logging and rollback cadence see the production observability article: here the goal is to decide in about ten minutes whether you are on “restart + validate” or “config rollback.”
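
The workspace row in the table ("diff the configured path against disk") can be sketched as a small shell check. check_workspace is an illustrative helper, not an OpenClaw command, and the sketch assumes that openclaw config get prints the bare path on one line; adjust the parsing if your build prints quotes or JSON.

```bash
#!/usr/bin/env bash
# Sketch: verify that a configured workspace path exists and is usable by the current user.
# check_workspace is an illustrative name; the return codes are arbitrary but distinct.

check_workspace() {
  # $1: the configured workspace path.
  local ws="$1"
  if [ -z "$ws" ]; then
    echo "workspace is empty" >&2
    return 1
  fi
  if [ ! -d "$ws" ]; then
    echo "workspace is not a directory: $ws" >&2
    return 2
  fi
  if [ ! -r "$ws" ] || [ ! -w "$ws" ]; then
    echo "workspace is not readable and writable by $(id -un): $ws" >&2
    return 3
  fi
  echo "workspace ok: $ws"
}

# Usage (assumes the command prints a bare path):
#   check_workspace "$(openclaw config get agents.defaults.workspace)"
```

Run it as the same user that runs the Gateway, and inside the container when Docker is involved, since a path that exists on the host may be missing from the bind mount.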

03

Six-step recovery runbook: from “connected” back to “stable tool use”

The order deliberately places low-cost steps first and config rollback later, so you do not open firewalls or reinstall immediately. In production, note in the ticket whether impact is “this CLI only” or “multi-user sessions.”

  1. Freeze concurrent work: ask teammates to pause new sessions and batch jobs so a Gateway restart is not drowned in reconnect storms.

  2. Capture a state snapshot: run openclaw --version, openclaw status --all (if available), and save output; record recent token rotation or openclaw.json edits.

  3. Validate workspace and model routing: confirm workspace points at a real directory; use openclaw models list to ensure you did not select a CLI-only backend by mistake.

  4. Run doctor / validate: use the CLI’s openclaw doctor, config:validate, or equivalent to fix obvious mismatches.

  5. Restart Gateway in order: openclaw gateway restart (or restart the systemd unit / container) so the old process exits before the new one listens.

  6. Minimal acceptance tests: one read-only tool call and one write call; only then reopen for others. If it still fails, go to section 4 for system logs.

bash · troubleshooting pipeline (illustrative)
openclaw --version
openclaw status --all 2>&1 | tee /tmp/openclaw-status.txt
openclaw config get agents.defaults.workspace
openclaw models list
openclaw doctor
openclaw gateway restart
# Then run one minimal read tool call and one write call to verify session and scope
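
Run as a plain sequence, the pipeline above keeps going after a failed step, so a broken doctor run can be followed by a restart anyway. One way to stop at the first failure and keep every step's output for the ticket is a small wrapper. This is a sketch under assumptions: run_step and SNAP_DIR are illustrative names, and the subcommands in the usage comment are the same illustrative ones used above.

```bash
#!/usr/bin/env bash
# Sketch: run each runbook step, save its output, and report the first failure.
# run_step and SNAP_DIR are illustrative names, not part of OpenClaw.

SNAP_DIR="${SNAP_DIR:-$(mktemp -d)}"

run_step() {
  # $1: short label used as the output file name; remaining arguments: the command to run.
  local label="$1"
  shift
  local out="$SNAP_DIR/$label.txt"
  if "$@" >"$out" 2>&1; then
    echo "ok    $label"
  else
    local rc=$?
    echo "FAIL  $label (exit $rc), see $out"
    return "$rc"
  fi
}

# Usage: chaining with && stops at the first failing step.
#   run_step version   openclaw --version &&
#   run_step status    openclaw status --all &&
#   run_step workspace openclaw config get agents.defaults.workspace &&
#   run_step models    openclaw models list &&
#   run_step doctor    openclaw doctor &&
#   run_step restart   openclaw gateway restart
```

The files left in SNAP_DIR double as the state snapshot from step 2, so nothing has to be re-run when the ticket is escalated.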

Note: when the Gateway runs on a dedicated remote Mac, long SSH sessions and GUI prompts can still interrupt the toolchain; for stable unattended execution, pair with the directory and session isolation checklist in the agent node article.

04

systemd and Docker: stale processes, listeners, and the log lines that matter

If the restart in section 3 still yields closed(1000), first suspect a process that never exited, or bind mounts that have drifted inside a container. As in the observability article: establish who is listening and which user started it before debating config.

systemd (bare-metal Linux): use systemctl status to see whether the main process is crash-looping, then journalctl -u <unit> -n 200 --no-pager to look for close codes and policy keywords.

Docker Compose: compare timestamps between docker compose ps and docker compose logs --tail=200 gateway to confirm the logs come from the container that is currently running.

If you deployed with the Linux systemd + Tunnel guide, also confirm the tunnel and loopback binding are not pointing at a stale port.
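
Two hundred log lines are easier to scan after filtering for the lines that mention a close or a policy decision. The filter below is a minimal sketch: filter_close_lines is an illustrative name, and the keyword list is an assumption about what the Gateway logs contain, so adapt it to the strings your build actually emits.

```bash
#!/usr/bin/env bash
# Sketch: keep only log lines that mention a close or a policy decision.
# The keyword list is an assumption; adjust it to your Gateway's real log wording.

filter_close_lines() {
  # Reads log text on stdin.
  # -n prefixes each match with its line number in the input, -i ignores case.
  # "|| true" keeps the exit status at 0 when nothing matches.
  grep -Ein 'closed|close code|polic(y|ies)|denied|rejected' || true
}

# Usage, with the commands named above:
#   journalctl -u <unit> -n 200 --no-pager | filter_close_lines
#   docker compose logs --tail=200 gateway | filter_close_lines
```

An empty result is information too: if no close or policy line appears around the disconnect, the Gateway you are reading logs from may not be the process the client reached, which points back at a stale listener.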

Warning: do not temporarily expose the Gateway to the public internet to “test connectivity” before you know which interface is listening; validate inside the constraints of the security hardening article so troubleshooting does not become an incident.

05

Still failing: minimal information bundle (for tier two or the vendor)

These fields shorten the second round of diagnosis; redact before sharing externally.

  • Version triple: CLI version, Gateway image digest or package version, OS minor version.
  • Timeline: first closed(1000) in UTC, plus prior changes (token, policy, upgrade).
  • Log excerpts: about 200 lines before and after Gateway start and disconnect, including session id if present and policy hit lines.
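
Redaction is the step most often skipped under time pressure, so it helps to make it a pipe stage. The following is a sketch, not a complete scrubber: redact_secrets is an illustrative name, the key names it looks for are assumptions, and it only masks key=value or key: value pairs. Bearer headers, URLs with embedded credentials, and similar forms still need a manual read before anything leaves the team.

```bash
#!/usr/bin/env bash
# Sketch: mask values that follow token/secret/password style keys before sharing logs or config.
# The key list is an assumption and is not exhaustive.

redact_secrets() {
  # Reads text on stdin. Handles KEY=value, key: value, and "key": "value" forms.
  # Casing variants are spelled out because the case-insensitive sed flag is a GNU extension.
  sed -E 's/(([Tt]oken|TOKEN|[Ss]ecret|SECRET|[Pp]assword|PASSWORD)"?[[:space:]]*[:=][[:space:]]*"?)[^"[:space:],]+/\1[REDACTED]/g'
}

# Usage:
#   journalctl -u <unit> -n 200 --no-pager | redact_secrets > gateway-excerpt.txt
#   redact_secrets < openclaw.json > openclaw.redacted.json
```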

Running the Gateway only on a laptop is fragile to sleep, OS updates, and multi-user desktop sessions; a small Linux box often lacks the macOS toolchain and graphical edge cases you need. When OpenClaw must sit on a long-lived, contract-friendly execution tier, a dedicated remote Mac is usually steadier than repeatedly borrowing hardware. Compared to building your own Mac rack, NodeMini Mac Mini cloud rental makes it easier to define a repeatable node profile so “Gateway + toolchain” hands off like a VPS estate.

FAQ

Frequently asked questions

Probes cover a narrow path; session, scope, token, and model backend can still be out of sync. Follow section 3 for an ordered restart and run doctor/validate. To plan execution-tier capacity, see Mac Mini rental rates and the help center.

Check model routing for a CLI-only backend; confirm the Gateway restarted and loaded new config; then verify workspace paths and complete token rotation across CLI and service.

Open the blog OpenClaw category for install, systemd, Docker, security, and observability posts; cross-check connectivity baselines in the help center.