2026 OpenClaw Gateway startup troubleshooting: Not ready · ports and memory · Docker / systemd logs · doctor checks

Onboard works, but the CLI or dashboard stays on Gateway not ready for a long time. That usually happens before the process is really listening, which is a different layer from the one covered in this site's gateway closed (1000) troubleshooting article (sessions, scopes, tokens). This post gives platform engineers the shortest path: a seven-item checklist to clear ports, memory, timeouts, images, and volume permissions; a bare-metal systemd vs Docker comparison table to pick the right log surface; and a six-step runbook (including openclaw doctor and sample log commands). Read it alongside the install overview, Docker production, and observability guides.

01

Before startup finishes: seven hidden reasons the Gateway keeps spinning

not ready usually means the control plane has not yet seen a successful listen, ready dependent processes, or passing health probes. It is not the same as being kicked offline by policy after a WebSocket is up; for that case, use the closed (1000) article first. The seven items below are a first-week platform self-check after install.

  1. Port held by an old process or another service: after upgrades or repeated docker compose up, an old Gateway on the host can still bind the same port; the new process half-starts and the CLI only shows not ready.

  2. Insufficient memory and swap leading to OOM: on a small VPS that pulls a model and starts the Gateway at the same time, the Node process can be killed before readiness. Logs often show exit code 137 or a silent disappearance.

  3. Startup timeout too short: cold pulls or native builds exceed the default startupTimeout, health checks fail early, and you see “forever not ready.”

  4. Reading the wrong log: when mixing systemd and Docker debugging, you tail the container but forget that a host unit still launches an old binary, or the opposite.

  5. Named volume permissions and UID mapping: the container user cannot write state; the Gateway crash-loops and the outside world only sees not ready.

  6. Image digest vs config drift: compose points at :latest but cached layers disagree with new openclaw.json fields and the entry script exits early.

  7. Mistaking network issues for startup failure: model or plugin registry timeouts leave the process stuck in init; separate DNS/egress problems from the Gateway itself.
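
For item 1, the port question can be answered before reading any Gateway log. The following is a minimal sketch, assuming lsof or ss is installed on the host; port_listeners is a helper name invented here, and the port number is whatever your own configuration uses.

```shell
# List TCP listeners on a port. port_listeners is a helper name invented for this sketch.
# Returns 0 if something is listening, 1 if nothing is, 2 if no supported tool is installed.
# Without root, lsof may not show sockets owned by other users; rerun with sudo if in doubt.
port_listeners() {
  port="$1"
  if command -v lsof >/dev/null 2>&1; then
    # -nP: skip DNS and service-name lookups; -sTCP:LISTEN: listening sockets only.
    lsof -nP -iTCP:"$port" -sTCP:LISTEN
  elif command -v ss >/dev/null 2>&1; then
    # -H: no header, -l: listening, -t: TCP, -n: numeric ports.
    out=$(ss -H -ltn "sport = :$port")
    [ -n "$out" ] && printf '%s\n' "$out"
  else
    echo "neither lsof nor ss found" >&2
    return 2
  fi
}
```

Call it as port_listeners followed by your configured port. If it prints a PID you did not expect, stop that process before starting the new Gateway rather than restarting in a loop.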

What these share: the process never reaches a servable state, so status/RPC can look “half green.” Put them in your ticket template, then use the table below to pick a log surface and stop bouncing between journal and docker logs.

Aligned with cross-platform install: that guide covers “it installs”; this one covers “it will not come up.” Aligned with observability: that guide covers long-term metrics and upgrade/rollback; this one covers hard failures before first readiness.

If you run the Gateway on a dedicated remote Mac or a small Linux VPS, put minimum memory and disk headroom in change review, not only version pins; otherwise not ready spikes right before demos. The table below turns “which log do I read?” into a decision.
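
A memory floor is easier to enforce in change review when it is a command rather than a sentence. This is a sketch for Linux hosts only, assuming /proc/meminfo is available; the helper names are hypothetical and the floor value is your own number, since this article does not prescribe one.

```shell
# Print MemAvailable in MiB from a meminfo-format file (Linux /proc/meminfo by default).
# Exits non-zero if the field is missing.
mem_available_mib() {
  awk '/^MemAvailable:/ { printf "%d\n", $2 / 1024; found = 1 } END { exit !found }' \
    "${1:-/proc/meminfo}"
}

# Succeed only when available memory is at or above a floor given in MiB.
# Usage: mem_headroom_ok <floor_mib> [meminfo_file]
mem_headroom_ok() {
  floor_mib="$1"
  avail=$(mem_available_mib "${2:-/proc/meminfo}") || return 2
  [ "$avail" -ge "$floor_mib" ]
}
```

Running mem_headroom_ok with your floor just before starting the Gateway gives a yes/no answer you can paste into the change record. On macOS the same idea needs a different data source, such as vm_stat.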

Finally: do not temporarily expose the Gateway to the public internet to “test connectivity” before you know what is listening; follow least exposure in security hardening.

02

Bare-metal systemd vs Docker: which log to read and which path to verify first

When debugging not ready, the first decision is entry shape: is the process started by systemd or by compose in a container? Mixing them wastes time.

| Dimension | Linux / macOS bare metal + systemd (or launchd) | Docker / Compose |
| --- | --- | --- |
| First log screen | journalctl -u <unit> -n 200 --no-pager or the platform equivalent | docker compose logs --tail=200 <service>, aligned to restart timestamps |
| Port conflict checks | On the host, lsof -nP -iTCP:<port> -sTCP:LISTEN (illustrative) | Check listeners on both host and inside the container; published ports vs existing host services |
| Permissions / volumes | State directory ownership vs runtime user | Named volume UID, read-only root vs writable mounts; see Docker production |
| Timeouts and retries | Unit TimeoutStartSec, Restart= policy | Healthcheck start_period, retries, and image cold-start time |
| Upgrade / rollback hooks | Package version + last-known-good config backup path | Pinned image digest + versioned compose files; same spirit as the observability rollback section |
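
The first row of the table can be turned into a small decision helper. This is a sketch, not an official tool: log_cmd and active_launchers are invented names, and openclaw-gateway and gateway are only the example unit and service names used in this article.

```shell
# Print (not run) the first-screen log command for a given launcher.
log_cmd() {
  case "$1" in
    systemd) printf 'journalctl -u %s -n 200 --no-pager\n' "$2" ;;
    docker)  printf 'docker compose logs --tail=200 %s\n' "$2" ;;
    *)       printf 'unknown launcher: %s\n' "$1" >&2; return 2 ;;
  esac
}

# Report which launchers currently look active, so a stale host unit is not missed.
# Run from the compose project directory. Usage: active_launchers <unit> <service>
active_launchers() {
  unit="$1"; service="$2"
  if command -v systemctl >/dev/null 2>&1 && systemctl is-active --quiet "$unit"; then
    echo systemd
  fi
  if command -v docker >/dev/null 2>&1 &&
     [ -n "$(docker compose ps -q "$service" 2>/dev/null)" ]; then
    echo docker
  fi
  return 0
}
```

If active_launchers openclaw-gateway gateway prints both systemd and docker, two copies are probably competing for the same port, and that is worth resolving before reading either log in depth.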

“Confirm listen, then talk sessions”: in the not ready phase the best ROI is ports + resources + the right log window, not changing tokens first.

If you deploy with Ubuntu 24.04 + systemd + Tunnel, also verify the tunnel upstream still points at the current port; a mismatch keeps the outside world not ready while curl 127.0.0.1 looks fine locally.

In Docker, a common misread is “container Exited but compose still looks stuck in creating”: use docker compose ps -a for exit codes, then return to volume permissions and the entry script.
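
Exit codes from docker compose ps -a are easier to act on once translated. The sketch below relies on the common shell convention that a status above 128 means the process was ended by signal (status minus 128); explain_exit is a name made up for this example, and the suggested next steps are first guesses, not diagnoses.

```shell
# Translate a container or process exit status into a first guess.
explain_exit() {
  code="$1"
  if [ "$code" -eq 0 ]; then
    echo "exit 0: clean exit; check why the entry script returned"
  elif [ "$code" -gt 128 ]; then
    sig=$((code - 128))
    case "$sig" in
      9)  echo "exit $code: SIGKILL; check for OOM kills" ;;
      15) echo "exit $code: SIGTERM; stopped by an orchestrator or operator" ;;
      *)  echo "exit $code: terminated by signal $sig" ;;
    esac
  else
    echo "exit $code: application error; read the last log lines before the exit"
  fi
}
```

For a 137, docker inspect --format '{{.State.ExitCode}} {{.State.OOMKilled}}' on the container shows whether Docker itself recorded an OOM kill.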

Once the process is listening and probes pass, if clients still misbehave, switch to the session path in closed (1000).

03

Six-step runbook: from not ready to reliably accepting connections

The order below keeps cheap checks first: confirm resources and ports before you touch config and images. Exact subcommands depend on your OpenClaw build.

  1. Freeze concurrent retries: pause teammates’ auto-reconnect scripts so a reconnect storm does not bury your investigation.

  2. Capture version and resource snapshot: put openclaw --version, uname -a, and free memory/disk in the ticket.

  3. Verify ports and stale processes: run a listen check on the configured port; if you find an old PID, stop and start in order per systemd/Docker docs.

  4. Run doctor / validate: openclaw doctor or equivalent, fix obvious gaps, then retry startup.

  5. Widen the startup observation window: where supported, temporarily raise gateway.startupTimeout or compose healthcheck start_period to separate “slow” from “dead.”

  6. Minimal acceptance: after local curl health or the official status subcommand reports ready, restore other users; if it still fails, escalate with log samples from section 4.
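
Steps 5 and 6 amount to watching a readiness check for longer than the default before giving up. A generic polling loop is one way to do that from the outside. This is a sketch: wait_until is an invented helper, and the command you pass to it (a local curl against a health path, or a status subcommand) depends on your OpenClaw build.

```shell
# Poll a command until it succeeds or a deadline passes.
# Usage: wait_until <timeout_seconds> <interval_seconds> <command> [args...]
# Returns 0 on the first success, 1 if the deadline passes first.
wait_until() {
  timeout_s="$1"; interval_s="$2"; shift 2
  deadline=$(( $(date +%s) + timeout_s ))
  while :; do
    if "$@" >/dev/null 2>&1; then
      return 0
    fi
    if [ "$(date +%s)" -ge "$deadline" ]; then
      return 1
    fi
    sleep "$interval_s"
  done
}

# Example (the URL path is a placeholder, not a documented endpoint):
# wait_until 180 3 curl -fsS "http://127.0.0.1:$PORT/health"
```

A process that becomes ready after 90 seconds is slow; one that never passes within several minutes while its logs stay silent is more likely dead or blocked.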

bash · pre-start troubleshooting pipeline (illustrative)
openclaw --version
openclaw doctor
# systemd example:
# journalctl -u openclaw-gateway -n 120 --no-pager
# Docker example:
# docker compose logs --tail=120 gateway
# then restart the unit / compose service per distro docs

Note: if CLI logs show model or plugin download timeouts, separate egress from the Gateway itself; during a maintenance window you can temporarily widen egress allowlists, then tighten again—see security hardening.
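
To separate egress from the Gateway, probe the registry or model host directly from the same machine or container. The sketch below maps a few well-known curl exit codes to a failure layer; the helper names are hypothetical, and the URL to probe is whichever endpoint your logs show timing out.

```shell
# Map a curl exit code to the network layer that most likely failed.
explain_curl_exit() {
  case "$1" in
    0)     echo "ok: egress to this URL works; look at the Gateway itself" ;;
    6)     echo "dns: could not resolve host" ;;
    7)     echo "connect: host resolved but TCP connect failed" ;;
    28)    echo "timeout: no answer within the time limit" ;;
    35|60) echo "tls: handshake or certificate problem" ;;
    *)     echo "other: curl exit $1" ;;
  esac
}

# Probe one URL and explain the result. Usage: probe_egress <url> [max_seconds]
probe_egress() {
  rc=0
  curl -sS -o /dev/null --max-time "${2:-10}" "$1" || rc=$?
  explain_curl_exit "$rc"
  return "$rc"
}
```

A dns or connect result points at resolver or firewall rules, which is the case for the temporary allowlist change described in the note above; an ok result sends you back to the Gateway logs.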

Aligned with install overview: if you just changed Anthropic billing or API key shape, keep env vars and config files in one source of truth or the process can block parsing keys and look like not ready.

On a dedicated remote Mac running the Gateway long term, put launchd/unit restart policy and disk cleanup in the same runbook so full disks do not cause a loop of not ready.

04

Symptom map: common not ready phrases → do these three steps first

Pin this “plain-language description → action” map in on-call chat to cut noise. If symptoms already involve WebSocket close codes, switch to closed (1000).

  • Port already in use: stop the old process before starting the new one; do not keep scaling without freeing the port.
  • Out of memory: lower concurrency or add swap/upgrade before tuning knobs.
  • Image pull failed: run docker pull or fix registry credentials before touching Gateway config.
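
The same map can be applied to a log window mechanically. This is a rough sketch: the patterns are common Node, Linux kernel, and Docker error strings, and it is an assumption that your Gateway build surfaces them verbatim; triage_log is a name invented here.

```shell
# Read log text on stdin and print one hint per matched symptom class.
# Returns 0 if at least one class matched, 1 otherwise.
triage_log() {
  log=$(cat)
  hit=1
  if printf '%s\n' "$log" | grep -qiE 'EADDRINUSE|address already in use'; then
    echo "port: stop the old process before starting the new one"; hit=0
  fi
  if printf '%s\n' "$log" | grep -qiE 'out of memory|oom-kill|Killed process'; then
    echo "memory: lower concurrency or add swap/upgrade"; hit=0
  fi
  if printf '%s\n' "$log" | grep -qiE 'pull access denied|manifest unknown'; then
    echo "image: run docker pull or fix registry credentials"; hit=0
  fi
  return "$hit"
}
```

Pipe a log window into it, for example docker compose logs --tail=120 gateway 2>&1 | triage_log. No output means none of the three classes matched, not that the log is clean.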

Warning: do not run docker system prune-style cleanup in production without a paper trail; you can delete named volume backups still in use. Put cleanup paths in the change record.

With observability: after not ready is fixed, tag the root cause on your dashboard (port/memory/image/volume) so the same class does not reopen next week.

If the Gateway runs with MCP child processes on the same host, startup must also cover child binary paths and sandbox mounts visible inside the container—or the parent blocks in tool discovery.

05

Still failing: minimum packet for tier-two (quotable)

Use these fields to align across teams; redact before sharing externally.

  • Deployment topology: bare-metal systemd / Docker Compose / other, unit or compose service name, listen address (loopback or behind a reverse proxy).
  • Resource curve: memory peaks during the not ready window, whether OOM fired, and whether free disk fell under the 20% safety line.
  • Log window: about 120 lines before and after startup failure from journal or compose logs, plus CLI version and image digest when applicable.
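
The disk field in the resource curve can be filled in with a one-line check. This is a sketch using POSIX df -P output, with the 20% line from the list above as the default threshold; the helper names are hypothetical, and the path to check is wherever your Gateway keeps state.

```shell
# Print free space on the filesystem holding a path, as an integer percentage.
# df -P prints a header plus one row; column 5 is used capacity such as "42%".
disk_free_pct() {
  df -P "${1:-/}" | awk 'NR == 2 { sub(/%/, "", $5); print 100 - $5 }'
}

# Succeed when free space is at or above a threshold (default 20 percent).
# Usage: disk_headroom_ok [path] [min_free_pct]
disk_headroom_ok() {
  pct=$(disk_free_pct "${1:-/}") || return 2
  [ -n "$pct" ] && [ "$pct" -ge "${2:-20}" ]
}
```

Recording the number from disk_free_pct at failure time, rather than only a pass/fail result, lets tier-two see how close the host was to the line.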

Running the Gateway only on a personal laptop is fragile to sleep and OS updates, and a tiny-memory VPS often OOMs during cold start. When you need a long-lived, contract-friendly macOS execution tier for OpenClaw and its toolchain, a dedicated remote Mac is usually steadier than borrowing a laptop. Compared to ad-hoc hardware, NodeMini Mac Mini cloud rental gives fixed SSH access and clear disk tiers, so “Gateway + toolchain” can be handed over like any other managed node. Specs and pricing: Mac Mini rental rates; onboarding questions: help center; start the OpenClaw series from the blog category.

FAQ

Use this article when the process is not listening or you suspect ports, memory, or timeouts; use closed (1000) when a WebSocket was up and policy or session closed it. For capacity and onboarding, see Mac Mini rental rates and the help center.

For a Docker deployment, collect the output of docker compose logs --tail=120 gateway plus host dmesg and a resource snapshot for the same time window; if you suspect volume permissions, add the exit codes from docker compose ps -a.

Open the blog OpenClaw category for install, Docker, systemd, security, observability, and MCP posts; this article fits the “will not start” phase.