If you can install OpenClaw but do not want the Gateway bound straight to a public interface, it is easy to stall on an Ubuntu 24.04-class VPS: which listen address to choose, how to make systemd restart the service predictably, and how Cloudflare Tunnel (or an equivalent) exposes only a loopback port behind a controlled hostname. This article complements the Docker production and multi-platform install guides with a bare metal + system service + tunnel path: environment baseline, Gateway configuration, example unit file, tunnel routing, and symptom-based troubleshooting, plus a typical topology with a stable remote Mac as the execution layer.
The multi-platform install guide focuses on getting something running; the Docker production guide fits teams already standardized on containers and Compose-orchestrated delivery. This article is for readers who want the Gateway on bare-metal Ubuntu 24.04 with minimal abstraction, systemd for crash recovery and boot, and Cloudflare Tunnel (or a self-managed reverse proxy) to strip TLS and access control out of the app. The three paths are not mutually exclusive—you can prove it on bare metal first, then containerize.
The six pain points below are the most common “mental model gaps” we see in support tickets. Close these gaps first; until you do, the checklists in later sections will not help much.
Treating “curl works” as “production safe”: Listening on 0.0.0.0 without a front proxy pushes auth and TLS onto application defaults; one config drift can invite whole-internet scanning.
Ignoring Node runtime coupling with glibc: Mixing stale prebuilt binaries or wrong NODE_OPTIONS on Ubuntu 24.04 yields “works locally, never comes up under systemd” false positives.
Casually using /root for workdirs and permissions: After an upgrade or a logrotate run changes ownership, the Gateway may fail to read its config, and systemd enters a tight restart loop while the logs drown in noise.
Health checks that only test the process: Without HTTP readiness and upstream model connectivity, load balancers still send traffic while the tunnel returns 502 and you debug the wrong layer.
Tunnel up but ingress points at the wrong port: When cloudflared ingress and the real Gateway listen address diverge, DNS resolves but traffic never reaches the right loopback port.
Using the Linux gateway as a universal executor: Forcing Xcode, simulators, or Apple Silicon–specific work onto a VPS masquerades topology mistakes as “OpenClaw is unstable.”
After this section you should know whether “loopback + tunnel” is the right path for you; if yes, continue with the environment baseline and commands below.
Do not optimize only for “fast install”; weigh attack surface, observability, and rollback alongside it. Use the table to align reviews; substitute real ports and image tags from your environment.
| Dimension | Gateway on public interface | Docker / Compose production | systemd + loopback + Tunnel |
|---|---|---|---|
| Exposure | Largest; relies on app-layer TLS and firewall rules | Determined by published ports and network mode; easy to misconfigure host networking | Process listens on 127.0.0.1 only; public entry is the tunnel or edge |
| Certs and DNS | Self-managed ACME or manual rotation | Often fronted by Traefik / Caddy or a cloud LB | TLS terminated at Cloudflare edge (or equivalent) |
| Restarts and debugging | Depends on external supervisors or manual ops | Restart policies and log drivers need explicit design | systemd restart policies and journal ordering are straightforward |
| Best fit | Minimal PoC; not recommended for production | Teams with container standards and image supply chains | Single VPS, small teams that want scriptable reproducibility |
The production question is not “can it be reached,” but “who is authorized on which path”—shrinking the listen address to loopback moves that problem to the edge, where it belongs.
On Ubuntu 24.04 LTS, follow the current Node.js LTS major (as of early 2026 often 22.x or whatever nodejs.org recommends). nvm, fnm, or NodeSource are all fine, but pin the minor version in production and document upgrade windows. Distribution packages for node are usually too old to be the sole source for the Gateway.
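One low-cost way to enforce the pin is to fail deploys fast when the runtime drifts. A minimal sketch, assuming a pinned major of 22 (the pin value and the helper name are illustrative, not upstream conventions):

```shell
#!/bin/sh
# Fail fast if the Node major version drifts from the documented pin.
# PINNED_MAJOR=22 is an assumption; set it to whatever your team pinned.
PINNED_MAJOR=22

check_node_major() {
  # $1 is a version string such as "v22.11.0" (output of `node --version`)
  v="${1#v}"          # strip the leading "v"
  major="${v%%.*}"    # keep everything before the first dot
  [ "$major" = "$PINNED_MAJOR" ]
}

# On the real host: check_node_major "$(node --version)" || exit 1
```

Wire this into your deploy script before `npm ci` so a drifted runtime never reaches the restart loop stage.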
Firewall baseline: SSH key-only auth, ufw default deny incoming, and do not open public rules for the Gateway service port—public access goes through the tunnel only. For rare direct debugging, use time-bounded allowlists or a jump host and record when access must be revoked.
Dedicated user and directories: e.g. user openclaw, runtime state under /var/lib/openclaw, config under /etc/openclaw (files 640, directories 750), not mixed under root.
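The 640/750 layout can be rehearsed in a scratch prefix before touching the host. The PREFIX indirection below exists only so the dry run needs no root; on the real box drop it and add `-o openclaw -g openclaw` to the `install` calls:

```shell
#!/bin/sh
# Dry-run of the directory/permission layout described above.
# PREFIX is only for unprivileged rehearsal; on the host use e.g.
#   install -d -m 750 -o openclaw -g openclaw /var/lib/openclaw
PREFIX="${PREFIX:-$(mktemp -d)}"

install -d -m 750 "$PREFIX/var/lib/openclaw"   # runtime state, 750
install -d -m 750 "$PREFIX/etc/openclaw"       # config dir, 750

touch "$PREFIX/etc/openclaw/gateway.env"
chmod 640 "$PREFIX/etc/openclaw/gateway.env"   # config files, 640

stat -c '%a %n' "$PREFIX/etc/openclaw" "$PREFIX/etc/openclaw/gateway.env"
```

`install -d -m 750` sets the mode regardless of the caller's umask, which is why it is preferable to `mkdir -p` followed by a hopeful `chmod`.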
Pinned Node and lockfiles: Keep package-lock.json or pnpm-lock.yaml in repo or on the deploy host; use npm ci or equivalent in production to avoid accidental major upgrades.
Validate glibc and native addons: If the Gateway pulls native modules, run a clean install and a smoke script on the target host to rule out dlopen failures.
Timezone and log timestamps: Set timedatectl to UTC or a team-standard zone so tunnel and edge logs correlate.
Enable ufw baseline: ufw allow OpenSSH then ufw enable; keep service ports closed and verify with ss -tlnp that the Gateway is not reachable from the public side.
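To make the `ss -tlnp` verification scriptable, a sketch that flags wildcard binds for a given port (the helper name is mine; the column layout is standard `ss -tln` output):

```shell
#!/bin/sh
# Exit 0 if the port is bound on a wildcard address (0.0.0.0, [::], or *),
# i.e. potentially reachable from outside; exit 1 if only loopback-bound
# or not bound at all.
publicly_bound() {
  port="$1"
  ss -tln 2>/dev/null | awk -v p="$port" '
    $4 ~ (":" p "$") && ($4 ~ /^0\.0\.0\.0/ || $4 ~ /^\[::\]/ || $4 ~ /^\*/) { found = 1 }
    END { exit !found }'
}

# Usage: publicly_bound 8787 && echo "WARNING: gateway is reachable from outside"
```

Run it from CI or a nightly cron against the Gateway port so a config drift back to 0.0.0.0 is caught before a scanner finds it.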
Reserve a health endpoint: Before attaching the tunnel, document curl -fsS http://127.0.0.1:PORT/health (path per your config) in the runbook and align it with ExecStartPost or external probes.
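A retry wrapper around that curl makes the runbook line usable from ExecStartPost or an external monitor. A minimal sketch; the function name, retry defaults, and the /health path are illustrative:

```shell
#!/bin/sh
# Poll the loopback health endpoint until it answers or retries run out.
# Port and /health path are placeholders per your Gateway config.
probe() {
  url="$1"
  tries="${2:-10}"
  i=0
  while [ "$i" -lt "$tries" ]; do
    curl -fsS --max-time 2 "$url" >/dev/null 2>&1 && return 0
    i=$((i + 1))
    sleep 1
  done
  return 1
}

# Usage (e.g. from ExecStartPost=/usr/local/bin/openclaw-probe):
# probe "http://127.0.0.1:8787/health" 10 || exit 1
```

Keeping the probe against 127.0.0.1 rather than the public hostname is deliberate: it separates “process not ready” from “tunnel broken,” which matters in the troubleshooting section below.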
Note: OpenClaw CLI flags and config filenames follow upstream documentation; paths and ports in the unit and YAML below are placeholders—adjust to your install before enabling.
Listen address: Prefer 127.0.0.1:PORT in production; add a Unix domain socket if your build supports it.
Secrets: Inject via an environment file; never leave config world-readable. Rotate by writing the new token first, doing a rolling restart, then removing the old token.
Health: Cover at least process liveness plus either HTTP readiness or model control-plane reachability—otherwise the tunnel can only tell you “unreachable,” not which layer is wrong.
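The write-new-first rotation order can be rehearsed against a scratch env file. The variable names are illustrative (not upstream's), and the restart step is left as a comment:

```shell
#!/bin/sh
# Rehearse token rotation against a throwaway env file; on the host this
# is /etc/openclaw/gateway.env (mode 640). Variable names are illustrative.
ENVFILE="$(mktemp)"
echo 'OPENCLAW_TOKEN=old-secret' > "$ENVFILE"
chmod 640 "$ENVFILE"

# 1) Write the new token first, so both are present during the roll.
echo 'OPENCLAW_TOKEN_NEXT=new-secret' >> "$ENVFILE"

# 2) A rolling restart would happen here:
#    systemctl restart openclaw-gateway

# 3) Only after clients have switched, retire the old token.
sed -i '/^OPENCLAW_TOKEN=old-secret$/d' "$ENVFILE"

grep -c '^OPENCLAW_TOKEN' "$ENVFILE"   # prints 1: only the new token remains
```

The ordering is the point: at no step is the file without a valid token, so a crash mid-rotation never locks clients out.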
```ini
[Unit]
Description=OpenClaw Gateway (loopback)
After=network-online.target
Wants=network-online.target

[Service]
User=openclaw
Group=openclaw
WorkingDirectory=/var/lib/openclaw
EnvironmentFile=-/etc/openclaw/gateway.env
# Replace ExecStart with the real command from upstream docs
ExecStart=/usr/bin/node /opt/openclaw/gateway.mjs --config /etc/openclaw/gateway.yaml
Restart=on-failure
RestartSec=5
LimitNOFILE=1048576
# Logging: prefer journal; for files, pair with logrotate and permissions
StandardOutput=journal
StandardError=journal

[Install]
WantedBy=multi-user.target
```
Suggested order: systemctl daemon-reload → systemctl enable --now openclaw-gateway → systemctl status until the main process is stable → local curl health check → then start cloudflared.
```yaml
tunnel: <YOUR_TUNNEL_ID>
credentials-file: /etc/cloudflared/<YOUR_TUNNEL_ID>.json
ingress:
  - hostname: claw.example.com
    service: http://127.0.0.1:8787
  - service: http_status:404
```
Warning: Tunnel credential JSON must be readable only by root or a dedicated user; never bake credentials into app image layers or public CI logs. If you terminate TLS with self-managed Nginx/Caddy instead of Cloudflare, keep the same pattern: TLS at the edge, upstream loopback only.
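For the self-managed route, a minimal Nginx sketch of that same pattern — TLS at the edge, upstream loopback only. The hostname, port, and certbot-style certificate paths are placeholders; adjust to your install:

```nginx
server {
    listen 443 ssl;
    server_name claw.example.com;

    # Certificate paths are placeholders; use your ACME client's output.
    ssl_certificate     /etc/letsencrypt/live/claw.example.com/fullchain.pem;
    ssl_certificate_key /etc/letsencrypt/live/claw.example.com/privkey.pem;

    location / {
        # Upstream stays loopback-only, matching the tunnel pattern.
        proxy_pass http://127.0.0.1:8787;
        proxy_set_header Host $host;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto https;
    }
}
```

With this in place the firewall opens 443 to Nginx only; the Gateway port itself still never appears in a public `ss -tlnp` listing.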
Log triage order: ① journalctl -u openclaw-gateway -b --no-pager for crash loops and env loading; ② application log directory if any; ③ journalctl -u cloudflared or equivalent tunnel service; ④ curl -v from localhost and from outside to separate process issues from routing issues.
Port errors: For EADDRINUSE, use ss -tlnp to find the holder and rule out zombie processes; after changing ports, sync tunnel ingress and internal probes.
Permission errors: For EACCES, check owner/group on config and key material; the systemd User= must be an identity that can read them.
Tunnel issues: If the public URL returns 502 but localhost curl works, inspect ingress, whether the tunnel process is connected to Cloudflare, and DNS in the correct zone.
Model connectivity: On timeouts or 401 in Gateway logs, curl the vendor baseline API from the server (avoid putting secrets in shell history) to rule out blocked egress IPs and expired tokens.
A common split: Linux VPS hosts the public Gateway, queues, and light orchestration; a stable remote Mac (physical Apple Silicon) runs Xcode builds, simulators, and macOS-only toolchains. The Gateway dispatches heavy work to the Mac layer—this preserves “no public bind on the gateway” without forcing ill-suited workloads onto Linux. Treat the Mac as an orderable execution node and Linux as your self-managed control plane; link them with SSH, a queue, or a private tunnel, but isolate failure domains: gateway outages should not wipe Mac build caches, and Mac maintenance should not take down the whole API front door.
Prefer 127.0.0.1 plus one port in production to avoid accidental dual-stack binds on all interfaces. Restart=on-failure with RestartSec=5 yields more readable logs than tight restart storms; add StartLimitIntervalSec when needed. Probe http://127.0.0.1:PORT rather than localhost, so the name cannot resolve to IPv6 ::1 while the app binds IPv4 only.
A lone Linux VPS plus tunnel fits control planes and API aggregation, but iOS CI/CD, simulators, Metal, and the Xcode stack hit hard walls: either accept headless-remote compatibility trade-offs or spend ongoing effort on images and maintenance. Putting heavy work on a dedicated remote Mac with predictable maintenance windows, while keeping systemd and tunnels on the gateway side, clarifies the blast radius. For teams that need long-running agents and build pipelines, NodeMini cloud Mac Mini rental is usually the better fit—keep the control plane on your VPS and align execution with Apple-native hardware and rental terms.
In production, strongly prefer loopback-only listening and let the tunnel or reverse proxy handle TLS and access policy. If you must expose a port publicly, add firewall allowlists, rate limits, and auditing; that is usually more complex than a tunnel. More OpenClaw posts are in the OpenClaw column.
Curl the loopback URL on the server to confirm the Gateway is ready; then check tunnel ingress and credentials; only then suspect upstream models or DNS. Before ordering execution nodes, review the rental pricing overview for term and region alignment.
Search the help center for SSH, VNC, and common failures; the blog index is on the blog home.