You can clearly see that the Gateway is running and the RPC probe is green, yet messages simply do not arrive in Telegram/Slack, and the Agent looks "disconnected without reporting an error". This article gives operators a channel-layer troubleshooting path. First, seven checklist items dispel the assumption that a healthy control plane implies a healthy message plane. Next, a symptom vs. root cause lookup table narrows the problem down to pairing, dmPolicy, group chat mentions, and Bot permissions. Finally, a six-step minimal recovery runbook is given, along with a clear division of labor with the companion articles on not ready / startup stuck, gateway closed(1000), and cross-platform installation.
OpenClaw separates the control plane, session plane, channel plane, and model backend into different layers; looking only at openclaw gateway status makes it easy to misdiagnose the problem as "the model is broken". Use the following seven items as a self-check before an incident review, so the team does not spin its wheels across three kinds of logs.
Treating RPC OK as message link OK: RPC mostly verifies that the local control plane is reachable; DM/group routing also depends on pairing, webhook reachability, and policy matches.
Ignoring Bot-side permission changes: After a channel administrator changes permissions, removes the Bot from a group, or rotates the Token, the Gateway may still show running.
Copy-pasting an overly strict dmPolicy: If the allowlist mistakenly matches an empty set or an old workspace, you get "healthy but everything rejected"; read it side by side with the Security hardening article.
Group chat messages that do not pass the mention gate: When the group policy requires an @Bot mention before responding, a user saying "I sent it" does not mean the message met the gating condition.
Treating MCP toolchain issues as channel issues: Unresponsive tools look similar to messages not arriving; rule out MCP connectivity first, then come back to the channel.
Checking only the configuration, not the pairing status, after an upgrade: Stricter auth defaults in a new version may leave pairing "half-invalid"; rerun pairing as described in the official FAQ.
Multi-Gateway / multi-profile drift: When systemd and the CLI read different openclaw.json files, "the green you checked is not the instance your users are connected to".
The common root cause of these pitfalls is confusing reachability with deliverability: the former answers "are the process and port alive", while the latter answers "is this DM allowed by policy, does it enter a session, and is it consumed by the model backend". Once both are recorded in your tracking sheet, use the table below to pin each symptom to a layer.
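The multi-profile drift pitfall above can be checked mechanically by fingerprinting the config file that each launcher actually reads. The following is a minimal Python sketch, assuming you already know both file paths; the function names are illustrative and nothing here is part of OpenClaw itself.

```python
import hashlib
from pathlib import Path


def config_fingerprint(path: Path) -> str:
    """Return the SHA-256 hex digest of a config file's bytes."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def find_drift(paths: list[Path]) -> dict[str, list[Path]]:
    """Group config paths by content digest; more than one group means drift."""
    groups: dict[str, list[Path]] = {}
    for path in paths:
        groups.setdefault(config_fingerprint(path), []).append(path)
    return groups


def has_drift(paths: list[Path]) -> bool:
    """True when the given files do not all have identical content."""
    return len(find_drift(paths)) > 1
```

Pass in the openclaw.json path used by the systemd unit and the one used by your interactive CLI profile. If has_drift returns True, the instance you inspected may not be the one serving users.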
If you already maintain two runbooks for not ready and closed(1000), treat this article as the third volume: when messages still do not arrive after ruling out "startup and session" problems, come back to channel probes and policies.
Link with Gateway security hardening: tightening dmPolicy significantly changes the message entry point, so every such change must ship with canary and rollback instructions.
There is no silver bullet: first answer which layer the message is stuck on, then decide whether to change configuration or permissions. During the review, write down three SLAs explicitly: inbound message latency, failure explainability, and rollback time for policy changes.
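The rollback-time SLA is easier to meet when backup and restore are scripted rather than done by hand. The following is a minimal Python sketch, assuming the policy lives in a single openclaw.json file whose path you supply; the helper names are hypothetical and this is not a built-in OpenClaw feature.

```python
import shutil
import time
from pathlib import Path


def backup_config(config: Path) -> Path:
    """Copy the config next to itself with a timestamp suffix; return the copy."""
    stamp = time.strftime("%Y%m%d-%H%M%S")
    backup = config.with_name(f"{config.name}.{stamp}.bak")
    shutil.copy2(config, backup)  # copy2 also preserves mtime and permission bits
    return backup


def rollback_config(config: Path, backup: Path) -> None:
    """Overwrite the live config with a backup taken by backup_config."""
    shutil.copy2(backup, config)
```

Two backups taken within the same second would share a name, so treat this as a per-change helper rather than a versioning system. After a rollback, the Gateway will most likely need a restart before the restored policy takes effect.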
| What you see | More likely root cause layer | Preferred verification |
|---|---|---|
| Gateway not ready / startup timeout | Startup and health check layer | Read the not ready troubleshooting article; check the port, memory, and compose startup order |
| RPC is green but tools misbehave / closed(1000) | Session, scope, Token, model backend | Read the closed(1000) article; cross-check openclaw status against openclaw doctor |
| channels probe fails or channel disconnected | Channel connection and credentials | openclaw channels status --probe; check the Bot Token and webhook reachability |
| Probes are all green but still nothing inbound | Policy: dmPolicy / group chat gating / mention | Cross-check the Security hardening article against the minimal reproduction experiment in Section 4 of this article |
| Message arrives but the Agent does not reply | Model-side quota, CLI-only, downstream timeout | openclaw models status; read together with the model routing article |
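The table above can also live in the runbook as data, so that an on-call script or chat bot can answer "which layer first?" consistently. The following is a minimal Python sketch; the symptom keys are labels invented here for the table rows, not strings that OpenClaw emits.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Diagnosis:
    layer: str
    first_check: str


# Keys are illustrative labels for the table rows, not OpenClaw output strings.
TRIAGE = {
    "gateway_not_ready": Diagnosis(
        "startup and health check", "port, memory, compose startup order"),
    "rpc_green_tools_abnormal": Diagnosis(
        "session, scope, token, model backend", "openclaw status and openclaw doctor"),
    "channel_probe_failed": Diagnosis(
        "channel connection and credentials", "openclaw channels status --probe"),
    "probe_green_no_inbound": Diagnosis(
        "policy: dmPolicy / group gating / mention", "minimal reproduction experiment"),
    "inbound_but_no_reply": Diagnosis(
        "model-side quota or downstream timeout", "openclaw models status"),
}


def triage(symptom: str) -> Diagnosis:
    """Map a symptom label to the layer to inspect first."""
    try:
        return TRIAGE[symptom]
    except KeyError:
        raise ValueError(f"unknown symptom: {symptom!r}") from None
```

Raising on unknown labels is deliberate: a symptom that does not fit any row should prompt a new row in the table rather than a silent default.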
"Gateway healthy" only means the control plane is alive; what you actually care about is message deliverability: pairing, policy, and channel API capabilities must all appear on the same acceptance checklist.
If you run the Gateway on a Linux VPS and put the heavy toolchain on a dedicated remote Mac node, write "message entry" and "tool execution" into two separate on-call runbooks: the former covers channels and policies, the latter covers SSH and resource headroom.
Link with the OpenClaw category list: the articles on installation, Docker, systemd, observability, and security should share a common context, so that not every article has to start from scratch explaining what the Gateway is.
The following sequence emphasizes "global snapshot first, then channel probes, then policy and pairing, and major surgery last". It is consistent with the "First 60 seconds" of the official FAQ but fills in the common blind spots around group chats and dmPolicy.
Run the overview: openclaw status; confirm that the OS, updates, Gateway reachability, agents/sessions, and provider hints show no blocking items.
Run the channel probe: openclaw channels status --probe; clear disconnected / auth errors first.
List pairings: openclaw pairing list --channel telegram (substitute your actual channel); handle pending/expired entries.
Compare policies: check that dmPolicy, group chat gating, and mention rules match the on-call documentation; back up openclaw.json before changing anything.
Restart the Gateway and recheck: run openclaw gateway restart, then repeat the first two steps; if the problem persists, run openclaw doctor.
If it still fails, collect a minimal information packet: version, the relevant configuration fragments, and 50 lines of logs before and after the failure (with tokens redacted), to make second-line troubleshooting or community help easier.
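The last step asks for a log window with tokens masked, which is easy to get wrong by hand. The following is a minimal Python sketch, assuming the logs are already loaded as a list of lines; the regular expressions only approximate the shape of Telegram and Slack credentials and should be adjusted to whatever secrets your channels actually use.

```python
import re

# Approximate token shapes (assumptions); extend for other credentials.
TOKEN_PATTERNS = [
    re.compile(r"\d{6,}:[A-Za-z0-9_-]{30,}"),   # Telegram-style bot token
    re.compile(r"xox[a-z]-[A-Za-z0-9-]{10,}"),  # Slack-style token
]


def redact(line: str) -> str:
    """Replace anything that looks like a token with a placeholder."""
    for pattern in TOKEN_PATTERNS:
        line = pattern.sub("[REDACTED]", line)
    return line


def log_window(lines: list[str], index: int, radius: int = 50) -> list[str]:
    """Return redacted lines from `radius` before to `radius` after `index`."""
    start = max(0, index - radius)
    return [redact(line) for line in lines[start:index + radius + 1]]
```

Here index is the position of the line where the failure shows up. Review the output before sharing it anyway, since pattern-based masking can miss secrets in unexpected formats.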
    openclaw status
    openclaw gateway status
    openclaw channels status --probe
    openclaw pairing list --channel telegram
    openclaw logs --follow
    openclaw doctor
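To capture the same snapshot in one pass, for example to attach to a ticket, the commands above can be driven from a small script. The following is a minimal Python sketch that only uses the commands listed in this article. openclaw logs --follow is left out because it streams and does not exit on its own, and how the CLI uses exit codes is an assumption here, so read the captured output rather than relying on the code alone.

```python
import subprocess
from typing import Callable

# Commands copied from the list above; "telegram" is the article's example channel.
SNAPSHOT_COMMANDS = [
    ["openclaw", "status"],
    ["openclaw", "gateway", "status"],
    ["openclaw", "channels", "status", "--probe"],
    ["openclaw", "pairing", "list", "--channel", "telegram"],
    ["openclaw", "doctor"],
]

Runner = Callable[[list[str]], tuple[int, str]]


def run_cli(argv: list[str]) -> tuple[int, str]:
    """Run one command and return (exit code, combined stdout and stderr)."""
    proc = subprocess.run(argv, capture_output=True, text=True, timeout=60)
    return proc.returncode, proc.stdout + proc.stderr


def collect_snapshot(runner: Runner = run_cli) -> list[dict]:
    """Run every snapshot command in order, continuing past failures."""
    results = []
    for argv in SNAPSHOT_COMMANDS:
        code, output = runner(argv)
        results.append(
            {"command": " ".join(argv), "exit_code": code, "output": output})
    return results
```

The runner parameter exists so the collection logic can be exercised without the CLI installed. With the default runner, a missing openclaw binary raises FileNotFoundError, which is itself a useful signal that the shell is pointed at the wrong environment.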
Hint: If you have just changed gateway.bind or the reverse proxy path, also check the loopback and token combination described in the Security hardening article, to avoid the situation where "the control plane looks reachable but the webhook is actually unreachable".
If "channels all green but messages still not arriving" appears after an upgrade, first suspect pairing drift and tightened auth defaults: rerun openclaw gateway install --force and openclaw doctor as described in the official upgrade notes, instead of rewriting the business prompt first.
Link with the closed(1000) article: if close frames appear in the logs, first go back to the session layer and align scope/Token, then return to the channel policies in this article; otherwise you will keep changing the wrong layer.
The most common kind of production incident is an overly strict policy combined with out-of-date on-call documentation: Gateway logs are quiet and metrics are quiet, yet the business side feels the Agent is "playing dead". This section gives a minimal reproduction experiment: first use a private chat to verify that the Bot still works, then switch back to the group chat and relax the gates one at a time.
For channels such as Telegram/Slack, check three things first: whether the Bot is still in the group, whether an @mention is required, and whether topics/threads are restricted. Then align the dmPolicy allowlist with the workspace path, following the minimum-exposure principle from the Security hardening article.
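To reason about which gate rejects a message, it helps to model the checks as an ordered list of conditions. The following is a toy Python sketch: the Policy fields are hypothetical and do not mirror OpenClaw's real config schema or evaluation order, and topic/thread limits are omitted for brevity.

```python
from dataclasses import dataclass, field


@dataclass
class Policy:
    # Hypothetical fields for illustration; not OpenClaw's real config schema.
    dm_allowlist: set[str] = field(default_factory=set)
    require_mention: bool = True
    bot_in_group: bool = True


@dataclass
class Message:
    sender: str
    is_group: bool
    mentions_bot: bool = False


def delivery_verdict(msg: Message, policy: Policy) -> str:
    """Return "deliver" or the first gate that rejects the message."""
    if msg.is_group:
        if not policy.bot_in_group:
            return "rejected: bot not in group"
        if policy.require_mention and not msg.mentions_bot:
            return "rejected: mention required"
        return "deliver"
    if msg.sender not in policy.dm_allowlist:
        return "rejected: sender not in DM allowlist"
    return "deliver"
```

Note that a default Policy() has an empty allowlist and therefore rejects every DM, which is the "healthy but everything rejected" case: no gate is broken, yet nothing gets through.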
Note: Do not simply "open up dmPolicy" during peak hours to troubleshoot. First enable a read-only audit and a shadow channel (a small group of test accounts or a test Bot), confirm which rules are hit, and only then converge on the production policy.
If the Gateway runs on a VPS and tools execute on a dedicated remote Mac node, review the "message entry policy" and the "SSH execution policy" separately: errors in the former make users think the whole site is down, while errors in the latter only cause some tools to fail.
The following items are for internal alignment; specific thresholds will depend on the size of your channel and compliance requirements.
Changes to dmPolicy or group gating should reserve time for at least one complete pairing regression and a rollback backup, to avoid "changing one line and going completely silent" during the evening peak.

A Gateway running on a purely local machine or a temporary VPS often cycles through sleep, port drift, certificate problems, and IM platform rate limiting. Teams that need xcodebuild, CLI Agents, or resident toolchains to run stably are better served by putting compute and desktop state on a dedicated, always-online remote Mac node and leaving the Gateway on a Linux machine focused on messaging and orchestration. Compared with maintaining scattered machines yourself or cobbling macOS together in an unstable virtualization environment, NodeMini's Mac Mini Cloud Rental offers fixed SSH, clear disk tiers, and replicable node images, which makes it easier to decouple "tool execution" from "message entry". To compare specifications and prices, read the Rental Price Description first and plan nodes with the Help Center.
When rolling this out, bind this runbook to your internal "channel change levels": Bot permissions, group policies, Gateway versions, and model backend switches should each use a different approval process and canary scope.
RPC mostly verifies that the control plane is reachable; whether messages enter a session also depends on channel pairing, Bot permissions, dmPolicy, and group chat gating. Follow the sequence in Section 3 of this article and run channels status --probe and pairing list. For node and network suggestions, see the Help Center.
The closed(1000) article focuses on session scope, tokens, and tool exceptions after an upgrade; this article focuses on message flow at the channel layer. If close frames appear frequently in the logs, read the two articles in sequence: first rule out sessions and session backends, then return to pairing and policies. Related OpenClaw articles can be found through the Blog OpenClaw filter.
A common topology keeps the Gateway on a Linux VPS and uses a remote Mac as a dedicated node for CLI/build work and heavy dependencies; the key is the SSH and directory contract, rather than tying IM and compute to the same machine. To plan capacity, first compare the Rental Price Description and the Runner access chapter.