If your Claude / GPT / Cursor setup can chat but cannot query databases, read files, or call APIs — you do not need a longer prompt. You need a reusable tool layer. This guide is for backend and full-stack developers. It walks from Hello World to a ChromaDB knowledge-base MCP Server, covering Tools, Resources, and Prompts, stdio and HTTP+SSE transport, debugging, testing, and Docker production deployment. When you finish, you will have custom tools callable from Cursor and a clear path to running the Server on a dedicated remote Mac 24/7 (for protocol background, start with our MCP protocol guide).
Large models have training cutoffs and cannot reach your CRM, Git repos, or internal APIs. Before 2024, you wrote Function Calling for Claude, Plugins for GPT, and a different format for Cursor — switch models, start over. An MCP Server wraps tool capabilities in an independent process: write once, use across Claude Desktop, Cursor, and Gemini.
Typical scenarios: query Postgres sales data from Claude Desktop; have a Cursor Agent read project docs and edit code; call your ticketing system over HTTP MCP from GPT — all backed by the same Server.
What this guide delivers: not theory alone, but a path from say_hello to a production-grade knowledge-base Server with vector search. Target reader: developers with Python or TypeScript basics who want to extend AI in their IDE or Desktop app.
Vendor lock-in: OpenAI Function Calling and Claude Tool Use use different formats — every vendor switch means rewriting the adapter layer.
Tools cannot be discovered: REST APIs rely on static docs; AI cannot call tools/list at runtime to discover capabilities on its own.
IDE silos: Cursor, VS Code extensions, and JetBrains plugins each define tools differently — you maintain N×M integrations.
Context and data split: LLMs cannot reliably read config files, user preferences, or live logs — you need standardized read-only Resources.
Scattered prompt templates: code review and incident report templates have no unified registry; teams copy-paste independently.
Local vs remote deployment chaos: stdio subprocesses suit development, but production HTTP gateways, auth, and monitoring lack a shared pattern (see our stdio subprocess management guide).
"Connecting an MCP Server to AI is like installing IDE plugins for a programmer — capability jumps from chat to operating on the real world."
Evolution path: Function Calling (2023) → ChatGPT Plugins → MCP (Nov 2024, Anthropic open source). Anthropic designed MCP as the "USB-C" between AI and the outside world: the Host (Cursor / Claude Desktop) embeds an MCP Client and opens a 1:1 session with your MCP Server.
Communication uses JSON-RPC 2.0: initialize → tools/list / tools/call → resources/read. Two transport lifecycles:
Full spec at modelcontextprotocol.io.
| Dimension | MCP | OpenAI Function Calling | LangChain Tools |
|---|---|---|---|
| Openness | Cross-vendor open protocol, AAIF governance | Tied to OpenAI API | Framework abstraction, not a transport standard |
| Discovery | Runtime tools/list | Inline functions array in request | Code registration, no standard discovery |
| Read-only data | Resources + URI scheme | No first-class equivalent | Retriever concept, not protocol-level |
| Prompt templates | Prompts standard interface | None | PromptTemplate class |
| Transport | stdio / HTTP+SSE / Streamable HTTP | HTTPS API bundled | Depends on Agent runtime |
| Reusability | One Server serves Cursor + Claude + Gemini | OpenAI ecosystem only | Cross-framework rewrites required |
Two main paths: Python mcp + FastMCP (data/script friendly) and TypeScript @modelcontextprotocol/sdk (Web/API integration, type safety). SDK repos: python-sdk, typescript-sdk.
# Python 路线 python -m venv .venv && source .venv/bin/activate pip install "mcp[cli]" httpx pydantic # TypeScript 路线 npm init -y && npm install @modelcontextprotocol/sdk zod npm install -D typescript tsx @types/node
my-mcp-server/ ├── pyproject.toml # 或 package.json ├── src/ │ ├── server.py # FastMCP 入口 │ ├── tools/ # 各工具模块 │ ├── resources/ # Resource 提供者 │ └── prompts/ # Prompt 模板 ├── tests/ │ └── test_tools.py # pytest + ClientSession ├── Dockerfile └── README.md
Pick a stack: data/ML teams prefer Python; Node full-stack teams prefer TypeScript.
Create a virtual env and lock dependencies: pip freeze or package-lock.json to prevent schema drift.
Install MCP Inspector: npx @modelcontextprotocol/inspector for visual JSON-RPC debugging.
Configure Claude Desktop: edit ~/Library/Application Support/Claude/claude_desktop_config.json and add Server command/args.
Configure Cursor: Settings → MCP → Add Server; for stdio use python -m src.server or an absolute path.
Verify Inspector connectivity: start Server → connect Inspector → confirm tools/list returns a non-empty list.
// Cursor / Claude Desktop MCP 配置示例
{
"mcpServers": {
"my-tools": {
"command": "python",
"args": ["-m", "src.server"],
"env": { "API_KEY": "your-key" }
}
}
}
Use FastMCP to build a say_hello tool and validate the full chain: code → Inspector → Cursor.
# src/server.py
from mcp.server.fastmcp import FastMCP
mcp = FastMCP("hello-server")
@mcp.tool()
def say_hello(name: str = "World") -> str:
"""向指定对象打招呼"""
return f"Hello, {name}! MCP is working 🎉"
if __name__ == "__main__":
mcp.run() # 默认 stdio 传输
# 用 Inspector 调试 npx @modelcontextprotocol/inspector python -m src.server # 或直接 stdio 启动 python -m src.server
After adding the same command in Cursor, ask the Agent to "use say_hello to greet NodeMini." If you get a JSON result, Client ↔ Server handshake succeeded.
Tip: FastMCP auto-generates JSON Schema from function docstrings and type annotations — no hand-written parameter descriptions needed.
Tools are MCP's core capability: AI executes side-effect operations via tools/call. Each Tool exposes a name, description, and inputSchema; FastMCP uses Pydantic models for validation.
from pydantic import BaseModel, Field
class SearchInput(BaseModel):
query: str = Field(..., description="搜索关键词")
limit: int = Field(10, ge=1, le=100, description="返回条数")
@mcp.tool()
async def search_docs(params: SearchInput) -> str:
"""在文档库中搜索"""
results = await index.search(params.query, params.limit)
return json.dumps(results, ensure_ascii=False)
ast.literal_eval; never exec.import httpx
from datetime import datetime, timezone
@mcp.tool()
async def fetch_url(url: str) -> str:
"""HTTP GET 获取 URL 内容(域名白名单内)"""
allowed = ("api.github.com", "nodemini.com")
if not any(url.startswith(f"https://{d}") for d in allowed):
raise ValueError(f"Domain not allowed: {url}")
async with httpx.AsyncClient(timeout=10.0) as client:
resp = await client.get(url)
resp.raise_for_status()
return resp.text[:8000]
@mcp.tool()
def get_current_time() -> str:
"""返回当前 UTC 时间"""
return datetime.now(timezone.utc).isoformat()
raise ValueError("human-readable msg"); the Client passes the message back to the LLM.retryable; fail fast on permission denied.idempotency_key to prevent duplicate Agent calls from corrupting data.Tool vs Resource: Tools have side effects and are invoked by AI; Resources are read-only context that the Host can inject before a conversation or that AI pulls via resources/read. URI schemes are custom — e.g. config://, user://, file://.
@mcp.resource("config://app/settings")
def app_settings() -> str:
"""静态应用配置(text/plain)"""
return open("config/settings.json").read()
@mcp.resource("user://{user_id}/profile")
def user_profile(user_id: str) -> str:
"""动态用户 profile(application/json)"""
return json.dumps(get_user(user_id))
| MIME Type | Use Case | Example |
|---|---|---|
| text/plain | Logs, README | file://logs/app.log |
| application/json | Config, API responses | config://env |
| application/octet-stream | Binary (base64) | PDF summary |
| text/event-stream | Live subscription | Log tail, metrics stream |
Filesystem Resource Server pattern: implement resources/list to scan directories, resources/read to fetch by URI, and resources/subscribe to watch file changes via watchfiles and push updates — ideal for exposing codebase docs to a Cursor Agent.
An MCP Prompt is a conversation skeleton registered on the Server. The Client fetches a message list via prompts/get, with user / assistant roles and parameter placeholders — teams share code review and incident workflows without maintaining per-user prompt files.
from mcp.types import PromptMessage, TextContent
@mcp.prompt()
def code_review_prompt(language: str = "python") -> list[PromptMessage]:
"""标准化 Code Review 多轮模板"""
return [
PromptMessage(role="user", content=TextContent(
type="text",
text=f"你是资深 {language} 工程师。请按安全、性能、可读性三维审查以下 diff。"
)),
PromptMessage(role="assistant", content=TextContent(
type="text",
text="请粘贴 diff 或指定 PR 编号,我将按 CHECKLIST 输出结构化 review。"
)),
]
Multi-turn templates can nest variables ({ticket_id}, {severity}). The Server owns version management; when the Client upgrades the Server, the whole team gets the latest review standards.
| Dimension | stdio | HTTP + SSE / Streamable HTTP |
|---|---|---|
| Deployment | Local subprocess, Host-spawned | Standalone service, URL connection |
| Scaling | Single machine, hard to scale out | Load balancing, multiple replicas |
| Auth | Relies on Host env vars | Bearer Token / API Key / mTLS |
| Debugging | Inspector direct connect | curl + SSE client |
| Best for | Personal dev, local Cursor | Team sharing, SaaS integration |
# Streamable HTTP 模式(FastMCP 2026)
from mcp.server.fastmcp import FastMCP
mcp = FastMCP("prod-server", host="0.0.0.0", port=8080)
# ... 注册 tools ...
if __name__ == "__main__":
mcp.run(transport="streamable-http")
Production must-haves: Bearer Token validation middleware, CORS whitelist (allow only your Host domains), rate limit (e.g. 100 req/min/IP), HTTPS terminated at a reverse proxy. Remote gateway ops details in our HTTP gateway management guide.
Warning: never expose HTTP MCP to the public internet without auth — in 2026, many Servers are still found unauthenticated. Always add auth and IP restrictions.
MCP Inspector is the official visual debugger: connect via stdio or URL, manually send tools/list and tools/call, inspect JSON-RPC round trips — far faster than guessing from Cursor logs.
# tests/test_tools.py
import pytest
from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client
@pytest.mark.asyncio
async def test_say_hello():
params = StdioServerParameters(command="python", args=["-m", "src.server"])
async with stdio_client(params) as (read, write):
async with ClientSession(read, write) as session:
await session.initialize()
result = await session.call_tool("say_hello", {"name": "MCP"})
assert "MCP" in result.content[0].text
| Symptom | Cause | Fix |
|---|---|---|
| Server exits immediately after start | stdout polluted by print | Log to stderr; never print to stdout |
| tools/list is empty | Decorator not registered or wrong import order | Ensure @mcp.tool() runs before run() |
| Cursor shows disconnected | Wrong command path or venv not active | Use absolute paths; put full python path in config |
| JSON-RPC parse error | Non-JSON output on stdio | Disable debug banners; set library log level to WARNING+ |
FROM python:3.12-slim WORKDIR /app COPY pyproject.toml . RUN pip install --no-cache-dir . COPY src/ src/ EXPOSE 8080 HEALTHCHECK CMD curl -f http://localhost:8080/health || exit 1 CMD ["python", "-m", "src.server", "--transport", "streamable-http"]
Platform choices: Railway / Render for quick validation; AWS ECS / GCP Cloud Run for enterprise compliance; VPS + Docker Compose lowest cost but self-managed patching.
protocolVersion at initialize — Server should declare compatible range./metrics (tool call QPS, P99 latency); Sentry for unhandled exceptions; /health for K8s liveness.Vectorize internal Wiki / Markdown docs and expose index_document, search_knowledge, and write_note tools so a Cursor Agent can "search the company knowledge base before writing code."
Requirements: incremental indexing (watchfiles on docs/), semantic search Top-K, optional scratchpad writes.
import chromadb
from chromadb.utils.embedding_functions import SentenceTransformerEmbeddingFunction
client = chromadb.PersistentClient(path="./data/chroma")
collection = client.get_or_create_collection(
"wiki", embedding_function=SentenceTransformerEmbeddingFunction()
)
@mcp.tool()
def search_knowledge(query: str, top_k: int = 5) -> str:
"""语义搜索内部知识库"""
hits = collection.query(query_texts=[query], n_results=top_k)
return json.dumps(hits["documents"][0], ensure_ascii=False)
@mcp.tool()
def index_document(path: str) -> str:
"""索引单个 Markdown 文件"""
text = open(path).read()
collection.upsert(ids=[path], documents=[text], metadatas=[{"path": path}])
return f"Indexed: {path}"
Cursor demo query: "Search the knowledge base for MCP HTTP deployment docs, then summarize a three-step launch checklist" — the Agent calls search_knowledge first, then answers from retrieved context. Swap the vector store for Qdrant (remote gRPC) at larger scale.
Official and community Servers are ready to use — no need to rebuild everything:
2026 trends: MCP Marketplaces emerging, OAuth 2.1 tool authorization on the spec roadmap, Streamable HTTP gradually replacing pure SSE. Learning path checklist: ① read spec → ② Hello World → ③ write 3 Tools → ④ add Resource → ⑤ pytest → ⑥ Docker deploy → ⑦ connect Cursor.
From say_hello to a ChromaDB knowledge base, you now have full-stack MCP Server skills: Tools execution, Resources context, Prompts templates, dual transport, testing, and production ops. Next step: fork a community Server or wrap your company APIs as a team-standard tool layer.
Local stdio suits personal experiments, but multiple parallel Servers, persistent vector indexes, and HTTP long connections will push a 16GB laptop into frequent swap. Cheap Linux VPS hosts struggle with macOS-only toolchains. Self-built HTTP gateways without session affinity and auth often suffer connection leaks and unauthorized exposure — long-term stability rarely matches expectations.
For teams treating MCP as production infrastructure while running Cursor Agents and iOS/macOS CI, hosting MCP Servers on a dedicated cloud Mac 24/7 is usually more predictable than betting on a local laptop or generic VM. NodeMini Mac Mini cloud rental works as the MCP + Agent execution layer: SSH nodes and Server config stay unchanged when you swap underlying LLMs. See rental pricing for specs and Help Center for onboarding.
Python FastMCP is the fastest path for data and script tools; TypeScript SDK offers type safety and seamless Node integration. Both are fully protocol-compatible. To run multiple Servers 24/7, see rental pricing for remote Mac configurations.
Function Calling is tied to OpenAI; MCP is an open protocol across Claude, GPT, Gemini, and Cursor with Resources and Prompts. Background in our MCP protocol guide.
Lightweight stdio runs locally; multiple Servers + vector store + HTTP long connections benefit from a dedicated remote Mac. Onboarding steps in the Help Center; ops details in our stdio subprocess management guide.