Skip to content

feat(server): add host process watchdog to StdioServerTransport#1712

Open
travisbreaks wants to merge 2 commits intomodelcontextprotocol:mainfrom
travisbreaks:feat/server-orphan-detection
Open

feat(server): add host process watchdog to StdioServerTransport#1712
travisbreaks wants to merge 2 commits intomodelcontextprotocol:mainfrom
travisbreaks:feat/server-orphan-detection

Conversation

@travisbreaks
Copy link
Copy Markdown

@travisbreaks travisbreaks commented Mar 19, 2026

Summary

  • Adds clientProcessId option to StdioServerTransport that enables a watchdog timer
  • Watchdog periodically checks if the host process is alive using signal 0 (process.kill(pid, 0))
  • Self-terminates the transport when the host is gone, preventing orphaned server processes
  • Configurable check interval via watchdogIntervalMs (default: 3s)
  • New StdioServerTransportOptions interface with backwards-compatible constructor overloads
  • Timer uses unref() so it does not prevent clean process exit

Follows the same pattern used in vscode-languageserver-node as suggested in the issue.

Closes #208

Test plan

  • should close transport when host process is gone: watchdog detects dead PID and closes
  • should not close when host process is alive: no false positives with own PID
  • should stop watchdog on close: cleanup on transport close
  • should accept options object constructor: new constructor form works
  • All existing stdio tests continue to pass (7 total)
  • Build succeeds
  • Lint passes

Changeset

@modelcontextprotocol/server: minor (new opt-in feature, no breaking changes)

Co-Authored-By: Claude Opus 4.6 (1M context) tadao@travisfixes.com

@travisbreaks travisbreaks requested a review from a team as a code owner March 19, 2026 16:12
@changeset-bot
Copy link
Copy Markdown

changeset-bot Bot commented Mar 19, 2026

🦋 Changeset detected

Latest commit: 513efd2

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 5 packages
Name Type
@modelcontextprotocol/server Patch
@modelcontextprotocol/express Patch
@modelcontextprotocol/fastify Patch
@modelcontextprotocol/hono Patch
@modelcontextprotocol/node Patch

Not sure what this means? Click here to learn what changesets are.

Click here if you're a maintainer who wants to add another changeset to this PR

@pkg-pr-new
Copy link
Copy Markdown

pkg-pr-new Bot commented Mar 19, 2026

Open in StackBlitz

@modelcontextprotocol/client

npm i https://pkg.pr.new/@modelcontextprotocol/client@1712

@modelcontextprotocol/server

npm i https://pkg.pr.new/@modelcontextprotocol/server@1712

@modelcontextprotocol/express

npm i https://pkg.pr.new/@modelcontextprotocol/express@1712

@modelcontextprotocol/fastify

npm i https://pkg.pr.new/@modelcontextprotocol/fastify@1712

@modelcontextprotocol/hono

npm i https://pkg.pr.new/@modelcontextprotocol/hono@1712

@modelcontextprotocol/node

npm i https://pkg.pr.new/@modelcontextprotocol/node@1712

commit: 513efd2

Copy link
Copy Markdown
Contributor

@felixweinberger felixweinberger left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. Please pull in a few things before merge (cherry-picks from #1815, credit @windro-xdd):

  1. Rebase onto main. Stale base conflicts with the _closed guard and _onstdouterror additions.
  2. EPERM handling. In the watchdog catch, treat err.code === 'EPERM' as still-alive.
  3. Workerd shim. Add a kill(): never stub to packages/server/src/shimsWorkerd.ts.
  4. Export StdioServerTransportOptions from packages/server/src/index.ts.

@travisbreaks travisbreaks force-pushed the feat/server-orphan-detection branch from e253410 to 7d89648 Compare April 2, 2026 05:36
@felixweinberger
Copy link
Copy Markdown
Contributor

@claude review

Comment thread packages/server/test/server/stdio.test.ts
Copy link
Copy Markdown
Contributor

@felixweinberger felixweinberger left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thinking through this more deeply - while this is technically correct and would work, I wonder what the actual mechanism for using this would be without a change to the MCP spec.

By design this requires client and server to somehow agree how to exchange the processId. Because the spec doesn't have a standard method for it (LSP does, which is what the issue refers to as an example), it implicitly means client and server both need to be controlled by the developer 🤔 - it could still be useful for someone that does this but does raise the question of whether that justifies introducing new APIs for a transport that's probably less used than SHTTP. Without a lot of people calling for this I'd err on the side of not expanding the API surface unless we're sure people want this and run into this problem.

I also had a look at other SDKs and none currently do anything like this watchdog setup (not necessarily making this wrong, but a data point for demand).

However, the close problem does seem real, and there's something the other SDKs do that TS does not - that's handling stdin-EOF better.

I think we could probably get a lot of mileage if we just do something like this alongside the data/error listeners:

  1. Handler field (near _ondata):
_onend = () => void this.close().catch(err => this.onerror?.(err));
  1. In start():
this._started = true;
this._stdin.on('data', this._ondata);
this._stdin.on('error', this._onerror);
this._stdin.on('end', this._onend);          // ← add
this._stdout.on('error', this._onstdouterror);
  1. In close() (near line 86, alongside the other .off() calls):
this._stdin.off('data', this._ondata);
this._stdin.off('error', this._onerror);
this._stdin.off('end', this._onend);         // ← add

We currently don't listen for the end event but the other SDKs all do, so we'd close that gap here.

Would you be open to simplifying to this here and instead of testing the Watchdog setup testing process cleanup on an end event?

@travisbreaks
Copy link
Copy Markdown
Author

Thanks for the direction here. The stdin EOF approach is cleaner and I'm happy to rewrite to match what the other SDKs do.

One question before I go: does stdin end reliably fire when the host process is kill -9'd? The watchdog with process.kill(pid, 0) explicitly handles that case (OS-level process existence check), but if the pipe teardown on SIGKILL is reliable across platforms, the simpler approach covers everything.

Either way, I'll rewrite to stdin EOF. Just want to understand if there's an edge case worth documenting or handling as a follow-up.

@travisbreaks travisbreaks force-pushed the feat/server-orphan-detection branch 2 times, most recently from 6abe472 to 1b36001 Compare April 7, 2026 06:53
@travisbreaks
Copy link
Copy Markdown
Author

Rebased on main. The stdin EOF approach is fully implemented per our discussion: end event listener on stdin triggers close(), with cleanup in close() to prevent double-close. Three new tests cover the behavior.

Ready for re-review when you have a chance, @felixweinberger.

Listen for the stdin `end` event to detect when the host process exits
or is killed, and close the transport accordingly. This prevents
orphaned server processes when the host terminates without cleanly
shutting down the server.

Implements the simpler stdin EOF pattern used by the Python and Kotlin
SDKs, as suggested in review.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@KKonstantinov
Copy link
Copy Markdown
Contributor

@claude review

Comment on lines +1 to +5
---
'@modelcontextprotocol/server': patch
---

Close StdioServerTransport when stdin reaches EOF. When the host process exits or is killed, the stdin pipe closes and the transport now detects this via the `end` event, preventing orphaned server processes. Aligns with the behavior of the Python and Kotlin SDKs.
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 The PR title ("add host process watchdog") and description (clientProcessId, watchdogIntervalMs, process.kill(pid, 0) polling, StdioServerTransportOptions, constructor overloads, unref() timer, and the listed test plan) still describe the original implementation that was replaced on 2026-04-07 with the stdin-EOF approach — none of those exist in the shipped diff. Please update the PR title/body before merge so the squash-commit message matches what actually shipped; the changeset filename server-host-watchdog.md is also a stale leftover (cosmetic, since changeset files are deleted on release).

Extended reasoning...

What's wrong

The PR metadata describes an implementation that no longer exists. The title is "feat(server): add host process watchdog to StdioServerTransport" and the description documents:

  • a clientProcessId option
  • a watchdogIntervalMs option (default 3s)
  • polling via process.kill(pid, 0)
  • a new StdioServerTransportOptions interface
  • backwards-compatible constructor overloads
  • a timer with .unref()
  • a test plan listing should close transport when host process is gone, should not close when host process is alive, should stop watchdog on close, should accept options object constructor

Per the timeline (author comments on 2026-04-05 and 2026-04-07), this was completely rewritten to the stdin-EOF approach. The shipped diff contains none of the above.

What actually shipped

packages/server/src/server/stdio.ts adds exactly three lines of behavior:

_onend = () => void this.close().catch(error => this.onerror?.(error));
// ...
this._stdin.on('end', this._onend);   // in start()
this._stdin.off('end', this._onend);  // in close()

No options, no interface, no constructor overload, no timer, no PID polling. The tests are should close transport when stdin emits end, should not fire onclose twice when close() is called after stdin end, and should remove end listener on close — none of the four tests listed in the PR description exist.

Step-by-step proof

  1. PR description, bullet 1: "Adds clientProcessId option" → grep clientProcessId packages/server/src/server/stdio.ts → no matches.
  2. PR description, bullet 4: "Configurable check interval via watchdogIntervalMs" → grep watchdogIntervalMs → no matches.
  3. PR description, bullet 5: "New StdioServerTransportOptions interface" → grep StdioServerTransportOptions → no matches. (The earlier bot review flagging an unused import type { StdioServerTransportOptions } is also obsolete — that import no longer exists in the test file.)
  4. PR description, bullet 6: "Timer uses unref()" → grep unref → no matches.
  5. Test plan item 1: should close transport when host process is gone → not in stdio.test.ts; the actual test is should close transport when stdin emits end.
  6. Changeset body (.changeset/server-host-watchdog.md): "Close StdioServerTransport when stdin reaches EOF … detects this via the end event" — this is correct and matches the code. Only the filename is stale.

Why it matters

This repo squash-merges PRs, so the PR title and body become the permanent commit message on main. Anyone running git log or git blame on stdio.ts will see a commit claiming to add a PID-polling watchdog with a configurable interval and a new options interface — and find none of it in the tree. The repo's review conventions explicitly call out "prose that promises behavior the code no longer ships" as something to flag.

The published npm changelog will be fine (the changeset body is accurate), so this has zero runtime or API impact — hence nit severity.

Fix

Before merge:

  • Retitle to something like feat(server): close StdioServerTransport on stdin EOF to prevent orphaned processes.
  • Replace the PR description's Summary/Test plan with the stdin-EOF behavior and the three actual test names.
  • Optionally rename .changeset/server-host-watchdog.md → something like .changeset/server-stdin-eof.md (cosmetic — changeset files are deleted on release, so this only affects git history).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Add ways for server processes to self-terminate when host is gone

3 participants