-
Notifications
You must be signed in to change notification settings - Fork 1.9k
Fix stdio_client kill process after timeout #555
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you @cristipufu for working on this!
Do you have steps to reproduce? How this changes were tested?
await process.wait() | ||
except TimeoutError: | ||
# Force kill if it doesn't terminate | ||
process.kill() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the same TerminateProcess
is called on Windows... Did this worked locally?
https://anyio.readthedocs.io/en/stable/api.html#anyio.abc.Process.kill
Slightly orthogonal but this does not handle child process spawns which can cause the parent to leave ghost processes which are being re-parented to the root process. Addressed it here: #850 |
Add comprehensive tests that validate stdio client cleanup behavior: - Universal timeout test ensures cleanup completes within reasonable time even with long-running processes - Immediate completion test ensures fast processes remain fast and timeout mechanism doesn't introduce delays - Cross-platform compatibility with different commands for Windows vs Unix These tests will help ensure PR #555's timeout mechanism works correctly without regressing normal operation. 🤖 Generated with [Claude Code](https://claude.ai/code) Co-Authored-By: Claude <noreply@anthropic.com>
Changes from PR #555 by @cristipufu: - Remove Windows-specific terminate_windows_process function - Apply uniform 2-second timeout mechanism across all platforms: - Try process.terminate() first (SIGTERM on Unix, similar on Windows) - Wait up to 2 seconds for graceful termination - If timeout occurs, force process.kill() (SIGKILL equivalent) This ensures consistent behavior between Windows and Unix systems when cleaning up stdio client processes, preventing hanging when MCP servers don't respond to SIGTERM. 🤖 Generated with [Claude Code](https://claude.ai/code) Co-Authored-By: Claude <noreply@anthropic.com>
This change demonstrates a simpler approach than PR #555's timeout mechanism: 1. **Evidence**: Extensive testing shows PR #559 stream cleanup alone prevents hanging, even with servers that ignore SIGTERM and keep streams open. 2. **Simplification**: Removes all timeout logic and Windows-specific termination function in favor of unified `process.terminate()` + stream cleanup. 3. **Benefits**: - Less code complexity (no timeout handling, no platform branching) - Preserves proven stream cleanup protection from PR #559 - Makes behavior consistent across all platforms - All existing timeout tests still pass 4. **Risk reduction**: Avoids changing process termination semantics while maintaining hanging protection through stream cleanup. The core insight: process hanging was caused by stream management issues (solved by PR #559), not termination timing issues (targeted by PR #555).
NOTE: this test FAIL without #555 This test verifies that stdio_client properly handles processes that ignore SIGTERM but respond to SIGINT, preventing hanging issues with Node.js servers and other interactive tools that expect Ctrl+C signals.
**NOTE: These tests FAIL without the changes introduced in #559** - test_stdio_client_universal_timeout - test_stdio_client_immediate_completion await read_stream.aclose() await write_stream.aclose() await read_stream_writer.aclose() await write_stream_reader.aclose() These tests verify that stdio_client completes cleanup within reasonable time for both slow-terminating and fast-exiting processes, preventing the hanging issues reported in #559. **NOTE: This test FAILS without the changes introduced in #555** - test_stdio_client_sigint_only_process try: process.terminate() with anyio.fail_after(2.0): await process.wait() except TimeoutError: # If process doesn't terminate in time, force kill it process.kill()
**NOTE: These tests FAIL without the changes introduced in #559** - test_stdio_client_universal_timeout - test_stdio_client_immediate_completion await read_stream.aclose() await write_stream.aclose() await read_stream_writer.aclose() await write_stream_reader.aclose() These tests verify that stdio_client completes cleanup within reasonable time for both slow-terminating and fast-exiting processes, preventing the hanging issues reported in #559. **NOTE: This test FAILS without the changes introduced in #555** - test_stdio_client_sigint_only_process try: process.terminate() with anyio.fail_after(2.0): await process.wait() except TimeoutError: # If process doesn't terminate in time, force kill it process.kill()
**NOTE: These tests FAIL without the changes introduced in #559** - test_stdio_client_universal_timeout - test_stdio_client_immediate_completion await read_stream.aclose() await write_stream.aclose() await read_stream_writer.aclose() await write_stream_reader.aclose() These tests verify that stdio_client completes cleanup within reasonable time for both slow-terminating and fast-exiting processes, preventing the hanging issues reported in #559. **NOTE: This test FAILS without the changes introduced in #555** - test_stdio_client_sigint_only_process try: process.terminate() with anyio.fail_after(2.0): await process.wait() except TimeoutError: # If process doesn't terminate in time, force kill it process.kill() This test verifies that on UNIX systems MCP servers that don't respect SIGTERM but e.g. SIGINT still get terminated after a grace period.
**NOTE: These tests FAIL without the changes introduced in #559** - test_stdio_client_universal_timeout - test_stdio_client_immediate_completion await read_stream.aclose() await write_stream.aclose() await read_stream_writer.aclose() await write_stream_reader.aclose() These tests verify that stdio_client completes cleanup within reasonable time for both slow-terminating and fast-exiting processes, preventing the hanging issues reported in #559. **NOTE: This test FAILS without the changes introduced in #555** - test_stdio_client_sigint_only_process try: process.terminate() with anyio.fail_after(2.0): await process.wait() except TimeoutError: # If process doesn't terminate in time, force kill it process.kill() This test verifies that on UNIX systems MCP servers that don't respect SIGTERM but e.g. SIGINT still get terminated after a grace period.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @cristipufu, thank you for working on this and for submitting this PR!
Having looked into this in some detail I think your simplification makes sense:
- There shouldn't be any change in behavior on Windows from this change, so we're good on that end
- Fix hanging on streams when stdio_client exiting #559 already made some changes to stream cleanup and I believe addresses SIGTERM hangs already on UNIX systems
- As you mention on Fixes to stdio_client to support Windows more robustly #372 though, MCP servers that ignore SIGTERM may still be an issue on UNIX systems
- I was able to create a test (see below) that simulates a service that expects SIGINT and ignores SIGTERM - this PR fixes that case.
I created a draft PR #1044 with 3 new regression tests in an attempt to actually reproduce the hanging resolved by both this change (#555) and #559.
- Fix hanging on streams when stdio_client exiting #559 resolves
test_stdio_client_universal_cleanup
andtest_stdio_client_immediate_completion
- Fix stdio_client kill process after timeout #555 (this PR) resolves
test_stdio_client_sigint_only_process
Please take a look if that reflects what you were going for here - if yes could you rebase your PR here on latest main?
**NOTE: These tests FAIL without the changes introduced in #559** - test_stdio_client_universal_timeout - test_stdio_client_immediate_completion await read_stream.aclose() await write_stream.aclose() await read_stream_writer.aclose() await write_stream_reader.aclose() These tests verify that stdio_client completes cleanup within reasonable time for both slow-terminating and fast-exiting processes, preventing the hanging issues reported in #559. **NOTE: This test FAILS without the changes introduced in #555** - test_stdio_client_sigint_only_process try: process.terminate() with anyio.fail_after(2.0): await process.wait() except TimeoutError: # If process doesn't terminate in time, force kill it process.kill() This test verifies that on UNIX systems MCP servers that don't respect SIGTERM but e.g. SIGINT still get terminated after a grace period.
**NOTE: These tests FAIL without the changes introduced in #559** - test_stdio_client_universal_timeout - test_stdio_client_immediate_completion await read_stream.aclose() await write_stream.aclose() await read_stream_writer.aclose() await write_stream_reader.aclose() These tests verify that stdio_client completes cleanup within reasonable time for both slow-terminating and fast-exiting processes, preventing the hanging issues reported in #559. **NOTE: This test FAILS without the changes introduced in #555** - test_stdio_client_sigint_only_process try: process.terminate() with anyio.fail_after(2.0): await process.wait() except TimeoutError: # If process doesn't terminate in time, force kill it process.kill() This test verifies that on UNIX systems MCP servers that don't respect SIGTERM but e.g. SIGINT still get terminated after a grace period.
**NOTE: These tests FAIL without the changes introduced in #559** - test_stdio_client_universal_timeout - test_stdio_client_immediate_completion await read_stream.aclose() await write_stream.aclose() await read_stream_writer.aclose() await write_stream_reader.aclose() These tests verify that stdio_client completes cleanup within reasonable time for both slow-terminating and fast-exiting processes, preventing the hanging issues reported in #559. **NOTE: This test FAILS without the changes introduced in #555** - test_stdio_client_sigint_only_process try: process.terminate() with anyio.fail_after(2.0): await process.wait() except TimeoutError: # If process doesn't terminate in time, force kill it process.kill() This test verifies that on UNIX systems MCP servers that don't respect SIGTERM but e.g. SIGINT still get terminated after a grace period.
**NOTE: These tests FAIL without the changes introduced in #559** - test_stdio_client_universal_timeout - test_stdio_client_immediate_completion await read_stream.aclose() await write_stream.aclose() await read_stream_writer.aclose() await write_stream_reader.aclose() These tests verify that stdio_client completes cleanup within reasonable time for both slow-terminating and fast-exiting processes, preventing the hanging issues reported in #559. **NOTE: This test FAILS without the changes introduced in #555** - test_stdio_client_sigint_only_process try: process.terminate() with anyio.fail_after(2.0): await process.wait() except TimeoutError: # If process doesn't terminate in time, force kill it process.kill() This test verifies that on UNIX systems MCP servers that don't respect SIGTERM but e.g. SIGINT still get terminated after a grace period.
Motivation and Context
Be consistent across platforms (Windows, Linux, etc). Some MCP Servers don't react to SIGTERM, so
stdio_client
will hang indefinitely when exiting the context manager.Reference: #372
How Has This Been Tested?
Yes
Breaking Changes
No
Types of changes
Checklist