# Testing Guide

This project includes comprehensive test coverage through unit tests and integration simulator tests.

## Running Tests

### Prerequisites

- Python virtual environment activated: `source venv/bin/activate`
- All dependencies installed: `pip install -r requirements.txt`
- Docker containers running (for simulator tests): `./run-server.sh`
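
These steps can be run in sequence before testing; the commands are the same ones listed above:

```bash
# One-time setup before running the test suites
source venv/bin/activate
pip install -r requirements.txt
./run-server.sh   # needed only for simulator tests
```
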
### Unit Tests

Run all unit tests with pytest:

```bash
# Run all tests with verbose output
python -m pytest -xvs

# Run specific test file
python -m pytest tests/test_providers.py -xvs
```

### Simulator Tests

Simulator tests replicate real-world Claude CLI interactions with the MCP server running in Docker. Unlike unit tests, which exercise isolated functions, simulator tests validate the complete end-to-end flow, including:

- Actual MCP protocol communication
- Docker container interactions
- Multi-turn conversations across tools
- Log output validation

**Important**: Simulator tests require `LOG_LEVEL=DEBUG` in your `.env` file to validate detailed execution logs.
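
For reference, the relevant line in `.env`:

```bash
LOG_LEVEL=DEBUG
```
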

#### Monitoring Logs During Tests

**Important**: The MCP stdio protocol interferes with stderr output during tool execution. While server startup logs appear in `docker compose logs`, tool execution logs are written only to log files inside the container. This is a known limitation of the stdio-based MCP protocol and cannot be fixed without changing the MCP implementation.

To monitor logs during test execution:

```bash
# Monitor main server logs (includes all tool execution details)
docker exec zen-mcp-server tail -f -n 500 /tmp/mcp_server.log

# Monitor MCP activity logs (tool calls and completions)
docker exec zen-mcp-server tail -f /tmp/mcp_activity.log

# Check log file sizes (logs rotate at 20MB)
docker exec zen-mcp-server ls -lh /tmp/mcp_*.log*
```

**Log Rotation**: All log files are configured with automatic rotation at 20MB to prevent disk space issues (see the sketch after this list). The server keeps:

- 10 rotated files for `mcp_server.log` (200MB total)
- 5 rotated files for `mcp_activity.log` (100MB total)
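
As a rough illustration, this kind of rotation is typically configured with Python's standard `logging.handlers.RotatingFileHandler`. The sketch below is hypothetical and only mirrors the limits above; it is not the server's actual setup:

```python
import logging
from logging.handlers import RotatingFileHandler

# Hypothetical sketch: rotate at 20MB and keep 10 backups (~200MB total),
# mirroring the mcp_server.log limits described above.
handler = RotatingFileHandler(
    "/tmp/mcp_server.log",
    maxBytes=20 * 1024 * 1024,  # start a new file once 20MB is reached
    backupCount=10,             # keep mcp_server.log.1 through .10
)
handler.setLevel(logging.DEBUG)
logging.getLogger().addHandler(handler)
```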

**Why logs don't appear in docker compose logs**: The MCP stdio_server captures stderr during tool execution to prevent interference with the JSON-RPC protocol communication. While you'll see startup logs in `docker compose logs`, you won't see tool execution logs there.
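
To make the constraint concrete, here is a hypothetical illustration (not the server's actual code) of why a stdio transport forces file-based logging:

```python
import json
import logging
import sys

# Hypothetical illustration: with a stdio transport, stdout carries JSON-RPC
# frames, so stray writes to the protocol streams would corrupt framing.
# Routing logs to a file keeps them observable without touching stdio.
logging.basicConfig(filename="/tmp/mcp_server.log", level=logging.DEBUG)
logging.debug("tool execution details go to the log file")

# The stdio channel stays reserved for protocol messages.
sys.stdout.write(json.dumps({"jsonrpc": "2.0", "id": 1, "result": {}}) + "\n")
sys.stdout.flush()
```
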
#### Running All Simulator Tests

```bash
# Run all simulator tests
python communication_simulator_test.py

# Run with verbose output for debugging
python communication_simulator_test.py --verbose

# Keep Docker logs after tests for inspection
python communication_simulator_test.py --keep-logs
```


#### Running Individual Tests

To run a single simulator test in isolation (useful for debugging or test development):

```bash
# Run a specific test by name
python communication_simulator_test.py --individual basic_conversation

# Examples of available tests
python communication_simulator_test.py --individual content_validation
python communication_simulator_test.py --individual cross_tool_continuation
python communication_simulator_test.py --individual redis_validation
```


#### Other Options

```bash
# List all available simulator tests with descriptions
python communication_simulator_test.py --list-tests

# Run multiple specific tests (not all)
python communication_simulator_test.py --tests basic_conversation content_validation

# Force Docker environment rebuild before running tests
python communication_simulator_test.py --rebuild
```


### Code Quality Checks

Before committing, ensure all linting passes:

```bash
# Run all linting checks
ruff check .
black --check .
isort --check-only .

# Auto-fix issues
ruff check . --fix
black .
isort .
```


## What Each Test Suite Covers

### Unit Tests

Test isolated components and functions (an illustrative example follows the list):

- **Provider functionality**: Model initialization, API interactions, capability checks
- **Tool operations**: All MCP tools (chat, analyze, debug, etc.)
- **Conversation memory**: Threading, continuation, history management
- **File handling**: Path validation, token limits, deduplication
- **Auto mode**: Model selection logic and fallback behavior
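
For orientation, here is a minimal pytest-style test in the shape these suites use. The helper function and its name are hypothetical, purely illustrative:

```python
# Hypothetical example in the pytest style of the unit suite; the helper
# below is illustrative, not actual project code.
def deduplicate_paths(paths: list[str]) -> list[str]:
    """Drop repeated file paths while preserving first-seen order."""
    seen: set[str] = set()
    return [p for p in paths if not (p in seen or seen.add(p))]


def test_deduplicate_paths_preserves_order():
    paths = ["a.py", "b.py", "a.py", "c.py"]
    assert deduplicate_paths(paths) == ["a.py", "b.py", "c.py"]
```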

### Simulator Tests

Validate real-world usage scenarios by simulating actual Claude prompts:

- **Basic conversations**: Multi-turn chat functionality with real prompts
- **Cross-tool continuation**: Context preservation across different tools
- **File deduplication**: Efficient handling of repeated file references
- **Model selection**: Proper routing to configured providers
- **Token allocation**: Context window management in practice
- **Redis validation**: Conversation persistence and retrieval

## Contributing

For detailed contribution guidelines, testing requirements, and code quality standards, please see our [Contributing Guide](./contributions.md).

### Quick Testing Reference

```bash
# Activate virtual environment
source venv/bin/activate

# Run linting checks
ruff check . && black --check . && isort --check-only .

# Run unit tests
python -m pytest -xvs

# Run simulator tests (for tool changes)
python communication_simulator_test.py
```

Remember: All tests must pass before submitting a PR. See the [Contributing Guide](./contributions.md) for complete requirements.