Perform prompt size checks only at the MCP boundary

New test to confirm that history build-up and the system prompt do not affect prompt size checks
Also check for large prompts in focus_on
Fixed .env.example: CUSTOM_API was incorrectly left uncommented, causing the run-server script to think at least one key exists
Fahad
2025-06-15 10:37:08 +04:00
parent 3b03783ea7
commit 4becd70a82
14 changed files with 404 additions and 198 deletions


@@ -124,21 +124,26 @@ python communication_simulator_test.py --verbose
 python communication_simulator_test.py --rebuild
 ```
-#### Run Individual Simulator Tests
+#### Run Individual Simulator Tests (Recommended)
 ```bash
 # List all available tests
 python communication_simulator_test.py --list-tests
-# Run a specific test individually (with full Docker setup)
+# RECOMMENDED: Run tests individually for better isolation and debugging
 python communication_simulator_test.py --individual basic_conversation
 python communication_simulator_test.py --individual content_validation
 python communication_simulator_test.py --individual cross_tool_continuation
 python communication_simulator_test.py --individual logs_validation
 python communication_simulator_test.py --individual redis_validation
-# Run multiple specific tests
+# Run multiple specific tests (alternative approach)
 python communication_simulator_test.py --tests basic_conversation content_validation
-# Run individual test with verbose output
+# Run individual test with verbose output for debugging
 python communication_simulator_test.py --individual logs_validation --verbose
+# Individual tests provide full Docker setup and teardown per test
+# This ensures clean state and better error isolation
 ```
 Available simulator tests include:
@@ -146,16 +151,21 @@ Available simulator tests include:
 - `content_validation` - Content validation and duplicate detection
 - `per_tool_deduplication` - File deduplication for individual tools
 - `cross_tool_continuation` - Cross-tool conversation continuation scenarios
-- `cross_tool_comprehensive` - Comprehensive cross-tool integration testing
+- `cross_tool_comprehensive` - Comprehensive cross-tool file deduplication and continuation
+- `line_number_validation` - Line number handling validation across tools
 - `logs_validation` - Docker logs validation
 - `redis_validation` - Redis conversation memory validation
-- `model_thinking_config` - Model thinking configuration testing
-- `o3_model_selection` - O3 model selection and routing testing
-- `ollama_custom_url` - Ollama custom URL configuration testing
-- `openrouter_fallback` - OpenRouter fallback mechanism testing
-- `openrouter_models` - OpenRouter models availability testing
-- `token_allocation_validation` - Token allocation and limits validation
-- `conversation_chain_validation` - Conversation chain continuity validation
+- `model_thinking_config` - Model-specific thinking configuration behavior
+- `o3_model_selection` - O3 model selection and usage validation
+- `ollama_custom_url` - Ollama custom URL endpoint functionality
+- `openrouter_fallback` - OpenRouter fallback behavior when it is the only provider
+- `openrouter_models` - OpenRouter model functionality and alias mapping
+- `token_allocation_validation` - Token allocation and conversation history validation
+- `testgen_validation` - TestGen tool validation with specific test function
+- `refactor_validation` - Refactor tool validation with codesmells
+- `conversation_chain_validation` - Conversation chain and threading validation
+**Note**: All simulator tests should be run individually for optimal testing and better error isolation.
 #### Run Unit Tests Only
 ```bash