adding docs for tests
This commit is contained in:
@@ -16,6 +16,7 @@ This documentation is organized into four main categories to serve different aud
|
||||
- **[Development Workflows](contributing/workflows.md)** - Git workflows, testing, and collaboration patterns
|
||||
- **[Code Style Guide](contributing/code-style.md)** - Coding standards and best practices
|
||||
- **[Testing Strategy](contributing/testing.md)** - Testing approaches and quality assurance
|
||||
- **[Test Structure Analysis](contributing/test-structure.md)** - Detailed analysis of existing test suite
|
||||
- **[Repository Overview](contributing/file-overview.md)** - Understanding the codebase structure
|
||||
|
||||
### 🏗️ For System Architects
|
||||
|
||||
470
docs/contributing/test-structure.md
Normal file
470
docs/contributing/test-structure.md
Normal file
@@ -0,0 +1,470 @@
|
||||
# Test Structure Documentation
|
||||
|
||||
## Overview
|
||||
|
||||
This document provides a comprehensive analysis of the existing test structure in the Gemini MCP Server project. The test suite consists of **17 specialized test files** organized to validate all aspects of the system from unit-level functionality to complex AI collaboration workflows.
|
||||
|
||||
## Test Organization
|
||||
|
||||
### Test Directory Structure
|
||||
|
||||
```
|
||||
tests/
|
||||
├── __init__.py # Package initialization
|
||||
├── conftest.py # Global test configuration and fixtures
|
||||
├── test_claude_continuation.py # Claude continuation opportunities
|
||||
├── test_collaboration.py # AI-to-AI collaboration features
|
||||
├── test_config.py # Configuration validation
|
||||
├── test_conversation_history_bug.py # Bug fix regression tests
|
||||
├── test_conversation_memory.py # Redis-based conversation persistence
|
||||
├── test_cross_tool_continuation.py # Cross-tool conversation threading
|
||||
├── test_docker_path_integration.py # Docker environment path translation
|
||||
├── test_large_prompt_handling.py # Large prompt detection and handling
|
||||
├── test_live_integration.py # Live API testing (excluded from CI)
|
||||
├── test_precommit.py # Pre-commit validation and git integration
|
||||
├── test_prompt_regression.py # Normal prompt handling regression
|
||||
├── test_server.py # Main server functionality
|
||||
├── test_thinking_modes.py # Thinking mode functionality
|
||||
├── test_tools.py # Individual tool implementations
|
||||
└── test_utils.py # Utility function testing
|
||||
```
|
||||
|
||||
## Test Categories and Analysis
|
||||
|
||||
### 1. Core Functionality Tests
|
||||
|
||||
#### `test_server.py` - Main Server Functionality
|
||||
**Purpose**: Tests the core MCP server implementation and tool dispatch mechanism
|
||||
|
||||
**Key Test Classes**:
|
||||
- **Server startup and initialization**
|
||||
- **Tool registration and availability**
|
||||
- **Request routing and handling**
|
||||
- **Error propagation and handling**
|
||||
|
||||
**Example Coverage**:
|
||||
```python
|
||||
# Tests tool listing functionality
|
||||
def test_list_tools()
|
||||
|
||||
# Tests tool execution pipeline
|
||||
async def test_call_tool()
|
||||
|
||||
# Tests error handling for invalid tools
|
||||
async def test_call_invalid_tool()
|
||||
```
|
||||
|
||||
#### `test_config.py` - Configuration Management
|
||||
**Purpose**: Validates configuration loading, environment variable handling, and settings validation
|
||||
|
||||
**Key Areas**:
|
||||
- **Environment variable parsing**
|
||||
- **Default value handling**
|
||||
- **Configuration validation**
|
||||
- **Error handling for missing required config**
|
||||
|
||||
#### `test_tools.py` - Tool Implementation Testing
|
||||
**Purpose**: Tests individual tool implementations with comprehensive input validation
|
||||
|
||||
**Key Features**:
|
||||
- **Absolute path enforcement across all tools**
|
||||
- **Parameter validation for each tool**
|
||||
- **Error handling for malformed inputs**
|
||||
- **Tool-specific behavior validation**
|
||||
|
||||
**Critical Security Testing**:
|
||||
```python
|
||||
# Tests that all tools enforce absolute paths
|
||||
async def test_tool_absolute_path_requirement()
|
||||
|
||||
# Tests path traversal attack prevention
|
||||
async def test_tool_path_traversal_prevention()
|
||||
```
|
||||
|
||||
#### `test_utils.py` - Utility Function Testing
|
||||
**Purpose**: Tests file utilities, token counting, and directory handling functions
|
||||
|
||||
**Coverage Areas**:
|
||||
- **File reading and processing**
|
||||
- **Token counting and limits**
|
||||
- **Directory traversal and expansion**
|
||||
- **Path validation and security**
|
||||
|
||||
### 2. Advanced Feature Tests
|
||||
|
||||
#### `test_collaboration.py` - AI-to-AI Collaboration
|
||||
**Purpose**: Tests dynamic context requests and collaborative AI workflows
|
||||
|
||||
**Key Scenarios**:
|
||||
- **Clarification request parsing**
|
||||
- **Dynamic context expansion**
|
||||
- **AI-to-AI communication protocols**
|
||||
- **Collaboration workflow validation**
|
||||
|
||||
**Example Test**:
|
||||
```python
|
||||
async def test_clarification_request_parsing():
|
||||
"""Test parsing of AI clarification requests for additional context."""
|
||||
# Validates that Gemini can request additional files/context
|
||||
# and Claude can respond appropriately
|
||||
```
|
||||
|
||||
#### `test_cross_tool_continuation.py` - Cross-Tool Threading
|
||||
**Purpose**: Tests conversation continuity across different tools
|
||||
|
||||
**Critical Features**:
|
||||
- **Continuation ID persistence**
|
||||
- **Context preservation between tools**
|
||||
- **Thread management across tool switches**
|
||||
- **File context sharing between AI agents**
|
||||
|
||||
#### `test_conversation_memory.py` - Memory Persistence
|
||||
**Purpose**: Tests Redis-based conversation storage and retrieval
|
||||
|
||||
**Test Coverage**:
|
||||
- **Conversation storage and retrieval**
|
||||
- **Thread context management**
|
||||
- **TTL (time-to-live) handling**
|
||||
- **Memory cleanup and optimization**
|
||||
|
||||
#### `test_thinking_modes.py` - Cognitive Load Management
|
||||
**Purpose**: Tests thinking mode functionality across all tools
|
||||
|
||||
**Validation Areas**:
|
||||
- **Token budget enforcement**
|
||||
- **Mode selection and application**
|
||||
- **Performance characteristics**
|
||||
- **Quality vs. cost trade-offs**
|
||||
|
||||
### 3. Specialized Testing
|
||||
|
||||
#### `test_large_prompt_handling.py` - Scale Testing
|
||||
**Purpose**: Tests handling of prompts exceeding MCP token limits
|
||||
|
||||
**Key Scenarios**:
|
||||
- **Large prompt detection (>50,000 characters)**
|
||||
- **Automatic file-based prompt handling**
|
||||
- **MCP token limit workarounds**
|
||||
- **Response capacity preservation**
|
||||
|
||||
**Critical Flow Testing**:
|
||||
```python
|
||||
async def test_large_prompt_file_handling():
|
||||
"""Test that large prompts are automatically handled via file mechanism."""
|
||||
# Validates the workaround for MCP's 25K token limit
|
||||
```
|
||||
|
||||
#### `test_docker_path_integration.py` - Environment Testing
|
||||
**Purpose**: Tests Docker environment path translation and workspace mounting
|
||||
|
||||
**Coverage**:
|
||||
- **Host-to-container path mapping**
|
||||
- **Workspace directory access**
|
||||
- **Cross-platform path handling**
|
||||
- **Security boundary enforcement**
|
||||
|
||||
#### `test_precommit.py` - Quality Gate Testing
|
||||
**Purpose**: Tests pre-commit validation and git integration
|
||||
|
||||
**Validation Areas**:
|
||||
- **Git repository discovery**
|
||||
- **Change detection and analysis**
|
||||
- **Multi-repository support**
|
||||
- **Security scanning of changes**
|
||||
|
||||
### 4. Regression and Bug Fix Tests
|
||||
|
||||
#### `test_conversation_history_bug.py` - Bug Fix Validation
|
||||
**Purpose**: Regression test for conversation history duplication bug
|
||||
|
||||
**Specific Coverage**:
|
||||
- **Conversation deduplication**
|
||||
- **History consistency**
|
||||
- **Memory leak prevention**
|
||||
- **Thread integrity**
|
||||
|
||||
#### `test_prompt_regression.py` - Normal Operation Validation
|
||||
**Purpose**: Ensures normal prompt handling continues to work correctly
|
||||
|
||||
**Test Focus**:
|
||||
- **Standard prompt processing**
|
||||
- **Backward compatibility**
|
||||
- **Feature regression prevention**
|
||||
- **Performance baseline maintenance**
|
||||
|
||||
#### `test_claude_continuation.py` - Session Management
|
||||
**Purpose**: Tests Claude continuation opportunities and session management
|
||||
|
||||
**Key Areas**:
|
||||
- **Session state management**
|
||||
- **Continuation opportunity detection**
|
||||
- **Context preservation**
|
||||
- **Session cleanup and termination**
|
||||
|
||||
### 5. Live Integration Testing
|
||||
|
||||
#### `test_live_integration.py` - Real API Testing
|
||||
**Purpose**: Tests actual Gemini API integration (excluded from regular CI)
|
||||
|
||||
**Requirements**:
|
||||
- Valid `GEMINI_API_KEY` environment variable
|
||||
- Network connectivity to Google AI services
|
||||
- Redis server for conversation memory testing
|
||||
|
||||
**Test Categories**:
|
||||
- **Basic API request/response validation**
|
||||
- **Tool execution with real Gemini responses**
|
||||
- **Conversation threading with actual AI**
|
||||
- **Error handling with real API responses**
|
||||
|
||||
**Exclusion from CI**:
|
||||
```python
|
||||
@pytest.mark.skipif(not os.getenv("GEMINI_API_KEY"), reason="API key required")
|
||||
class TestLiveIntegration:
|
||||
"""Tests requiring actual Gemini API access."""
|
||||
```
|
||||
|
||||
## Test Configuration Analysis
|
||||
|
||||
### `conftest.py` - Global Test Setup
|
||||
|
||||
**Key Fixtures and Configuration**:
|
||||
|
||||
#### Environment Isolation
|
||||
```python
|
||||
# Ensures tests run in isolated sandbox environment
|
||||
os.environ["MCP_PROJECT_ROOT"] = str(temp_dir)
|
||||
```
|
||||
|
||||
#### Dummy API Keys
|
||||
```python
|
||||
# Provides safe dummy keys for testing without real credentials
|
||||
os.environ["GEMINI_API_KEY"] = "dummy-key-for-testing"
|
||||
```
|
||||
|
||||
#### Cross-Platform Compatibility
|
||||
```python
|
||||
# Handles Windows async event loop configuration
|
||||
if platform.system() == "Windows":
|
||||
asyncio.set_event_loop_policy(asyncio.WindowsProactorEventLoopPolicy())
|
||||
```
|
||||
|
||||
#### Project Path Fixtures
|
||||
```python
|
||||
@pytest.fixture
|
||||
def project_path():
|
||||
"""Provides safe project path for file operations in tests."""
|
||||
```
|
||||
|
||||
### `pytest.ini` - Test Runner Configuration
|
||||
|
||||
**Key Settings**:
|
||||
```ini
|
||||
[pytest]
|
||||
testpaths = tests
|
||||
python_files = test_*.py
|
||||
python_classes = Test*
|
||||
python_functions = test_*
|
||||
asyncio_mode = auto
|
||||
addopts =
|
||||
-v
|
||||
--strict-markers
|
||||
--tb=short
|
||||
```
|
||||
|
||||
## Mocking Strategies
|
||||
|
||||
### 1. Gemini API Mocking
|
||||
|
||||
**Pattern Used**:
|
||||
```python
|
||||
@patch("tools.base.BaseTool.create_model")
|
||||
async def test_tool_execution(self, mock_create_model):
|
||||
mock_model = Mock()
|
||||
mock_model.generate_content.return_value = Mock(
|
||||
candidates=[Mock(content=Mock(parts=[Mock(text="Mocked response")]))]
|
||||
)
|
||||
mock_create_model.return_value = mock_model
|
||||
```
|
||||
|
||||
**Benefits**:
|
||||
- **No API key required** for unit and integration tests
|
||||
- **Predictable responses** for consistent testing
|
||||
- **Fast execution** without network dependencies
|
||||
- **Cost-effective** testing without API charges
|
||||
|
||||
### 2. Redis Memory Mocking
|
||||
|
||||
**Pattern Used**:
|
||||
```python
|
||||
@patch("utils.conversation_memory.get_redis_client")
|
||||
def test_conversation_flow(self, mock_redis):
|
||||
mock_client = Mock()
|
||||
mock_redis.return_value = mock_client
|
||||
# Test conversation persistence logic
|
||||
```
|
||||
|
||||
**Advantages**:
|
||||
- **No Redis server required** for testing
|
||||
- **Controlled state** for predictable test scenarios
|
||||
- **Error simulation** for resilience testing
|
||||
|
||||
### 3. File System Mocking
|
||||
|
||||
**Pattern Used**:
|
||||
```python
|
||||
@patch("builtins.open", mock_open(read_data="test file content"))
|
||||
@patch("os.path.exists", return_value=True)
|
||||
def test_file_operations():
|
||||
# Test file reading without actual files
|
||||
```
|
||||
|
||||
**Security Benefits**:
|
||||
- **No file system access** during testing
|
||||
- **Path validation testing** without security risks
|
||||
- **Consistent test data** across environments
|
||||
|
||||
## Security Testing Focus
|
||||
|
||||
### Path Validation Testing
|
||||
|
||||
**Critical Security Tests**:
|
||||
1. **Absolute path enforcement** - All tools must reject relative paths
|
||||
2. **Directory traversal prevention** - Block `../` and similar patterns
|
||||
3. **Symlink attack prevention** - Detect and block symbolic link attacks
|
||||
4. **Sandbox boundary enforcement** - Restrict access to allowed directories
|
||||
|
||||
**Example Security Test**:
|
||||
```python
|
||||
async def test_path_traversal_attack_prevention():
|
||||
"""Test that directory traversal attacks are blocked."""
|
||||
dangerous_paths = [
|
||||
"../../../etc/passwd",
|
||||
"/etc/shadow",
|
||||
"~/../../root/.ssh/id_rsa"
|
||||
]
|
||||
|
||||
for path in dangerous_paths:
|
||||
with pytest.raises(SecurityError):
|
||||
await tool.execute({"files": [path]})
|
||||
```
|
||||
|
||||
### Docker Security Testing
|
||||
|
||||
**Container Security Validation**:
|
||||
- **Workspace mounting** - Verify read-only access enforcement
|
||||
- **Path translation** - Test host-to-container path mapping
|
||||
- **Privilege boundaries** - Ensure container cannot escape sandbox
|
||||
|
||||
## Test Execution Patterns
|
||||
|
||||
### Parallel Test Execution
|
||||
|
||||
**Strategy**: Tests are designed for parallel execution with proper isolation
|
||||
|
||||
**Benefits**:
|
||||
- **Faster test suite** execution
|
||||
- **Resource efficiency** for CI/CD
|
||||
- **Scalable testing** for large codebases
|
||||
|
||||
### Conditional Test Execution
|
||||
|
||||
**Live Test Skipping**:
|
||||
```python
|
||||
@pytest.mark.skipif(not os.getenv("GEMINI_API_KEY"), reason="API key required")
|
||||
```
|
||||
|
||||
**Platform-Specific Tests**:
|
||||
```python
|
||||
@pytest.mark.skipif(platform.system() == "Windows", reason="Unix-specific test")
|
||||
```
|
||||
|
||||
## Test Quality Metrics
|
||||
|
||||
### Coverage Analysis
|
||||
|
||||
**Current Test Coverage by Category**:
|
||||
- ✅ **Tool Functionality**: All 7 tools comprehensively tested
|
||||
- ✅ **Server Operations**: Complete request/response cycle coverage
|
||||
- ✅ **Security Validation**: Path safety and access control testing
|
||||
- ✅ **Collaboration Features**: AI-to-AI communication patterns
|
||||
- ✅ **Memory Management**: Conversation persistence and threading
|
||||
- ✅ **Error Handling**: Graceful degradation and error recovery
|
||||
|
||||
### Test Reliability
|
||||
|
||||
**Design Characteristics**:
|
||||
- **Deterministic**: Tests produce consistent results
|
||||
- **Isolated**: No test dependencies or shared state
|
||||
- **Fast**: Unit tests complete in milliseconds
|
||||
- **Comprehensive**: Edge cases and error conditions covered
|
||||
|
||||
## Integration with Development Workflow
|
||||
|
||||
### Test-Driven Development Support
|
||||
|
||||
**TDD Cycle Integration**:
|
||||
1. **Red**: Write failing test for new functionality
|
||||
2. **Green**: Implement minimal code to pass test
|
||||
3. **Refactor**: Improve code while maintaining test coverage
|
||||
|
||||
### Pre-Commit Testing
|
||||
|
||||
**Quality Gates**:
|
||||
- **Security validation** before commits
|
||||
- **Functionality regression** prevention
|
||||
- **Code quality** maintenance
|
||||
- **Performance baseline** protection
|
||||
|
||||
### CI/CD Integration
|
||||
|
||||
**GitHub Actions Workflow**:
|
||||
- **Multi-Python version** testing (3.10, 3.11, 3.12)
|
||||
- **Parallel test execution** for efficiency
|
||||
- **Selective live testing** when API keys available
|
||||
- **Coverage reporting** and quality gates
|
||||
|
||||
## Best Practices Demonstrated
|
||||
|
||||
### 1. Comprehensive Mocking
|
||||
Every external dependency is properly mocked for reliable testing
|
||||
|
||||
### 2. Security-First Approach
|
||||
Strong emphasis on security validation and vulnerability prevention
|
||||
|
||||
### 3. Collaboration Testing
|
||||
Extensive testing of AI-to-AI communication and workflow patterns
|
||||
|
||||
### 4. Real-World Scenarios
|
||||
Tests cover actual usage patterns and edge cases
|
||||
|
||||
### 5. Maintainable Structure
|
||||
Clear organization and focused test files for easy maintenance
|
||||
|
||||
## Recommendations for Contributors
|
||||
|
||||
### Adding New Tests
|
||||
|
||||
1. **Follow Naming Conventions**: Use descriptive test names that explain the scenario
|
||||
2. **Maintain Isolation**: Mock all external dependencies
|
||||
3. **Test Security**: Include path validation and security checks
|
||||
4. **Cover Edge Cases**: Test error conditions and boundary cases
|
||||
5. **Document Purpose**: Use docstrings to explain test objectives
|
||||
|
||||
### Test Quality Standards
|
||||
|
||||
1. **Fast Execution**: Unit tests should complete in milliseconds
|
||||
2. **Predictable Results**: Tests should be deterministic
|
||||
3. **Clear Assertions**: Use descriptive assertion messages
|
||||
4. **Proper Cleanup**: Ensure tests don't leave side effects
|
||||
|
||||
### Testing New Features
|
||||
|
||||
1. **Start with Unit Tests**: Test individual components first
|
||||
2. **Add Integration Tests**: Test component interactions
|
||||
3. **Include Security Tests**: Validate security measures
|
||||
4. **Test Collaboration**: If relevant, test AI-to-AI workflows
|
||||
|
||||
---
|
||||
|
||||
This test structure demonstrates a mature, production-ready testing approach that ensures code quality, security, and reliability while supporting the collaborative AI development patterns that make this project unique.
|
||||
@@ -679,6 +679,12 @@ jobs:
|
||||
file: ./coverage.xml
|
||||
```
|
||||
|
||||
## Detailed Test Structure Analysis
|
||||
|
||||
For a comprehensive analysis of the existing test suite, including detailed breakdowns of all 17 test files, security testing patterns, and collaboration feature validation, see:
|
||||
|
||||
**[Test Structure Documentation](test-structure.md)** - Complete analysis of existing test organization, mocking strategies, and quality assurance patterns
|
||||
|
||||
---
|
||||
|
||||
This comprehensive testing strategy ensures high-quality, reliable code while maintaining development velocity and supporting the collaborative patterns defined in CLAUDE.md.
|
||||
Reference in New Issue
Block a user