Merge branch 'main' into feature/docs_workflow

.env.example (20 lines changed)

@@ -5,10 +5,22 @@
 # Get your API key from: https://makersuite.google.com/app/apikey
 GEMINI_API_KEY=your_gemini_api_key_here
 
-# Optional: Redis connection URL for conversation memory
-# Defaults to redis://localhost:6379/0
-# For Docker: redis://redis:6379/0
-REDIS_URL=redis://localhost:6379/0
+# Optional: Default model to use
+# Full names: 'gemini-2.5-pro-preview-06-05' or 'gemini-2.0-flash-exp'
+# Defaults to gemini-2.5-pro-preview-06-05 if not specified
+DEFAULT_MODEL=gemini-2.5-pro-preview-06-05
+
+# Optional: Default thinking mode for ThinkDeep tool
+# NOTE: Only applies to models that support extended thinking (e.g., Gemini 2.5 Pro)
+# Flash models (2.0) will use system prompt engineering instead
+# Token consumption per mode:
+#   minimal: 128 tokens - Quick analysis, fastest response
+#   low: 2,048 tokens - Light reasoning tasks
+#   medium: 8,192 tokens - Balanced reasoning (good for most cases)
+#   high: 16,384 tokens - Complex analysis (recommended for thinkdeep)
+#   max: 32,768 tokens - Maximum reasoning depth, slowest but most thorough
+# Defaults to 'high' if not specified
+DEFAULT_THINKING_MODE_THINKDEEP=high
 
 # Optional: Workspace root directory for file access
 # This should be the HOST path that contains all files Claude might reference
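
The mode-to-budget mapping documented in the comments above is easy to mis-read in that form. As a minimal sketch of how a server might resolve `DEFAULT_THINKING_MODE_THINKDEEP` into a token budget (the dictionary values come straight from the comments; the helper name and fallback behaviour are assumptions, not the project's actual implementation):

```python
import os

# Budgets as documented in .env.example; the resolver itself is illustrative.
THINKING_MODE_BUDGETS = {
    "minimal": 128,
    "low": 2_048,
    "medium": 8_192,
    "high": 16_384,
    "max": 32_768,
}


def resolve_thinking_budget() -> int:
    """Hypothetical helper: read the env var and fall back to 'high'."""
    mode = os.getenv("DEFAULT_THINKING_MODE_THINKDEEP", "high").lower()
    return THINKING_MODE_BUDGETS.get(mode, THINKING_MODE_BUDGETS["high"])
```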

README.md (53 lines changed)

@@ -10,12 +10,23 @@
 
 > **📚 [Comprehensive Documentation Available](docs/)** - This README provides quick start instructions. For detailed guides, API references, architecture documentation, and development workflows, see our [complete documentation](docs/).
 
-The ultimate development partner for Claude - a Model Context Protocol server that gives Claude access to Google's Gemini 2.5 Pro for extended thinking, code analysis, and problem-solving. **Automatically reads files and directories, passing their contents to Gemini for analysis within its 1M token context.**
+The ultimate development partner for Claude - a Model Context Protocol server that gives Claude access to Google's Gemini models (2.5 Pro for extended thinking, 2.0 Flash for speed) for code analysis, problem-solving, and collaborative development. **Automatically reads files and directories, passing their contents to Gemini for analysis within its 1M token context.**
 
-**Features true AI orchestration with conversation continuity across tool usage** - start a task with one tool, continue with another, and maintain full context throughout. Claude and Gemini can collaborate seamlessly across multiple interactions and different tools, creating a unified development experience.
+**Features true AI orchestration with conversations that continue across tasks** - Give Claude a complex task and ask it to collaborate with Gemini.
+Claude stays in control, performs the actual work, but gets a second perspective from Gemini. Claude will talk to Gemini, work on implementation, then automatically resume the
+conversation with Gemini while maintaining the full thread.
+Claude can switch between different Gemini tools ([`thinkdeep`](#2-thinkdeep---extended-reasoning-partner) → [`chat`](#1-chat---general-development-chat--collaborative-thinking) → [`precommit`](#4-precommit---pre-commit-validation) → [`codereview`](#3-codereview---professional-code-review)) and the conversation context carries forward seamlessly.
+For example, in the video above, Claude was asked to debate SwiftUI vs UIKit with Gemini, resulting in a back-and-forth discussion rather than a simple one-shot query and response.
 
 **Think of it as Claude Code _for_ Claude Code.**
 
+---
+
+> ⚠️ **Active Development Notice**
+> This project is under rapid development with frequent commits and changes over the past few days.
+> The goal is to expand support beyond Gemini to include additional AI models and providers.
+> **Watch this space** for new capabilities and potentially breaking changes in between updates!
+
 ## Quick Navigation
 
 - **Getting Started**
@@ -38,6 +49,7 @@ The ultimate development partner for Claude - a Model Context Protocol server th
 - [`analyze`](#6-analyze---smart-file-analysis) - File analysis
 
 - **Advanced Topics**
+  - [Model Configuration](#model-configuration) - Pro vs Flash model selection
   - [Thinking Modes](#thinking-modes---managing-token-costs--quality) - Control depth vs cost
   - [Working with Large Prompts](#working-with-large-prompts) - Bypass MCP's 25K token limit
   - [Web Search Integration](#web-search-integration) - Smart search recommendations
@@ -588,6 +600,7 @@ All tools that work with files support **both individual files and entire direct
 **`analyze`** - Analyze files or directories
 - `files`: List of file paths or directories (required)
 - `question`: What to analyze (required)
+- `model`: pro|flash (default: server default)
 - `analysis_type`: architecture|performance|security|quality|general
 - `output_format`: summary|detailed|actionable
 - `thinking_mode`: minimal|low|medium|high|max (default: medium)
@@ -595,11 +608,13 @@ All tools that work with files support **both individual files and entire direct
 
 ```
 "Use gemini to analyze the src/ directory for architectural patterns"
-"Get gemini to analyze main.py and tests/ to understand test coverage"
+"Use flash to quickly analyze main.py and tests/ to understand test coverage"
+"Use pro for deep analysis of the entire backend/ directory structure"
 ```
 
 **`codereview`** - Review code files or directories
 - `files`: List of file paths or directories (required)
+- `model`: pro|flash (default: server default)
 - `review_type`: full|security|performance|quick
 - `focus_on`: Specific aspects to focus on
 - `standards`: Coding standards to enforce
@@ -607,12 +622,13 @@ All tools that work with files support **both individual files and entire direct
 - `thinking_mode`: minimal|low|medium|high|max (default: medium)
 
 ```
-"Use gemini to review the entire api/ directory for security issues"
+"Use pro to review the entire api/ directory for security issues"
-"Get gemini to review src/ with focus on performance, only show critical issues"
+"Use flash to quickly review src/ with focus on performance, only show critical issues"
 ```
 
 **`debug`** - Debug with file context
 - `error_description`: Description of the issue (required)
+- `model`: pro|flash (default: server default)
 - `error_context`: Stack trace or logs
 - `files`: Files or directories related to the issue
 - `runtime_info`: Environment details
@@ -626,6 +642,7 @@ All tools that work with files support **both individual files and entire direct
 
 **`thinkdeep`** - Extended analysis with file context
 - `current_analysis`: Your current thinking (required)
+- `model`: pro|flash (default: server default)
 - `problem_context`: Additional context
 - `focus_areas`: Specific aspects to focus on
 - `files`: Files or directories for context
@@ -867,7 +884,31 @@ This enables better integration, error handling, and support for the dynamic con
 The server includes several configurable properties that control its behavior:
 
 ### Model Configuration
-- **`GEMINI_MODEL`**: `"gemini-2.5-pro-preview-06-05"` - The latest Gemini 2.5 Pro model with native thinking support
+
+**Default Model (Environment Variable):**
+- **`DEFAULT_MODEL`**: Set your preferred default model globally
+  - Default: `"gemini-2.5-pro-preview-06-05"` (extended thinking capabilities)
+  - Alternative: `"gemini-2.0-flash-exp"` (faster responses)
+
+**Per-Tool Model Selection:**
+All tools support a `model` parameter for flexible model switching:
+- **`"pro"`** → Gemini 2.5 Pro (extended thinking, slower, higher quality)
+- **`"flash"`** → Gemini 2.0 Flash (faster responses, lower cost)
+- **Full model names** → Direct model specification
+
+**Examples:**
+```env
+# Set default globally in .env file
+DEFAULT_MODEL=flash
+```
+
+```
+# Per-tool usage in Claude
+"Use flash to quickly analyze this function"
+"Use pro for deep architectural analysis"
+```
+
+**Token Limits:**
 - **`MAX_CONTEXT_TOKENS`**: `1,000,000` - Maximum input context (1M tokens for Gemini 2.5 Pro)
 
 ### Temperature Defaults
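
For readers skimming the diff: the new `model` parameter accepts the shorthands `pro` and `flash` as well as full model names, falling back to the server default when omitted. A sketch of the alias resolution this implies (the mapping values are taken from the README above; the helper itself is illustrative, not the server's actual code):

```python
# Shorthands documented in the README; full names pass through unchanged.
MODEL_ALIASES = {
    "pro": "gemini-2.5-pro-preview-06-05",  # extended thinking, higher quality
    "flash": "gemini-2.0-flash-exp",        # faster responses, lower cost
}


def resolve_model(requested: str | None, default: str) -> str:
    """Hypothetical resolver: shorthand -> full name; None -> DEFAULT_MODEL."""
    if not requested:
        return default
    return MODEL_ALIASES.get(requested, requested)
```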

communication_simulator_test.py (new file, 463 lines)

@@ -0,0 +1,463 @@
+#!/usr/bin/env python3
+"""
+Communication Simulator Test for Gemini MCP Server
+
+This script provides comprehensive end-to-end testing of the Gemini MCP server
+by simulating real Claude CLI communications and validating conversation
+continuity, file handling, deduplication features, and clarification scenarios.
+
+Test Flow:
+1. Setup fresh Docker environment with clean containers
+2. Load and run individual test modules
+3. Validate system behavior through logs and Redis
+4. Cleanup and report results
+
+Usage:
+    python communication_simulator_test.py [--verbose] [--keep-logs] [--tests TEST_NAME...] [--individual TEST_NAME] [--skip-docker]
+
+    --tests: Run specific tests only (space-separated)
+    --list-tests: List all available tests
+    --individual: Run a single test individually
+    --skip-docker: Skip Docker setup (assumes containers are already running)
+
+Available tests:
+    basic_conversation - Basic conversation flow with chat tool
+    per_tool_deduplication - File deduplication for individual tools
+    cross_tool_continuation - Cross-tool conversation continuation scenarios
+    content_validation - Content validation and duplicate detection
+    logs_validation - Docker logs validation
+    redis_validation - Redis conversation memory validation
+
+Examples:
+    # Run all tests
+    python communication_simulator_test.py
+
+    # Run only basic conversation and content validation tests
+    python communication_simulator_test.py --tests basic_conversation content_validation
+
+    # Run a single test individually (with full Docker setup)
+    python communication_simulator_test.py --individual content_validation
+
+    # Run a single test individually (assuming Docker is already running)
+    python communication_simulator_test.py --individual content_validation --skip-docker
+
+    # List available tests
+    python communication_simulator_test.py --list-tests
+"""
+
+import argparse
+import logging
+import os
+import shutil
+import subprocess
+import sys
+import tempfile
+import time
+
+
+class CommunicationSimulator:
+    """Simulates real-world Claude CLI communication with MCP Gemini server"""
+
+    def __init__(self, verbose: bool = False, keep_logs: bool = False, selected_tests: list[str] = None):
+        self.verbose = verbose
+        self.keep_logs = keep_logs
+        self.selected_tests = selected_tests or []
+        self.temp_dir = None
+        self.container_name = "gemini-mcp-server"
+        self.redis_container = "gemini-mcp-redis"
+
+        # Import test registry
+        from simulator_tests import TEST_REGISTRY
+
+        self.test_registry = TEST_REGISTRY
+
+        # Available test methods mapping
+        self.available_tests = {
+            name: self._create_test_runner(test_class) for name, test_class in self.test_registry.items()
+        }
+
+        # Test result tracking
+        self.test_results = dict.fromkeys(self.test_registry.keys(), False)
+
+        # Configure logging
+        log_level = logging.DEBUG if verbose else logging.INFO
+        logging.basicConfig(level=log_level, format="%(asctime)s - %(levelname)s - %(message)s")
+        self.logger = logging.getLogger(__name__)
+
+    def _create_test_runner(self, test_class):
+        """Create a test runner function for a test class"""
+
+        def run_test():
+            test_instance = test_class(verbose=self.verbose)
+            result = test_instance.run_test()
+            # Update results
+            test_name = test_instance.test_name
+            self.test_results[test_name] = result
+            return result
+
+        return run_test
+
+    def setup_test_environment(self) -> bool:
+        """Setup fresh Docker environment"""
+        try:
+            self.logger.info("🚀 Setting up test environment...")
+
+            # Create temporary directory for test files
+            self.temp_dir = tempfile.mkdtemp(prefix="mcp_test_")
+            self.logger.debug(f"Created temp directory: {self.temp_dir}")
+
+            # Setup Docker environment
+            return self._setup_docker()
+
+        except Exception as e:
+            self.logger.error(f"Failed to setup test environment: {e}")
+            return False
+
+    def _setup_docker(self) -> bool:
+        """Setup fresh Docker environment"""
+        try:
+            self.logger.info("🐳 Setting up Docker environment...")
+
+            # Stop and remove existing containers
+            self._run_command(["docker", "compose", "down", "--remove-orphans"], check=False, capture_output=True)
+
+            # Clean up any old containers/images
+            old_containers = [self.container_name, self.redis_container]
+            for container in old_containers:
+                self._run_command(["docker", "stop", container], check=False, capture_output=True)
+                self._run_command(["docker", "rm", container], check=False, capture_output=True)
+
+            # Build and start services
+            self.logger.info("📦 Building Docker images...")
+            result = self._run_command(["docker", "compose", "build", "--no-cache"], capture_output=True)
+            if result.returncode != 0:
+                self.logger.error(f"Docker build failed: {result.stderr}")
+                return False
+
+            self.logger.info("🚀 Starting Docker services...")
+            result = self._run_command(["docker", "compose", "up", "-d"], capture_output=True)
+            if result.returncode != 0:
+                self.logger.error(f"Docker startup failed: {result.stderr}")
+                return False
+
+            # Wait for services to be ready
+            self.logger.info("⏳ Waiting for services to be ready...")
+            time.sleep(10)  # Give services time to initialize
+
+            # Verify containers are running
+            if not self._verify_containers():
+                return False
+
+            self.logger.info("✅ Docker environment ready")
+            return True
+
+        except Exception as e:
+            self.logger.error(f"Docker setup failed: {e}")
+            return False
+
+    def _verify_containers(self) -> bool:
+        """Verify that required containers are running"""
+        try:
+            result = self._run_command(["docker", "ps", "--format", "{{.Names}}"], capture_output=True)
+            running_containers = result.stdout.decode().strip().split("\n")
+
+            required = [self.container_name, self.redis_container]
+            for container in required:
+                if container not in running_containers:
+                    self.logger.error(f"Container not running: {container}")
+                    return False
+
+            self.logger.debug(f"Verified containers running: {required}")
+            return True
+
+        except Exception as e:
+            self.logger.error(f"Container verification failed: {e}")
+            return False
+
+    def simulate_claude_cli_session(self) -> bool:
+        """Simulate a complete Claude CLI session with conversation continuity"""
+        try:
+            self.logger.info("🤖 Starting Claude CLI simulation...")
+
+            # If specific tests are selected, run only those
+            if self.selected_tests:
+                return self._run_selected_tests()
+
+            # Otherwise run all tests in order
+            test_sequence = list(self.test_registry.keys())
+
+            for test_name in test_sequence:
+                if not self._run_single_test(test_name):
+                    return False
+
+            self.logger.info("✅ All tests passed")
+            return True
+
+        except Exception as e:
+            self.logger.error(f"Claude CLI simulation failed: {e}")
+            return False
+
+    def _run_selected_tests(self) -> bool:
+        """Run only the selected tests"""
+        try:
+            self.logger.info(f"🎯 Running selected tests: {', '.join(self.selected_tests)}")
+
+            for test_name in self.selected_tests:
+                if not self._run_single_test(test_name):
+                    return False
+
+            self.logger.info("✅ All selected tests passed")
+            return True
+
+        except Exception as e:
+            self.logger.error(f"Selected tests failed: {e}")
+            return False
+
+    def _run_single_test(self, test_name: str) -> bool:
+        """Run a single test by name"""
+        try:
+            if test_name not in self.available_tests:
+                self.logger.error(f"Unknown test: {test_name}")
+                self.logger.info(f"Available tests: {', '.join(self.available_tests.keys())}")
+                return False
+
+            self.logger.info(f"🧪 Running test: {test_name}")
+            test_function = self.available_tests[test_name]
+            result = test_function()
+
+            if result:
+                self.logger.info(f"✅ Test {test_name} passed")
+            else:
+                self.logger.error(f"❌ Test {test_name} failed")
+
+            return result
+
+        except Exception as e:
+            self.logger.error(f"Test {test_name} failed with exception: {e}")
+            return False
+
+    def run_individual_test(self, test_name: str, skip_docker_setup: bool = False) -> bool:
+        """Run a single test individually with optional Docker setup skip"""
+        try:
+            if test_name not in self.available_tests:
+                self.logger.error(f"Unknown test: {test_name}")
+                self.logger.info(f"Available tests: {', '.join(self.available_tests.keys())}")
+                return False
+
+            self.logger.info(f"🧪 Running individual test: {test_name}")
+
+            # Setup environment unless skipped
+            if not skip_docker_setup:
+                if not self.setup_test_environment():
+                    self.logger.error("❌ Environment setup failed")
+                    return False
+
+            # Run the single test
+            test_function = self.available_tests[test_name]
+            result = test_function()
+
+            if result:
+                self.logger.info(f"✅ Individual test {test_name} passed")
+            else:
+                self.logger.error(f"❌ Individual test {test_name} failed")
+
+            return result
+
+        except Exception as e:
+            self.logger.error(f"Individual test {test_name} failed with exception: {e}")
+            return False
+        finally:
+            if not skip_docker_setup and not self.keep_logs:
+                self.cleanup()
+
+    def get_available_tests(self) -> dict[str, str]:
+        """Get available tests with descriptions"""
+        descriptions = {}
+        for name, test_class in self.test_registry.items():
+            # Create temporary instance to get description
+            temp_instance = test_class(verbose=False)
+            descriptions[name] = temp_instance.test_description
+        return descriptions
+
+    def print_test_summary(self):
+        """Print comprehensive test results summary"""
+        print("\n" + "=" * 70)
+        print("🧪 GEMINI MCP COMMUNICATION SIMULATOR - TEST RESULTS SUMMARY")
+        print("=" * 70)
+
+        passed_count = sum(1 for result in self.test_results.values() if result)
+        total_count = len(self.test_results)
+
+        for test_name, result in self.test_results.items():
+            status = "✅ PASS" if result else "❌ FAIL"
+            # Get test description
+            temp_instance = self.test_registry[test_name](verbose=False)
+            description = temp_instance.test_description
+            print(f"📝 {description}: {status}")
+
+        print(f"\n🎯 OVERALL RESULT: {'🎉 SUCCESS' if passed_count == total_count else '❌ FAILURE'}")
+        print(f"✅ {passed_count}/{total_count} tests passed")
+        print("=" * 70)
+        return passed_count == total_count
+
+    def run_full_test_suite(self, skip_docker_setup: bool = False) -> bool:
+        """Run the complete test suite"""
+        try:
+            self.logger.info("🚀 Starting Gemini MCP Communication Simulator Test Suite")
+
+            # Setup
+            if not skip_docker_setup:
+                if not self.setup_test_environment():
+                    self.logger.error("❌ Environment setup failed")
+                    return False
+            else:
+                self.logger.info("⏩ Skipping Docker setup (containers assumed running)")
+
+            # Main simulation
+            if not self.simulate_claude_cli_session():
+                self.logger.error("❌ Claude CLI simulation failed")
+                return False
+
+            # Print comprehensive summary
+            overall_success = self.print_test_summary()
+
+            return overall_success
+
+        except Exception as e:
+            self.logger.error(f"Test suite failed: {e}")
+            return False
+        finally:
+            if not self.keep_logs and not skip_docker_setup:
+                self.cleanup()
+
+    def cleanup(self):
+        """Cleanup test environment"""
+        try:
+            self.logger.info("🧹 Cleaning up test environment...")
+
+            if not self.keep_logs:
+                # Stop Docker services
+                self._run_command(["docker", "compose", "down", "--remove-orphans"], check=False, capture_output=True)
+            else:
+                self.logger.info("📋 Keeping Docker services running for log inspection")
+
+            # Remove temp directory
+            if self.temp_dir and os.path.exists(self.temp_dir):
+                shutil.rmtree(self.temp_dir)
+                self.logger.debug(f"Removed temp directory: {self.temp_dir}")
+
+        except Exception as e:
+            self.logger.error(f"Cleanup failed: {e}")
+
+    def _run_command(self, cmd: list[str], check: bool = True, capture_output: bool = False, **kwargs):
+        """Run a shell command with logging"""
+        if self.verbose:
+            self.logger.debug(f"Running: {' '.join(cmd)}")
+
+        return subprocess.run(cmd, check=check, capture_output=capture_output, **kwargs)
+
+
+def parse_arguments():
+    """Parse and validate command line arguments"""
+    parser = argparse.ArgumentParser(description="Gemini MCP Communication Simulator Test")
+    parser.add_argument("--verbose", "-v", action="store_true", help="Enable verbose logging")
+    parser.add_argument("--keep-logs", action="store_true", help="Keep Docker services running for log inspection")
+    parser.add_argument("--tests", "-t", nargs="+", help="Specific tests to run (space-separated)")
+    parser.add_argument("--list-tests", action="store_true", help="List available tests and exit")
+    parser.add_argument("--individual", "-i", help="Run a single test individually")
+    parser.add_argument(
+        "--skip-docker",
+        action="store_true",
+        default=True,
+        help="Skip Docker setup (assumes containers are already running) - DEFAULT",
+    )
+    parser.add_argument(
+        "--rebuild-docker", action="store_true", help="Force rebuild Docker environment (overrides --skip-docker)"
+    )
+
+    return parser.parse_args()
+
+
+def list_available_tests():
+    """List all available tests and exit"""
+    simulator = CommunicationSimulator()
+    print("Available tests:")
+    for test_name, description in simulator.get_available_tests().items():
+        print(f" {test_name:<25} - {description}")
+
+
+def run_individual_test(simulator, test_name, skip_docker):
+    """Run a single test individually"""
+    try:
+        success = simulator.run_individual_test(test_name, skip_docker_setup=skip_docker)
+
+        if success:
+            print(f"\n🎉 INDIVIDUAL TEST {test_name.upper()}: PASSED")
+            return 0
+        else:
+            print(f"\n❌ INDIVIDUAL TEST {test_name.upper()}: FAILED")
+            return 1
+
+    except KeyboardInterrupt:
+        print(f"\n🛑 Individual test {test_name} interrupted by user")
+        if not skip_docker:
+            simulator.cleanup()
+        return 130
+    except Exception as e:
+        print(f"\n💥 Individual test {test_name} failed with error: {e}")
+        if not skip_docker:
+            simulator.cleanup()
+        return 1
+
+
+def run_test_suite(simulator, skip_docker=False):
+    """Run the full test suite or selected tests"""
+    try:
+        success = simulator.run_full_test_suite(skip_docker_setup=skip_docker)
+
+        if success:
+            print("\n🎉 COMPREHENSIVE MCP COMMUNICATION TEST: PASSED")
+            return 0
+        else:
+            print("\n❌ COMPREHENSIVE MCP COMMUNICATION TEST: FAILED")
+            print("⚠️ Check detailed results above")
+            return 1
+
+    except KeyboardInterrupt:
+        print("\n🛑 Test interrupted by user")
+        if not skip_docker:
+            simulator.cleanup()
+        return 130
+    except Exception as e:
+        print(f"\n💥 Unexpected error: {e}")
+        if not skip_docker:
+            simulator.cleanup()
+        return 1
+
+
+def main():
+    """Main entry point"""
+    args = parse_arguments()
+
+    # Handle list tests request
+    if args.list_tests:
+        list_available_tests()
+        return
+
+    # Initialize simulator consistently for all use cases
+    simulator = CommunicationSimulator(verbose=args.verbose, keep_logs=args.keep_logs, selected_tests=args.tests)
+
+    # Determine execution mode and run
+    # Override skip_docker if rebuild_docker is specified
+    skip_docker = args.skip_docker and not args.rebuild_docker
+
+    if args.individual:
+        exit_code = run_individual_test(simulator, args.individual, skip_docker)
+    else:
+        exit_code = run_test_suite(simulator, skip_docker)
+
+    sys.exit(exit_code)
+
+
+if __name__ == "__main__":
+    main()
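
Beyond the CLI entry point, the class can also be driven programmatically. A short usage sketch based on the constructor and methods shown above, assuming the Docker containers are already running:

```python
from communication_simulator_test import CommunicationSimulator

# Run two selected tests against an already-running Docker environment.
sim = CommunicationSimulator(verbose=True, selected_tests=["basic_conversation", "content_validation"])
ok = sim.run_full_test_suite(skip_docker_setup=True)
print("suite passed" if ok else "suite failed")
```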

config.py (27 lines changed)

@@ -13,21 +13,23 @@ import os
 # Version and metadata
 # These values are used in server responses and for tracking releases
 # IMPORTANT: This is the single source of truth for version and author info
-# setup.py imports these values to avoid duplication
-__version__ = "3.2.0"  # Semantic versioning: MAJOR.MINOR.PATCH
-__updated__ = "2025-06-10"  # Last update date in ISO format
+__version__ = "3.3.0"  # Semantic versioning: MAJOR.MINOR.PATCH
+__updated__ = "2025-06-11"  # Last update date in ISO format
 __author__ = "Fahad Gilani"  # Primary maintainer
 
 # Model configuration
-# GEMINI_MODEL: The Gemini model used for all AI operations
+# DEFAULT_MODEL: The default model used for all AI operations
 # This should be a stable, high-performance model suitable for code analysis
-GEMINI_MODEL = "gemini-2.5-pro-preview-06-05"
+# Can be overridden by setting DEFAULT_MODEL environment variable
+DEFAULT_MODEL = os.getenv("DEFAULT_MODEL", "gemini-2.5-pro-preview-06-05")
 
-# MAX_CONTEXT_TOKENS: Maximum number of tokens that can be included in a single request
-# This limit includes both the prompt and expected response
-# Gemini Pro models support up to 1M tokens, but practical usage should reserve
-# space for the model's response (typically 50K-100K tokens reserved)
-MAX_CONTEXT_TOKENS = 1_000_000  # 1M tokens for Gemini Pro
+# Token allocation for Gemini Pro (1M total capacity)
+# MAX_CONTEXT_TOKENS: Total model capacity
+# MAX_CONTENT_TOKENS: Available for prompts, conversation history, and files
+# RESPONSE_RESERVE_TOKENS: Reserved for model response generation
+MAX_CONTEXT_TOKENS = 1_000_000  # 1M tokens total capacity for Gemini Pro
+MAX_CONTENT_TOKENS = 800_000  # 800K tokens for content (prompts + files + history)
+RESPONSE_RESERVE_TOKENS = 200_000  # 200K tokens reserved for response generation
 
 # Temperature defaults for different tool types
 # Temperature controls the randomness/creativity of model responses
@@ -46,6 +48,11 @@ TEMPERATURE_BALANCED = 0.5  # For general chat
 # Used when brainstorming, exploring alternatives, or architectural discussions
 TEMPERATURE_CREATIVE = 0.7  # For architecture, deep thinking
 
+# Thinking Mode Defaults
+# DEFAULT_THINKING_MODE_THINKDEEP: Default thinking depth for extended reasoning tool
+# Higher modes use more computational budget but provide deeper analysis
+DEFAULT_THINKING_MODE_THINKDEEP = os.getenv("DEFAULT_THINKING_MODE_THINKDEEP", "high")
+
 # MCP Protocol Limits
 # MCP_PROMPT_SIZE_LIMIT: Maximum character size for prompts sent directly through MCP
 # The MCP protocol has a combined request+response limit of ~25K tokens.
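
The three token constants encode a simple invariant: the content budget plus the response reserve equals the total model capacity. A worked check of the split, using the values above:

```python
MAX_CONTEXT_TOKENS = 1_000_000      # total model capacity
MAX_CONTENT_TOKENS = 800_000        # prompts + files + conversation history
RESPONSE_RESERVE_TOKENS = 200_000   # held back for the model's reply

assert MAX_CONTENT_TOKENS + RESPONSE_RESERVE_TOKENS == MAX_CONTEXT_TOKENS

# Example: a thread whose history already consumes 150K tokens leaves
# 650K tokens for new files and prompts in the next request.
conversation_tokens = 150_000
remaining = max(0, MAX_CONTENT_TOKENS - conversation_tokens)
assert remaining == 650_000
```

This is the same arithmetic `reconstruct_thread_context` in server.py performs when it computes `_remaining_tokens` (see that diff below).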

docker-compose.yml

@@ -7,7 +7,7 @@ services:
       - "6379:6379"
     volumes:
       - redis_data:/data
-    command: redis-server --save 60 1 --loglevel warning --maxmemory 512mb --maxmemory-policy allkeys-lru
+    command: redis-server --save 60 1 --loglevel warning --maxmemory 64mb --maxmemory-policy allkeys-lru
     healthcheck:
       test: ["CMD", "redis-cli", "ping"]
       interval: 30s
@@ -29,7 +29,9 @@ services:
       redis:
         condition: service_healthy
     environment:
-      - GEMINI_API_KEY=${GEMINI_API_KEY}
+      - GEMINI_API_KEY=${GEMINI_API_KEY:?GEMINI_API_KEY is required. Please set it in your .env file or environment.}
+      - DEFAULT_MODEL=${DEFAULT_MODEL:-gemini-2.5-pro-preview-06-05}
+      - DEFAULT_THINKING_MODE_THINKDEEP=${DEFAULT_THINKING_MODE_THINKDEEP:-high}
       - REDIS_URL=redis://redis:6379/0
     # Use HOME not PWD: Claude needs access to any absolute file path, not just current project,
     # and Claude Code could be running from multiple locations at the same time
@@ -39,6 +41,8 @@ services:
     volumes:
       - ${HOME:-/tmp}:/workspace:ro
      - mcp_logs:/tmp  # Shared volume for logs
+      - /etc/localtime:/etc/localtime:ro
+      - /etc/timezone:/etc/timezone:ro
     stdin_open: true
     tty: true
     entrypoint: ["python"]
@@ -55,6 +59,8 @@ services:
       - PYTHONUNBUFFERED=1
     volumes:
       - mcp_logs:/tmp  # Shared volume for logs
+      - /etc/localtime:/etc/localtime:ro
+      - /etc/timezone:/etc/timezone:ro
     entrypoint: ["python"]
     command: ["log_monitor.py"]
 
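
The compose change leans on two standard variable-interpolation forms: `${VAR:?message}` aborts container startup with the message when `VAR` is unset or empty, while `${VAR:-default}` substitutes a fallback value. For readers unfamiliar with the syntax, a Python sketch of the same fail-fast/fallback behaviour (the `required` helper is illustrative only, not project code):

```python
import os


def required(name: str) -> str:
    """Mimic compose's ${VAR:?message}: fail fast when the variable is missing."""
    value = os.getenv(name)
    if not value:
        raise RuntimeError(f"{name} is required. Please set it in your .env file or environment.")
    return value


GEMINI_API_KEY = required("GEMINI_API_KEY")
# Mimic ${DEFAULT_MODEL:-gemini-2.5-pro-preview-06-05}
DEFAULT_MODEL = os.getenv("DEFAULT_MODEL") or "gemini-2.5-pro-preview-06-05"
```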

server.py (86 lines changed)

@@ -22,6 +22,7 @@ import asyncio
 import logging
 import os
 import sys
+import time
 from datetime import datetime
 from typing import Any
 
@@ -31,7 +32,7 @@ from mcp.server.stdio import stdio_server
 from mcp.types import ServerCapabilities, TextContent, Tool, ToolsCapability
 
 from config import (
-    GEMINI_MODEL,
+    DEFAULT_MODEL,
     MAX_CONTEXT_TOKENS,
     __author__,
     __updated__,
@@ -51,6 +52,21 @@ from tools.models import ToolOutput
 # Can be controlled via LOG_LEVEL environment variable (DEBUG, INFO, WARNING, ERROR)
 log_level = os.getenv("LOG_LEVEL", "INFO").upper()
 
+# Create timezone-aware formatter
+
+
+class LocalTimeFormatter(logging.Formatter):
+    def formatTime(self, record, datefmt=None):
+        """Override to use local timezone instead of UTC"""
+        ct = self.converter(record.created)
+        if datefmt:
+            s = time.strftime(datefmt, ct)
+        else:
+            t = time.strftime("%Y-%m-%d %H:%M:%S", ct)
+            s = f"{t},{record.msecs:03.0f}"
+        return s
+
+
 # Configure both console and file logging
 log_format = "%(asctime)s - %(name)s - %(levelname)s - %(message)s"
 logging.basicConfig(
@@ -60,18 +76,22 @@ logging.basicConfig(
     stream=sys.stderr,  # Use stderr to avoid interfering with MCP stdin/stdout protocol
 )
 
+# Apply local time formatter to root logger
+for handler in logging.getLogger().handlers:
+    handler.setFormatter(LocalTimeFormatter(log_format))
+
 # Add file handler for Docker log monitoring
 try:
     file_handler = logging.FileHandler("/tmp/mcp_server.log")
     file_handler.setLevel(getattr(logging, log_level, logging.INFO))
-    file_handler.setFormatter(logging.Formatter(log_format))
+    file_handler.setFormatter(LocalTimeFormatter(log_format))
     logging.getLogger().addHandler(file_handler)
 
     # Create a special logger for MCP activity tracking
     mcp_logger = logging.getLogger("mcp_activity")
     mcp_file_handler = logging.FileHandler("/tmp/mcp_activity.log")
     mcp_file_handler.setLevel(logging.INFO)
-    mcp_file_handler.setFormatter(logging.Formatter("%(asctime)s - %(message)s"))
+    mcp_file_handler.setFormatter(LocalTimeFormatter("%(asctime)s - %(message)s"))
     mcp_logger.addHandler(mcp_file_handler)
     mcp_logger.setLevel(logging.INFO)
 
@@ -196,6 +216,10 @@ async def handle_call_tool(name: str, arguments: dict[str, Any]) -> list[TextCon
     if "continuation_id" in arguments and arguments["continuation_id"]:
         continuation_id = arguments["continuation_id"]
         logger.debug(f"Resuming conversation thread: {continuation_id}")
+        logger.debug(
+            f"[CONVERSATION_DEBUG] Tool '{name}' resuming thread {continuation_id} with {len(arguments)} arguments"
+        )
+        logger.debug(f"[CONVERSATION_DEBUG] Original arguments keys: {list(arguments.keys())}")
 
         # Log to activity file for monitoring
         try:
@@ -205,6 +229,9 @@ async def handle_call_tool(name: str, arguments: dict[str, Any]) -> list[TextCon
             pass
 
         arguments = await reconstruct_thread_context(arguments)
+        logger.debug(f"[CONVERSATION_DEBUG] After thread reconstruction, arguments keys: {list(arguments.keys())}")
+        if "_remaining_tokens" in arguments:
+            logger.debug(f"[CONVERSATION_DEBUG] Remaining token budget: {arguments['_remaining_tokens']:,}")
 
     # Route to AI-powered tools that require Gemini API calls
     if name in TOOLS:
@@ -300,9 +327,11 @@ async def reconstruct_thread_context(arguments: dict[str, Any]) -> dict[str, Any
     continuation_id = arguments["continuation_id"]
 
     # Get thread context from Redis
+    logger.debug(f"[CONVERSATION_DEBUG] Looking up thread {continuation_id} in Redis")
     context = get_thread(continuation_id)
     if not context:
         logger.warning(f"Thread not found: {continuation_id}")
+        logger.debug(f"[CONVERSATION_DEBUG] Thread {continuation_id} not found in Redis or expired")
 
         # Log to activity file for monitoring
         try:
@@ -324,15 +353,26 @@ async def reconstruct_thread_context(arguments: dict[str, Any]) -> dict[str, Any
     if user_prompt:
         # Capture files referenced in this turn
         user_files = arguments.get("files", [])
+        logger.debug(f"[CONVERSATION_DEBUG] Adding user turn to thread {continuation_id}")
+        logger.debug(f"[CONVERSATION_DEBUG] User prompt length: {len(user_prompt)} chars")
+        logger.debug(f"[CONVERSATION_DEBUG] User files: {user_files}")
         success = add_turn(continuation_id, "user", user_prompt, files=user_files)
         if not success:
             logger.warning(f"Failed to add user turn to thread {continuation_id}")
+            logger.debug("[CONVERSATION_DEBUG] Failed to add user turn - thread may be at turn limit or expired")
+        else:
+            logger.debug(f"[CONVERSATION_DEBUG] Successfully added user turn to thread {continuation_id}")
 
-    # Build conversation history
-    conversation_history = build_conversation_history(context)
+    # Build conversation history and track token usage
+    logger.debug(f"[CONVERSATION_DEBUG] Building conversation history for thread {continuation_id}")
+    logger.debug(f"[CONVERSATION_DEBUG] Thread has {len(context.turns)} turns, tool: {context.tool_name}")
+    conversation_history, conversation_tokens = build_conversation_history(context)
+    logger.debug(f"[CONVERSATION_DEBUG] Conversation history built: {conversation_tokens:,} tokens")
+    logger.debug(f"[CONVERSATION_DEBUG] Conversation history length: {len(conversation_history)} chars")
 
     # Add dynamic follow-up instructions based on turn count
     follow_up_instructions = get_follow_up_instructions(len(context.turns))
+    logger.debug(f"[CONVERSATION_DEBUG] Follow-up instructions added for turn {len(context.turns)}")
 
     # Merge original context with new prompt and follow-up instructions
     original_prompt = arguments.get("prompt", "")
@@ -343,17 +383,34 @@ async def reconstruct_thread_context(arguments: dict[str, Any]) -> dict[str, Any
     else:
         enhanced_prompt = f"{original_prompt}\n\n{follow_up_instructions}"
 
-    # Update arguments with enhanced context
+    # Update arguments with enhanced context and remaining token budget
     enhanced_arguments = arguments.copy()
     enhanced_arguments["prompt"] = enhanced_prompt
 
+    # Calculate remaining token budget for current request files/content
+    from config import MAX_CONTENT_TOKENS
+
+    remaining_tokens = MAX_CONTENT_TOKENS - conversation_tokens
+    enhanced_arguments["_remaining_tokens"] = max(0, remaining_tokens)  # Ensure non-negative
+    logger.debug("[CONVERSATION_DEBUG] Token budget calculation:")
+    logger.debug(f"[CONVERSATION_DEBUG] MAX_CONTENT_TOKENS: {MAX_CONTENT_TOKENS:,}")
+    logger.debug(f"[CONVERSATION_DEBUG] Conversation tokens: {conversation_tokens:,}")
+    logger.debug(f"[CONVERSATION_DEBUG] Remaining tokens: {remaining_tokens:,}")
+
     # Merge original context parameters (files, etc.) with new request
     if context.initial_context:
+        logger.debug(f"[CONVERSATION_DEBUG] Merging initial context with {len(context.initial_context)} parameters")
         for key, value in context.initial_context.items():
             if key not in enhanced_arguments and key not in ["temperature", "thinking_mode", "model"]:
                 enhanced_arguments[key] = value
+                logger.debug(f"[CONVERSATION_DEBUG] Merged initial context param: {key}")
 
     logger.info(f"Reconstructed context for thread {continuation_id} (turn {len(context.turns)})")
+    logger.debug(f"[CONVERSATION_DEBUG] Final enhanced arguments keys: {list(enhanced_arguments.keys())}")
+
+    # Debug log files in the enhanced arguments for file tracking
+    if "files" in enhanced_arguments:
+        logger.debug(f"[CONVERSATION_DEBUG] Final files in enhanced arguments: {enhanced_arguments['files']}")
 
     # Log to activity file for monitoring
     try:
@@ -378,12 +435,16 @@ async def handle_get_version() -> list[TextContent]:
     Returns:
         Formatted text with version and configuration details
     """
+    # Import thinking mode here to avoid circular imports
+    from config import DEFAULT_THINKING_MODE_THINKDEEP
+
     # Gather comprehensive server information
     version_info = {
         "version": __version__,
         "updated": __updated__,
         "author": __author__,
-        "gemini_model": GEMINI_MODEL,
+        "default_model": DEFAULT_MODEL,
+        "default_thinking_mode_thinkdeep": DEFAULT_THINKING_MODE_THINKDEEP,
         "max_context_tokens": f"{MAX_CONTEXT_TOKENS:,}",
         "python_version": f"{sys.version_info.major}.{sys.version_info.minor}.{sys.version_info.micro}",
         "server_started": datetime.now().isoformat(),
@@ -396,7 +457,8 @@ Updated: {__updated__}
 Author: {__author__}
 
 Configuration:
-- Gemini Model: {GEMINI_MODEL}
+- Default Model: {DEFAULT_MODEL}
+- Default Thinking Mode (ThinkDeep): {DEFAULT_THINKING_MODE_THINKDEEP}
 - Max Context: {MAX_CONTEXT_TOKENS:,} tokens
 - Python: {version_info["python_version"]}
 - Started: {version_info["server_started"]}
@@ -429,7 +491,13 @@ async def main():
     # Log startup message for Docker log monitoring
     logger.info("Gemini MCP Server starting up...")
     logger.info(f"Log level: {log_level}")
-    logger.info(f"Using model: {GEMINI_MODEL}")
+    logger.info(f"Using default model: {DEFAULT_MODEL}")
+
+    # Import here to avoid circular imports
+    from config import DEFAULT_THINKING_MODE_THINKDEEP
+
+    logger.info(f"Default thinking mode (ThinkDeep): {DEFAULT_THINKING_MODE_THINKDEEP}")
+
     logger.info(f"Available tools: {list(TOOLS.keys())}")
     logger.info("Server ready - waiting for tool requests...")
 
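
The new `LocalTimeFormatter` relies on `logging.Formatter.converter`, which defaults to `time.localtime`, so timestamps in both log files now render in the container's local timezone; the `/etc/localtime` and `/etc/timezone` mounts added in docker-compose make that timezone match the host's. A minimal standalone check of the formatter, reproducing the class from the diff above:

```python
import logging
import time


class LocalTimeFormatter(logging.Formatter):
    def formatTime(self, record, datefmt=None):
        """Override to use local timezone instead of UTC"""
        ct = self.converter(record.created)  # converter defaults to time.localtime
        if datefmt:
            return time.strftime(datefmt, ct)
        t = time.strftime("%Y-%m-%d %H:%M:%S", ct)
        return f"{t},{record.msecs:03.0f}"


handler = logging.StreamHandler()
handler.setFormatter(LocalTimeFormatter("%(asctime)s - %(message)s"))
logging.getLogger("demo").addHandler(handler)
logging.getLogger("demo").warning("timestamp rendered in local time")
```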

@@ -17,41 +17,34 @@ if [ -f .env ]; then
     echo "⚠️ .env file already exists! Updating if needed..."
     echo ""
 else
-    # Check if GEMINI_API_KEY is already set in environment
-    if [ -n "$GEMINI_API_KEY" ]; then
-        API_KEY_VALUE="$GEMINI_API_KEY"
-        echo "✅ Found existing GEMINI_API_KEY in environment"
-    else
-        API_KEY_VALUE="your-gemini-api-key-here"
+    # Copy from .env.example and customize
+    if [ ! -f .env.example ]; then
+        echo "❌ .env.example file not found! This file should exist in the project directory."
+        exit 1
     fi
 
-    # Create the .env file
-    cat > .env << EOF
-# Gemini MCP Server Docker Environment Configuration
-# Generated on $(date)
-
-# Your Gemini API key (get one from https://makersuite.google.com/app/apikey)
-# IMPORTANT: Replace this with your actual API key
-GEMINI_API_KEY=$API_KEY_VALUE
-
-# Redis configuration (automatically set for Docker Compose)
-REDIS_URL=redis://redis:6379/0
-
-# Workspace root - host path that maps to /workspace in container
-# This should be the host directory path that contains all files Claude might reference
-# We use $HOME (not $PWD) because Claude needs access to ANY absolute file path,
-# not just files within the current project directory. Additionally, Claude Code
-# could be running from multiple locations at the same time.
-WORKSPACE_ROOT=$HOME
-
-# Logging level (DEBUG, INFO, WARNING, ERROR)
-# DEBUG: Shows detailed operational messages, conversation threading, tool execution flow
-# INFO: Shows general operational messages (default)
-# WARNING: Shows only warnings and errors
-# ERROR: Shows only errors
-# Uncomment and change to DEBUG if you need detailed troubleshooting information
-LOG_LEVEL=INFO
-EOF
+    # Copy .env.example to .env
+    cp .env.example .env
+    echo "✅ Created .env from .env.example"
+
+    # Customize the API key if it's set in environment
+    if [ -n "$GEMINI_API_KEY" ]; then
+        # Replace the placeholder API key with the actual value
+        if command -v sed >/dev/null 2>&1; then
+            sed -i.bak "s/your_gemini_api_key_here/$GEMINI_API_KEY/" .env && rm .env.bak
+            echo "✅ Updated .env with existing GEMINI_API_KEY from environment"
+        else
+            echo "⚠️ Found GEMINI_API_KEY in environment, but sed not available. Please update .env manually."
+        fi
+    else
+        echo "⚠️ GEMINI_API_KEY not found in environment. Please edit .env and add your API key."
+    fi
+
+    # Update WORKSPACE_ROOT to use current user's home directory
+    if command -v sed >/dev/null 2>&1; then
+        sed -i.bak "s|WORKSPACE_ROOT=/Users/your-username|WORKSPACE_ROOT=$HOME|" .env && rm .env.bak
+        echo "✅ Updated WORKSPACE_ROOT to $HOME"
+    fi
+
     echo "✅ Created .env file with Redis configuration"
     echo ""
 fi

setup.py (deleted, 52 lines)

@@ -1,52 +0,0 @@
-"""
-Setup configuration for Gemini MCP Server
-"""
-
-from pathlib import Path
-
-from setuptools import setup
-
-# Import version and author from config to maintain single source of truth
-from config import __author__, __version__
-
-# Read README for long description
-readme_path = Path(__file__).parent / "README.md"
-long_description = ""
-if readme_path.exists():
-    long_description = readme_path.read_text(encoding="utf-8")
-
-setup(
-    name="gemini-mcp-server",
-    version=__version__,
-    description="Model Context Protocol server for Google Gemini",
-    long_description=long_description,
-    long_description_content_type="text/markdown",
-    author=__author__,
-    python_requires=">=3.10",
-    py_modules=["gemini_server"],
-    install_requires=[
-        "mcp>=1.0.0",
-        "google-genai>=1.19.0",
-        "pydantic>=2.0.0",
-    ],
-    extras_require={
-        "dev": [
-            "pytest>=7.4.0",
-            "pytest-asyncio>=0.21.0",
-            "pytest-mock>=3.11.0",
-        ]
-    },
-    entry_points={
-        "console_scripts": [
-            "gemini-mcp-server=gemini_server:main",
-        ],
-    },
-    classifiers=[
-        "Development Status :: 4 - Beta",
-        "Intended Audience :: Developers",
-        "Programming Language :: Python :: 3",
-        "Programming Language :: Python :: 3.10",
-        "Programming Language :: Python :: 3.11",
-        "Programming Language :: Python :: 3.12",
-    ],
-)
41  simulator_tests/__init__.py (new file)
@@ -0,0 +1,41 @@
"""
Communication Simulator Tests Package

This package contains individual test modules for the Gemini MCP Communication Simulator.
Each test is in its own file for better organization and maintainability.
"""

from .base_test import BaseSimulatorTest
from .test_basic_conversation import BasicConversationTest
from .test_content_validation import ContentValidationTest
from .test_cross_tool_comprehensive import CrossToolComprehensiveTest
from .test_cross_tool_continuation import CrossToolContinuationTest
from .test_logs_validation import LogsValidationTest
from .test_model_thinking_config import TestModelThinkingConfig
from .test_per_tool_deduplication import PerToolDeduplicationTest
from .test_redis_validation import RedisValidationTest

# Test registry for dynamic loading
TEST_REGISTRY = {
    "basic_conversation": BasicConversationTest,
    "content_validation": ContentValidationTest,
    "per_tool_deduplication": PerToolDeduplicationTest,
    "cross_tool_continuation": CrossToolContinuationTest,
    "cross_tool_comprehensive": CrossToolComprehensiveTest,
    "logs_validation": LogsValidationTest,
    "redis_validation": RedisValidationTest,
    "model_thinking_config": TestModelThinkingConfig,
}

__all__ = [
    "BaseSimulatorTest",
    "BasicConversationTest",
    "ContentValidationTest",
    "PerToolDeduplicationTest",
    "CrossToolContinuationTest",
    "CrossToolComprehensiveTest",
    "LogsValidationTest",
    "RedisValidationTest",
    "TestModelThinkingConfig",
    "TEST_REGISTRY",
]
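The registry keys map CLI-friendly names to test classes, so a runner can select tests dynamically by name. A minimal sketch of how such a runner might consume TEST_REGISTRY (the run_named_test helper is hypothetical, not part of the package):

# Hypothetical helper: resolve a registry name and execute the corresponding test.
from simulator_tests import TEST_REGISTRY

def run_named_test(name: str, verbose: bool = False) -> bool:
    test_cls = TEST_REGISTRY.get(name)
    if test_cls is None:
        raise ValueError(f"Unknown test {name!r}; available: {sorted(TEST_REGISTRY)}")
    return test_cls(verbose=verbose).run_test()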
266  simulator_tests/base_test.py (new file)
@@ -0,0 +1,266 @@
#!/usr/bin/env python3
"""
Base Test Class for Communication Simulator Tests

Provides common functionality and utilities for all simulator tests.
"""

import json
import logging
import os
import subprocess
from typing import Optional


class BaseSimulatorTest:
    """Base class for all communication simulator tests"""

    def __init__(self, verbose: bool = False):
        self.verbose = verbose
        self.test_files = {}
        self.test_dir = None
        self.container_name = "gemini-mcp-server"
        self.redis_container = "gemini-mcp-redis"

        # Configure logging
        log_level = logging.DEBUG if verbose else logging.INFO
        logging.basicConfig(level=log_level, format="%(asctime)s - %(levelname)s - %(message)s")
        self.logger = logging.getLogger(self.__class__.__name__)

    def setup_test_files(self):
        """Create test files for the simulation"""
        # Test Python file
        python_content = '''"""
Sample Python module for testing MCP conversation continuity
"""


def fibonacci(n):
    """Calculate fibonacci number recursively"""
    if n <= 1:
        return n
    return fibonacci(n-1) + fibonacci(n-2)


def factorial(n):
    """Calculate factorial iteratively"""
    result = 1
    for i in range(1, n + 1):
        result *= i
    return result


class Calculator:
    """Simple calculator class"""

    def __init__(self):
        self.history = []

    def add(self, a, b):
        result = a + b
        self.history.append(f"{a} + {b} = {result}")
        return result

    def multiply(self, a, b):
        result = a * b
        self.history.append(f"{a} * {b} = {result}")
        return result
'''

        # Test configuration file
        config_content = """{
    "database": {
        "host": "localhost",
        "port": 5432,
        "name": "testdb",
        "ssl": true
    },
    "cache": {
        "redis_url": "redis://localhost:6379",
        "ttl": 3600
    },
    "logging": {
        "level": "INFO",
        "format": "%(asctime)s - %(name)s - %(levelname)s - %(message)s"
    }
}"""

        # Create files in the current project directory
        current_dir = os.getcwd()
        self.test_dir = os.path.join(current_dir, "test_simulation_files")
        os.makedirs(self.test_dir, exist_ok=True)

        test_py = os.path.join(self.test_dir, "test_module.py")
        test_config = os.path.join(self.test_dir, "config.json")

        with open(test_py, "w") as f:
            f.write(python_content)
        with open(test_config, "w") as f:
            f.write(config_content)

        # Ensure absolute paths for MCP server compatibility
        self.test_files = {"python": os.path.abspath(test_py), "config": os.path.abspath(test_config)}
        self.logger.debug(f"Created test files with absolute paths: {list(self.test_files.values())}")

    def call_mcp_tool(self, tool_name: str, params: dict) -> tuple[Optional[str], Optional[str]]:
        """Call an MCP tool via Claude CLI (docker exec)"""
        try:
            # Prepare the MCP initialization and tool call sequence
            init_request = {
                "jsonrpc": "2.0",
                "id": 1,
                "method": "initialize",
                "params": {
                    "protocolVersion": "2024-11-05",
                    "capabilities": {"tools": {}},
                    "clientInfo": {"name": "communication-simulator", "version": "1.0.0"},
                },
            }

            # Send initialized notification
            initialized_notification = {"jsonrpc": "2.0", "method": "notifications/initialized"}

            # Prepare the tool call request
            tool_request = {
                "jsonrpc": "2.0",
                "id": 2,
                "method": "tools/call",
                "params": {"name": tool_name, "arguments": params},
            }

            # Combine all messages
            messages = [json.dumps(init_request), json.dumps(initialized_notification), json.dumps(tool_request)]

            # Join with newlines as MCP expects
            input_data = "\n".join(messages) + "\n"

            # Simulate Claude CLI calling the MCP server via docker exec
            docker_cmd = ["docker", "exec", "-i", self.container_name, "python", "server.py"]

            self.logger.debug(f"Calling MCP tool {tool_name} with proper initialization")

            # Execute the command
            result = subprocess.run(
                docker_cmd, input=input_data, text=True, capture_output=True, timeout=3600  # 1 hour timeout
            )

            if result.returncode != 0:
                self.logger.error(f"Docker exec failed: {result.stderr}")
                return None, None

            # Parse the response - look for the tool call response
            response_data = self._parse_mcp_response(result.stdout, expected_id=2)
            if not response_data:
                return None, None

            # Extract continuation_id if present
            continuation_id = self._extract_continuation_id(response_data)

            return response_data, continuation_id

        except subprocess.TimeoutExpired:
            self.logger.error(f"MCP tool call timed out after 1 hour: {tool_name}")
            return None, None
        except Exception as e:
            self.logger.error(f"MCP tool call failed: {e}")
            return None, None

    def _parse_mcp_response(self, stdout: str, expected_id: int = 2) -> Optional[str]:
        """Parse MCP JSON-RPC response from stdout"""
        try:
            lines = stdout.strip().split("\n")
            for line in lines:
                if line.strip() and line.startswith("{"):
                    response = json.loads(line)
                    # Look for the tool call response with the expected ID
                    if response.get("id") == expected_id and "result" in response:
                        # Extract the actual content from the response
                        result = response["result"]
                        # Handle new response format with 'content' array
                        if isinstance(result, dict) and "content" in result:
                            content_array = result["content"]
                            if isinstance(content_array, list) and len(content_array) > 0:
                                return content_array[0].get("text", "")
                        # Handle legacy format
                        elif isinstance(result, list) and len(result) > 0:
                            return result[0].get("text", "")
                    elif response.get("id") == expected_id and "error" in response:
                        self.logger.error(f"MCP error: {response['error']}")
                        return None

            # If we get here, log all responses for debugging
            self.logger.warning(f"No valid tool call response found for ID {expected_id}")
            self.logger.debug(f"Full stdout: {stdout}")
            return None

        except json.JSONDecodeError as e:
            self.logger.error(f"Failed to parse MCP response: {e}")
            self.logger.debug(f"Stdout that failed to parse: {stdout}")
            return None

    def _extract_continuation_id(self, response_text: str) -> Optional[str]:
        """Extract continuation_id from response metadata"""
        try:
            # Parse the response text as JSON to look for continuation metadata
            response_data = json.loads(response_text)

            # Look for continuation_id in various places
            if isinstance(response_data, dict):
                # Check metadata
                metadata = response_data.get("metadata", {})
                if "thread_id" in metadata:
                    return metadata["thread_id"]

                # Check follow_up_request
                follow_up = response_data.get("follow_up_request", {})
                if follow_up and "continuation_id" in follow_up:
                    return follow_up["continuation_id"]

                # Check continuation_offer
                continuation_offer = response_data.get("continuation_offer", {})
                if continuation_offer and "continuation_id" in continuation_offer:
                    return continuation_offer["continuation_id"]

            self.logger.debug(f"No continuation_id found in response: {response_data}")
            return None

        except json.JSONDecodeError as e:
            self.logger.debug(f"Failed to parse response for continuation_id: {e}")
            return None

    def run_command(self, cmd: list[str], check: bool = True, capture_output: bool = False, **kwargs):
        """Run a shell command with logging"""
        if self.verbose:
            self.logger.debug(f"Running: {' '.join(cmd)}")

        return subprocess.run(cmd, check=check, capture_output=capture_output, **kwargs)

    def create_additional_test_file(self, filename: str, content: str) -> str:
        """Create an additional test file for mixed scenario testing"""
        if not hasattr(self, "test_dir") or not self.test_dir:
            raise RuntimeError("Test directory not initialized. Call setup_test_files() first.")

        file_path = os.path.join(self.test_dir, filename)
        with open(file_path, "w") as f:
            f.write(content)
        # Return absolute path for MCP server compatibility
        return os.path.abspath(file_path)

    def cleanup_test_files(self):
        """Clean up test files"""
        if hasattr(self, "test_dir") and self.test_dir and os.path.exists(self.test_dir):
            import shutil

            shutil.rmtree(self.test_dir)
            self.logger.debug(f"Removed test files directory: {self.test_dir}")

    def run_test(self) -> bool:
        """Run the test - to be implemented by subclasses"""
        raise NotImplementedError("Subclasses must implement run_test()")

    @property
    def test_name(self) -> str:
        """Get the test name - to be implemented by subclasses"""
        raise NotImplementedError("Subclasses must implement test_name property")

    @property
    def test_description(self) -> str:
        """Get the test description - to be implemented by subclasses"""
        raise NotImplementedError("Subclasses must implement test_description property")
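For reference, call_mcp_tool above frames three newline-delimited JSON-RPC messages on the server's stdin: the initialize request (id 1), the initialized notification, then the tools/call request (id 2). A sketch of the resulting payload (the prompt value is illustrative):

import json

frames = [
    {"jsonrpc": "2.0", "id": 1, "method": "initialize",
     "params": {"protocolVersion": "2024-11-05", "capabilities": {"tools": {}},
                "clientInfo": {"name": "communication-simulator", "version": "1.0.0"}}},
    {"jsonrpc": "2.0", "method": "notifications/initialized"},
    {"jsonrpc": "2.0", "id": 2, "method": "tools/call",
     "params": {"name": "chat", "arguments": {"prompt": "Hello"}}},
]
# This string is what gets piped to `docker exec -i gemini-mcp-server python server.py`.
payload = "\n".join(json.dumps(f) for f in frames) + "\n"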
86  simulator_tests/test_basic_conversation.py (new file)
@@ -0,0 +1,86 @@
#!/usr/bin/env python3
"""
Basic Conversation Flow Test

Tests basic conversation continuity with the chat tool, including:
- Initial chat with file analysis
- Continuing conversation with same file (deduplication)
- Adding additional files to ongoing conversation
"""

from .base_test import BaseSimulatorTest


class BasicConversationTest(BaseSimulatorTest):
    """Test basic conversation flow with chat tool"""

    @property
    def test_name(self) -> str:
        return "basic_conversation"

    @property
    def test_description(self) -> str:
        return "Basic conversation flow with chat tool"

    def run_test(self) -> bool:
        """Test basic conversation flow with chat tool"""
        try:
            self.logger.info("📝 Test: Basic conversation flow")

            # Setup test files
            self.setup_test_files()

            # Initial chat tool call with file
            self.logger.info(" 1.1: Initial chat with file analysis")
            response1, continuation_id = self.call_mcp_tool(
                "chat",
                {
                    "prompt": "Please use low thinking mode. Analyze this Python code and explain what it does",
                    "files": [self.test_files["python"]],
                },
            )

            if not response1 or not continuation_id:
                self.logger.error("Failed to get initial response with continuation_id")
                return False

            self.logger.info(f" ✅ Got continuation_id: {continuation_id}")

            # Continue conversation with same file (should be deduplicated)
            self.logger.info(" 1.2: Continue conversation with same file")
            response2, _ = self.call_mcp_tool(
                "chat",
                {
                    "prompt": "Please use low thinking mode. Now focus on the Calculator class specifically. Are there any improvements you'd suggest?",
                    "files": [self.test_files["python"]],  # Same file - should be deduplicated
                    "continuation_id": continuation_id,
                },
            )

            if not response2:
                self.logger.error("Failed to continue conversation")
                return False

            # Continue with additional file
            self.logger.info(" 1.3: Continue conversation with additional file")
            response3, _ = self.call_mcp_tool(
                "chat",
                {
                    "prompt": "Please use low thinking mode. Now also analyze this configuration file and see how it might relate to the Python code",
                    "files": [self.test_files["python"], self.test_files["config"]],
                    "continuation_id": continuation_id,
                },
            )

            if not response3:
                self.logger.error("Failed to continue with additional file")
                return False

            self.logger.info(" ✅ Basic conversation flow working")
            return True

        except Exception as e:
            self.logger.error(f"Basic conversation flow test failed: {e}")
            return False
        finally:
            self.cleanup_test_files()
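The essential pattern this test exercises, as a standalone sketch (the driver code is hypothetical; the tool API is as used above): the first chat call yields a continuation_id, and passing it back threads later calls into the same conversation with previously sent files deduplicated.

test = BasicConversationTest(verbose=True)
test.setup_test_files()
try:
    # First call creates the conversation thread.
    reply, cont_id = test.call_mcp_tool(
        "chat", {"prompt": "Summarize this module", "files": [test.test_files["python"]]}
    )
    if cont_id:
        # Follow-up reuses the thread; the file does not need to be re-embedded.
        followup, _ = test.call_mcp_tool(
            "chat", {"prompt": "Any bugs?", "continuation_id": cont_id}
        )
finally:
    test.cleanup_test_files()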
197  simulator_tests/test_content_validation.py (new file)
@@ -0,0 +1,197 @@
#!/usr/bin/env python3
"""
Content Validation Test

Tests that tools don't duplicate file content in their responses.
This test is specifically designed to catch content duplication bugs.
"""

import json
import os

from .base_test import BaseSimulatorTest


class ContentValidationTest(BaseSimulatorTest):
    """Test that tools don't duplicate file content in their responses"""

    @property
    def test_name(self) -> str:
        return "content_validation"

    @property
    def test_description(self) -> str:
        return "Content validation and duplicate detection"

    def run_test(self) -> bool:
        """Test that tools don't duplicate file content in their responses"""
        try:
            self.logger.info("📄 Test: Content validation and duplicate detection")

            # Setup test files first
            self.setup_test_files()

            # Create a test file with distinctive content for validation
            validation_content = '''"""
Configuration file for content validation testing
This content should appear only ONCE in any tool response
"""

# Configuration constants
MAX_CONTENT_TOKENS = 800_000  # This line should appear exactly once
TEMPERATURE_ANALYTICAL = 0.2  # This should also appear exactly once
UNIQUE_VALIDATION_MARKER = "CONTENT_VALIDATION_TEST_12345"

# Database settings
DATABASE_CONFIG = {
    "host": "localhost",
    "port": 5432,
    "name": "validation_test_db"
}
'''

            validation_file = os.path.join(self.test_dir, "validation_config.py")
            with open(validation_file, "w") as f:
                f.write(validation_content)

            # Ensure absolute path for MCP server compatibility
            validation_file = os.path.abspath(validation_file)

            # Test 1: Precommit tool with files parameter (where the bug occurred)
            self.logger.info(" 1: Testing precommit tool content duplication")

            # Call precommit tool with the validation file
            response1, thread_id = self.call_mcp_tool(
                "precommit",
                {
                    "path": os.getcwd(),
                    "files": [validation_file],
                    "original_request": "Test for content duplication in precommit tool",
                },
            )

            if response1:
                # Parse response and check for content duplication
                try:
                    response_data = json.loads(response1)
                    content = response_data.get("content", "")

                    # Count occurrences of distinctive markers
                    max_content_count = content.count("MAX_CONTENT_TOKENS = 800_000")
                    temp_analytical_count = content.count("TEMPERATURE_ANALYTICAL = 0.2")
                    unique_marker_count = content.count("UNIQUE_VALIDATION_MARKER")

                    # Validate no duplication
                    duplication_detected = False
                    issues = []

                    if max_content_count > 1:
                        issues.append(f"MAX_CONTENT_TOKENS appears {max_content_count} times")
                        duplication_detected = True

                    if temp_analytical_count > 1:
                        issues.append(f"TEMPERATURE_ANALYTICAL appears {temp_analytical_count} times")
                        duplication_detected = True

                    if unique_marker_count > 1:
                        issues.append(f"UNIQUE_VALIDATION_MARKER appears {unique_marker_count} times")
                        duplication_detected = True

                    if duplication_detected:
                        self.logger.error(f" ❌ Content duplication detected in precommit tool: {'; '.join(issues)}")
                        return False
                    else:
                        self.logger.info(" ✅ No content duplication in precommit tool")

                except json.JSONDecodeError:
                    self.logger.warning(" ⚠️ Could not parse precommit response as JSON")

            else:
                self.logger.warning(" ⚠️ Precommit tool failed to respond")

            # Test 2: Other tools that use files parameter
            tools_to_test = [
                (
                    "chat",
                    {
                        "prompt": "Please use low thinking mode. Analyze this config file",
                        "files": [validation_file],
                    },  # Using absolute path
                ),
                (
                    "codereview",
                    {
                        "files": [validation_file],
                        "context": "Please use low thinking mode. Review this configuration",
                    },  # Using absolute path
                ),
                ("analyze", {"files": [validation_file], "analysis_type": "code_quality"}),  # Using absolute path
            ]

            for tool_name, params in tools_to_test:
                self.logger.info(f" 2.{tool_name}: Testing {tool_name} tool content duplication")

                response, _ = self.call_mcp_tool(tool_name, params)
                if response:
                    try:
                        response_data = json.loads(response)
                        content = response_data.get("content", "")

                        # Check for duplication
                        marker_count = content.count("UNIQUE_VALIDATION_MARKER")
                        if marker_count > 1:
                            self.logger.error(
                                f" ❌ Content duplication in {tool_name}: marker appears {marker_count} times"
                            )
                            return False
                        else:
                            self.logger.info(f" ✅ No content duplication in {tool_name}")

                    except json.JSONDecodeError:
                        self.logger.warning(f" ⚠️ Could not parse {tool_name} response")
                else:
                    self.logger.warning(f" ⚠️ {tool_name} tool failed to respond")

            # Test 3: Cross-tool content validation with file deduplication
            self.logger.info(" 3: Testing cross-tool content consistency")

            if thread_id:
                # Continue conversation with same file - content should be deduplicated in conversation history
                response2, _ = self.call_mcp_tool(
                    "chat",
                    {
                        "prompt": "Please use low thinking mode. Continue analyzing this configuration file",
                        "files": [validation_file],  # Same file should be deduplicated
                        "continuation_id": thread_id,
                    },
                )

                if response2:
                    try:
                        response_data = json.loads(response2)
                        content = response_data.get("content", "")

                        # In continuation, the file content shouldn't be duplicated either
                        marker_count = content.count("UNIQUE_VALIDATION_MARKER")
                        if marker_count > 1:
                            self.logger.error(
                                f" ❌ Content duplication in cross-tool continuation: marker appears {marker_count} times"
                            )
                            return False
                        else:
                            self.logger.info(" ✅ No content duplication in cross-tool continuation")

                    except json.JSONDecodeError:
                        self.logger.warning(" ⚠️ Could not parse continuation response")

            # Cleanup
            os.remove(validation_file)

            self.logger.info(" ✅ All content validation tests passed")
            return True

        except Exception as e:
            self.logger.error(f"Content validation test failed: {e}")
            return False
        finally:
            self.cleanup_test_files()
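The duplication check above reduces to counting a distinctive marker string in the response content; a tiny helper equivalent to the inline checks (the helper name is illustrative, not part of the test suite):

def appears_at_most_once(content: str, marker: str = "UNIQUE_VALIDATION_MARKER") -> bool:
    # A count above 1 means the file content was embedded more than once.
    return content.count(marker) <= 1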
306  simulator_tests/test_cross_tool_comprehensive.py (new file)
@@ -0,0 +1,306 @@
#!/usr/bin/env python3
"""
Comprehensive Cross-Tool Test

Tests file deduplication, conversation continuation, and file handling
across all available MCP tools using realistic workflows with low thinking mode.
Validates:
1. Cross-tool conversation continuation
2. File deduplication across different tools
3. Mixed file scenarios (old + new files)
4. Conversation history preservation
5. Proper tool chaining with context
"""

import subprocess

from .base_test import BaseSimulatorTest


class CrossToolComprehensiveTest(BaseSimulatorTest):
    """Comprehensive test across all MCP tools"""

    @property
    def test_name(self) -> str:
        return "cross_tool_comprehensive"

    @property
    def test_description(self) -> str:
        return "Comprehensive cross-tool file deduplication and continuation"

    def get_docker_logs_since(self, since_time: str) -> str:
        """Get docker logs since a specific timestamp"""
        try:
            # Check both main server and log monitor for comprehensive logs
            cmd_server = ["docker", "logs", "--since", since_time, self.container_name]
            cmd_monitor = ["docker", "logs", "--since", since_time, "gemini-mcp-log-monitor"]

            result_server = subprocess.run(cmd_server, capture_output=True, text=True)
            result_monitor = subprocess.run(cmd_monitor, capture_output=True, text=True)

            # Combine logs from both containers
            combined_logs = result_server.stdout + "\n" + result_monitor.stdout
            return combined_logs
        except Exception as e:
            self.logger.error(f"Failed to get docker logs: {e}")
            return ""

    def run_test(self) -> bool:
        """Comprehensive cross-tool test with all MCP tools"""
        try:
            self.logger.info("📄 Test: Comprehensive cross-tool file deduplication and continuation")

            # Setup test files
            self.setup_test_files()

            # Create short test files for quick testing
            python_code = """def login(user, pwd):
    # Security issue: plain text password
    if user == "admin" and pwd == "123":
        return True
    return False


def hash_pwd(pwd):
    # Weak hashing
    return str(hash(pwd))
"""

            config_file = """{
    "db_password": "weak123",
    "debug": true,
    "secret_key": "test"
}"""

            auth_file = self.create_additional_test_file("auth.py", python_code)
            config_file_path = self.create_additional_test_file("config.json", config_file)

            # Get timestamp for log filtering
            import datetime

            start_time = datetime.datetime.now().strftime("%Y-%m-%dT%H:%M:%S")

            # Tool chain: chat → analyze → debug → codereview → precommit
            # Each step builds on the previous with cross-tool continuation

            current_continuation_id = None
            responses = []

            # Step 1: Start with chat tool to understand the codebase
            self.logger.info(" Step 1: chat tool - Initial codebase exploration")
            chat_params = {
                "prompt": "Please give me a quick one line reply. I have an authentication module that needs review. Can you help me understand potential issues?",
                "files": [auth_file],
                "thinking_mode": "low",
            }

            response1, continuation_id1 = self.call_mcp_tool("chat", chat_params)
            if not response1 or not continuation_id1:
                self.logger.error(" ❌ Step 1: chat tool failed")
                return False

            self.logger.info(f" ✅ Step 1: chat completed with continuation_id: {continuation_id1[:8]}...")
            responses.append(("chat", response1, continuation_id1))
            current_continuation_id = continuation_id1

            # Step 2: Use analyze tool to do deeper analysis (fresh conversation)
            self.logger.info(" Step 2: analyze tool - Deep code analysis (fresh)")
            analyze_params = {
                "files": [auth_file],
                "question": "Please give me a quick one line reply. What are the security vulnerabilities and architectural issues in this authentication code?",
                "thinking_mode": "low",
            }

            response2, continuation_id2 = self.call_mcp_tool("analyze", analyze_params)
            if not response2:
                self.logger.error(" ❌ Step 2: analyze tool failed")
                return False

            self.logger.info(
                f" ✅ Step 2: analyze completed with continuation_id: {continuation_id2[:8] if continuation_id2 else 'None'}..."
            )
            responses.append(("analyze", response2, continuation_id2))

            # Step 3: Continue chat conversation with config file
            self.logger.info(" Step 3: chat continuation - Add config file context")
            chat_continue_params = {
                "continuation_id": current_continuation_id,
                "prompt": "Please give me a quick one line reply. I also have this configuration file. Can you analyze it alongside the authentication code?",
                "files": [auth_file, config_file_path],  # Old + new file
                "thinking_mode": "low",
            }

            response3, _ = self.call_mcp_tool("chat", chat_continue_params)
            if not response3:
                self.logger.error(" ❌ Step 3: chat continuation failed")
                return False

            self.logger.info(" ✅ Step 3: chat continuation completed")
            responses.append(("chat_continue", response3, current_continuation_id))

            # Step 4: Use debug tool to identify specific issues
            self.logger.info(" Step 4: debug tool - Identify specific problems")
            debug_params = {
                "files": [auth_file, config_file_path],
                "error_description": "Please give me a quick one line reply. The authentication system has security vulnerabilities. Help me identify and fix the main issues.",
                "thinking_mode": "low",
            }

            response4, continuation_id4 = self.call_mcp_tool("debug", debug_params)
            if not response4:
                self.logger.error(" ❌ Step 4: debug tool failed")
                return False

            self.logger.info(
                f" ✅ Step 4: debug completed with continuation_id: {continuation_id4[:8] if continuation_id4 else 'None'}..."
            )
            responses.append(("debug", response4, continuation_id4))

            # Step 5: Cross-tool continuation - continue debug with chat context
            if continuation_id4:
                self.logger.info(" Step 5: debug continuation - Additional analysis")
                debug_continue_params = {
                    "continuation_id": continuation_id4,
                    "files": [auth_file, config_file_path],
                    "error_description": "Please give me a quick one line reply. What specific code changes would you recommend to fix the password hashing vulnerability?",
                    "thinking_mode": "low",
                }

                response5, _ = self.call_mcp_tool("debug", debug_continue_params)
                if response5:
                    self.logger.info(" ✅ Step 5: debug continuation completed")
                    responses.append(("debug_continue", response5, continuation_id4))

            # Step 6: Use codereview for comprehensive review
            self.logger.info(" Step 6: codereview tool - Comprehensive code review")
            codereview_params = {
                "files": [auth_file, config_file_path],
                "context": "Please give me a quick one line reply. Comprehensive security-focused code review for production readiness",
                "thinking_mode": "low",
            }

            response6, continuation_id6 = self.call_mcp_tool("codereview", codereview_params)
            if not response6:
                self.logger.error(" ❌ Step 6: codereview tool failed")
                return False

            self.logger.info(
                f" ✅ Step 6: codereview completed with continuation_id: {continuation_id6[:8] if continuation_id6 else 'None'}..."
            )
            responses.append(("codereview", response6, continuation_id6))

            # Step 7: Create improved version and use precommit
            self.logger.info(" Step 7: precommit tool - Pre-commit validation")

            # Create a short improved version
            improved_code = """import hashlib


def secure_login(user, pwd):
    # Better: hashed password check
    hashed = hashlib.sha256(pwd.encode()).hexdigest()
    if user == "admin" and hashed == "expected_hash":
        return True
    return False
"""

            improved_file = self.create_additional_test_file("auth_improved.py", improved_code)

            precommit_params = {
                "path": self.test_dir,
                "files": [auth_file, config_file_path, improved_file],
                "original_request": "Please give me a quick one line reply. Ready to commit security improvements to authentication module",
                "thinking_mode": "low",
            }

            response7, continuation_id7 = self.call_mcp_tool("precommit", precommit_params)
            if not response7:
                self.logger.error(" ❌ Step 7: precommit tool failed")
                return False

            self.logger.info(
                f" ✅ Step 7: precommit completed with continuation_id: {continuation_id7[:8] if continuation_id7 else 'None'}..."
            )
            responses.append(("precommit", response7, continuation_id7))

            # Validate comprehensive results
            self.logger.info(" 📋 Validating comprehensive cross-tool results...")
            logs = self.get_docker_logs_since(start_time)

            # Validation criteria
            tools_used = [r[0] for r in responses]
            continuation_ids_created = [r[2] for r in responses if r[2]]

            # Check for various log patterns
            conversation_logs = [
                line for line in logs.split("\n") if "conversation" in line.lower() or "history" in line.lower()
            ]
            embedding_logs = [
                line
                for line in logs.split("\n")
                if "📁" in line or "embedding" in line.lower() or "file" in line.lower()
            ]
            continuation_logs = [
                line for line in logs.split("\n") if "continuation" in line.lower() or "resuming" in line.lower()
            ]
            cross_tool_logs = [
                line
                for line in logs.split("\n")
                if any(tool in line.lower() for tool in ["chat", "analyze", "debug", "codereview", "precommit"])
            ]

            # File mentions
            auth_file_mentioned = any("auth.py" in line for line in logs.split("\n"))
            config_file_mentioned = any("config.json" in line for line in logs.split("\n"))
            improved_file_mentioned = any("auth_improved.py" in line for line in logs.split("\n"))

            # Print comprehensive diagnostics
            self.logger.info(f" 📊 Tools used: {len(tools_used)} ({', '.join(tools_used)})")
            self.logger.info(f" 📊 Continuation IDs created: {len(continuation_ids_created)}")
            self.logger.info(f" 📊 Conversation logs found: {len(conversation_logs)}")
            self.logger.info(f" 📊 File embedding logs found: {len(embedding_logs)}")
            self.logger.info(f" 📊 Continuation logs found: {len(continuation_logs)}")
            self.logger.info(f" 📊 Cross-tool activity logs: {len(cross_tool_logs)}")
            self.logger.info(f" 📊 Auth file mentioned: {auth_file_mentioned}")
            self.logger.info(f" 📊 Config file mentioned: {config_file_mentioned}")
            self.logger.info(f" 📊 Improved file mentioned: {improved_file_mentioned}")

            if self.verbose:
                self.logger.debug(" 📋 Sample tool activity logs:")
                for log in cross_tool_logs[:10]:  # Show first 10
                    if log.strip():
                        self.logger.debug(f"   {log.strip()}")

                self.logger.debug(" 📋 Sample continuation logs:")
                for log in continuation_logs[:5]:  # Show first 5
                    if log.strip():
                        self.logger.debug(f"   {log.strip()}")

            # Comprehensive success criteria
            success_criteria = [
                len(tools_used) >= 5,  # Used multiple tools
                len(continuation_ids_created) >= 3,  # Created multiple continuation threads
                len(embedding_logs) > 10,  # Significant file embedding activity
                len(continuation_logs) > 0,  # Evidence of continuation
                auth_file_mentioned,  # Original file processed
                config_file_mentioned,  # Additional file processed
                improved_file_mentioned,  # New file processed
                len(conversation_logs) > 5,  # Conversation history activity
            ]

            passed_criteria = sum(success_criteria)
            total_criteria = len(success_criteria)

            self.logger.info(f" 📊 Success criteria met: {passed_criteria}/{total_criteria}")

            if passed_criteria >= 6:  # At least 6 out of 8 criteria
                self.logger.info(" ✅ Comprehensive cross-tool test: PASSED")
                return True
            else:
                self.logger.warning(" ⚠️ Comprehensive cross-tool test: FAILED")
                self.logger.warning(" 💡 Check logs for detailed cross-tool activity")
                return False

        except Exception as e:
            self.logger.error(f"Comprehensive cross-tool test failed: {e}")
            return False
        finally:
            self.cleanup_test_files()
198  simulator_tests/test_cross_tool_continuation.py (new file)
@@ -0,0 +1,198 @@
#!/usr/bin/env python3
"""
Cross-Tool Continuation Test

Tests comprehensive cross-tool continuation scenarios to ensure
conversation context is maintained when switching between different tools.
"""

from .base_test import BaseSimulatorTest


class CrossToolContinuationTest(BaseSimulatorTest):
    """Test comprehensive cross-tool continuation scenarios"""

    @property
    def test_name(self) -> str:
        return "cross_tool_continuation"

    @property
    def test_description(self) -> str:
        return "Cross-tool conversation continuation scenarios"

    def run_test(self) -> bool:
        """Test comprehensive cross-tool continuation scenarios"""
        try:
            self.logger.info("🔧 Test: Cross-tool continuation scenarios")

            # Setup test files
            self.setup_test_files()

            success_count = 0
            total_scenarios = 3

            # Scenario 1: chat -> thinkdeep -> codereview
            if self._test_chat_thinkdeep_codereview():
                success_count += 1

            # Scenario 2: analyze -> debug -> thinkdeep
            if self._test_analyze_debug_thinkdeep():
                success_count += 1

            # Scenario 3: Multi-file cross-tool continuation
            if self._test_multi_file_continuation():
                success_count += 1

            self.logger.info(
                f" ✅ Cross-tool continuation scenarios completed: {success_count}/{total_scenarios} scenarios passed"
            )

            # Consider successful if at least one scenario worked
            return success_count > 0

        except Exception as e:
            self.logger.error(f"Cross-tool continuation test failed: {e}")
            return False
        finally:
            self.cleanup_test_files()

    def _test_chat_thinkdeep_codereview(self) -> bool:
        """Test chat -> thinkdeep -> codereview scenario"""
        try:
            self.logger.info(" 1: Testing chat -> thinkdeep -> codereview")

            # Start with chat
            chat_response, chat_id = self.call_mcp_tool(
                "chat",
                {
                    "prompt": "Please use low thinking mode. Look at this Python code and tell me what you think about it",
                    "files": [self.test_files["python"]],
                },
            )

            if not chat_response or not chat_id:
                self.logger.error("Failed to start chat conversation")
                return False

            # Continue with thinkdeep
            thinkdeep_response, _ = self.call_mcp_tool(
                "thinkdeep",
                {
                    "prompt": "Please use low thinking mode. Think deeply about potential performance issues in this code",
                    "files": [self.test_files["python"]],  # Same file should be deduplicated
                    "continuation_id": chat_id,
                },
            )

            if not thinkdeep_response:
                self.logger.error("Failed chat -> thinkdeep continuation")
                return False

            # Continue with codereview
            codereview_response, _ = self.call_mcp_tool(
                "codereview",
                {
                    "files": [self.test_files["python"]],  # Same file should be deduplicated
                    "context": "Building on our previous analysis, provide a comprehensive code review",
                    "continuation_id": chat_id,
                },
            )

            if not codereview_response:
                self.logger.error("Failed thinkdeep -> codereview continuation")
                return False

            self.logger.info(" ✅ chat -> thinkdeep -> codereview working")
            return True

        except Exception as e:
            self.logger.error(f"Chat -> thinkdeep -> codereview scenario failed: {e}")
            return False

    def _test_analyze_debug_thinkdeep(self) -> bool:
        """Test analyze -> debug -> thinkdeep scenario"""
        try:
            self.logger.info(" 2: Testing analyze -> debug -> thinkdeep")

            # Start with analyze
            analyze_response, analyze_id = self.call_mcp_tool(
                "analyze", {"files": [self.test_files["python"]], "analysis_type": "code_quality"}
            )

            if not analyze_response or not analyze_id:
                self.logger.warning("Failed to start analyze conversation, skipping scenario 2")
                return False

            # Continue with debug
            debug_response, _ = self.call_mcp_tool(
                "debug",
                {
                    "files": [self.test_files["python"]],  # Same file should be deduplicated
                    "issue_description": "Based on our analysis, help debug the performance issue in fibonacci",
                    "continuation_id": analyze_id,
                },
            )

            if not debug_response:
                self.logger.warning(" ⚠️ analyze -> debug continuation failed")
                return False

            # Continue with thinkdeep
            final_response, _ = self.call_mcp_tool(
                "thinkdeep",
                {
                    "prompt": "Please use low thinking mode. Think deeply about the architectural implications of the issues we've found",
                    "files": [self.test_files["python"]],  # Same file should be deduplicated
                    "continuation_id": analyze_id,
                },
            )

            if not final_response:
                self.logger.warning(" ⚠️ debug -> thinkdeep continuation failed")
                return False

            self.logger.info(" ✅ analyze -> debug -> thinkdeep working")
            return True

        except Exception as e:
            self.logger.error(f"Analyze -> debug -> thinkdeep scenario failed: {e}")
            return False

    def _test_multi_file_continuation(self) -> bool:
        """Test multi-file cross-tool continuation"""
        try:
            self.logger.info(" 3: Testing multi-file cross-tool continuation")

            # Start with both files
            multi_response, multi_id = self.call_mcp_tool(
                "chat",
                {
                    "prompt": "Please use low thinking mode. Analyze both the Python code and configuration file",
                    "files": [self.test_files["python"], self.test_files["config"]],
                },
            )

            if not multi_response or not multi_id:
                self.logger.warning("Failed to start multi-file conversation, skipping scenario 3")
                return False

            # Switch to codereview with same files (should use conversation history)
            multi_review, _ = self.call_mcp_tool(
                "codereview",
                {
                    "files": [self.test_files["python"], self.test_files["config"]],  # Same files
                    "context": "Review both files in the context of our previous discussion",
                    "continuation_id": multi_id,
                },
            )

            if not multi_review:
                self.logger.warning(" ⚠️ Multi-file cross-tool continuation failed")
                return False

            self.logger.info(" ✅ Multi-file cross-tool continuation working")
            return True

        except Exception as e:
            self.logger.error(f"Multi-file continuation scenario failed: {e}")
            return False
105  simulator_tests/test_logs_validation.py (new file)
@@ -0,0 +1,105 @@
#!/usr/bin/env python3
"""
Docker Logs Validation Test

Validates Docker logs to confirm file deduplication behavior and
conversation threading is working properly.
"""

from .base_test import BaseSimulatorTest


class LogsValidationTest(BaseSimulatorTest):
    """Validate Docker logs to confirm file deduplication behavior"""

    @property
    def test_name(self) -> str:
        return "logs_validation"

    @property
    def test_description(self) -> str:
        return "Docker logs validation"

    def run_test(self) -> bool:
        """Validate Docker logs to confirm file deduplication behavior"""
        try:
            self.logger.info("📋 Test: Validating Docker logs for file deduplication...")

            # Get server logs from main container
            result = self.run_command(["docker", "logs", self.container_name], capture_output=True)

            if result.returncode != 0:
                self.logger.error(f"Failed to get Docker logs: {result.stderr}")
                return False

            main_logs = result.stdout.decode() + result.stderr.decode()

            # Get logs from log monitor container (where detailed activity is logged)
            monitor_result = self.run_command(["docker", "logs", "gemini-mcp-log-monitor"], capture_output=True)
            monitor_logs = ""
            if monitor_result.returncode == 0:
                monitor_logs = monitor_result.stdout.decode() + monitor_result.stderr.decode()

            # Also get activity logs for more detailed conversation tracking
            activity_result = self.run_command(
                ["docker", "exec", self.container_name, "cat", "/tmp/mcp_activity.log"], capture_output=True
            )

            activity_logs = ""
            if activity_result.returncode == 0:
                activity_logs = activity_result.stdout.decode()

            logs = main_logs + "\n" + monitor_logs + "\n" + activity_logs

            # Look for conversation threading patterns that indicate the system is working
            conversation_patterns = [
                "CONVERSATION_RESUME",
                "CONVERSATION_CONTEXT",
                "previous turns loaded",
                "tool embedding",
                "files included",
                "files truncated",
                "already in conversation history",
            ]

            conversation_lines = []
            for line in logs.split("\n"):
                for pattern in conversation_patterns:
                    if pattern.lower() in line.lower():
                        conversation_lines.append(line.strip())
                        break

            # Look for evidence of conversation threading and file handling
            conversation_threading_found = False
            multi_turn_conversations = False

            for line in conversation_lines:
                lower_line = line.lower()
                if "conversation_resume" in lower_line:
                    conversation_threading_found = True
                    self.logger.debug(f"📄 Conversation threading: {line}")
                elif "previous turns loaded" in lower_line:
                    multi_turn_conversations = True
                    self.logger.debug(f"📄 Multi-turn conversation: {line}")
                elif "already in conversation" in lower_line:
                    self.logger.info(f"✅ Found explicit deduplication: {line}")
                    return True

            # Conversation threading with multiple turns is evidence of file deduplication working
            if conversation_threading_found and multi_turn_conversations:
                self.logger.info("✅ Conversation threading with multi-turn context working")
                self.logger.info(
                    "✅ File deduplication working implicitly (files embedded once in conversation history)"
                )
                return True
            elif conversation_threading_found:
                self.logger.info("✅ Conversation threading detected")
                return True
            else:
                self.logger.warning("⚠️ No clear evidence of conversation threading in logs")
                self.logger.debug(f"Found {len(conversation_lines)} conversation-related log lines")
                return False

        except Exception as e:
            self.logger.error(f"Log validation failed: {e}")
            return False
177
simulator_tests/test_model_thinking_config.py
Normal file
177
simulator_tests/test_model_thinking_config.py
Normal file
@@ -0,0 +1,177 @@
|
|||||||
|
#!/usr/bin/env python3
|
||||||
|
"""
|
||||||
|
Model Thinking Configuration Test
|
||||||
|
|
||||||
|
Tests that thinking configuration is properly applied only to models that support it,
|
||||||
|
and that Flash models work correctly without thinking config.
|
||||||
|
"""
|
||||||
|
|
||||||
|
from .base_test import BaseSimulatorTest
|
||||||
|
|
||||||
|
|
||||||
|
class TestModelThinkingConfig(BaseSimulatorTest):
|
||||||
|
"""Test model-specific thinking configuration behavior"""
|
||||||
|
|
||||||
|
@property
|
||||||
|
def test_name(self) -> str:
|
||||||
|
return "model_thinking_config"
|
||||||
|
|
||||||
|
@property
|
||||||
|
def test_description(self) -> str:
|
||||||
|
return "Model-specific thinking configuration behavior"
|
||||||
|
|
||||||
|
def test_pro_model_with_thinking_config(self):
|
||||||
|
"""Test that Pro model uses thinking configuration"""
|
||||||
|
self.logger.info("Testing Pro model with thinking configuration...")
|
||||||
|
|
||||||
|
try:
|
||||||
|
# Test with explicit pro model and high thinking mode
|
||||||
|
response, continuation_id = self.call_mcp_tool(
|
||||||
|
"chat",
|
||||||
|
{
|
||||||
|
"prompt": "What is 2 + 2? Please think carefully and explain.",
|
||||||
|
"model": "pro", # Should resolve to gemini-2.5-pro-preview-06-05
|
||||||
|
"thinking_mode": "high", # Should use thinking_config
|
||||||
|
},
|
||||||
|
)
|
||||||
|
|
||||||
|
if not response:
|
||||||
|
raise Exception("Pro model test failed: No response received")
|
||||||
|
|
||||||
|
self.logger.info("✅ Pro model with thinking config works correctly")
|
||||||
|
return True
|
||||||
|
|
||||||
|
except Exception as e:
|
||||||
|
self.logger.error(f"❌ Pro model test failed: {e}")
|
||||||
|
return False
|
||||||
|
|
||||||
|
def test_flash_model_without_thinking_config(self):
|
||||||
|
"""Test that Flash model works without thinking configuration"""
|
||||||
|
self.logger.info("Testing Flash model without thinking configuration...")
|
||||||
|
|
||||||
|
try:
|
||||||
|
# Test with explicit flash model and thinking mode (should be ignored)
|
||||||
|
response, continuation_id = self.call_mcp_tool(
|
||||||
|
"chat",
|
||||||
|
{
|
||||||
|
"prompt": "What is 3 + 3? Give a quick answer.",
|
||||||
|
"model": "flash", # Should resolve to gemini-2.0-flash-exp
|
||||||
|
"thinking_mode": "high", # Should be ignored for Flash model
|
||||||
|
},
|
||||||
|
)
|
||||||
|
|
||||||
|
if not response:
|
||||||
|
raise Exception("Flash model test failed: No response received")
|
||||||
|
|
||||||
|
self.logger.info("✅ Flash model without thinking config works correctly")
|
||||||
|
return True
|
||||||
|
|
||||||
|
except Exception as e:
|
||||||
|
if "thinking" in str(e).lower() and ("not supported" in str(e).lower() or "invalid" in str(e).lower()):
|
||||||
|
raise Exception(f"Flash model incorrectly tried to use thinking config: {e}")
|
||||||
|
self.logger.error(f"❌ Flash model test failed: {e}")
|
||||||
|
return False
|
||||||
|
|
||||||
|
def test_model_resolution_logic(self):
|
||||||
|
"""Test that model resolution works correctly for both shortcuts and full names"""
|
||||||
|
self.logger.info("Testing model resolution logic...")
|
||||||
|
|
||||||
|
test_cases = [
|
||||||
|
("pro", "should work with Pro model"),
|
||||||
|
("flash", "should work with Flash model"),
|
||||||
|
("gemini-2.5-pro-preview-06-05", "should work with full Pro model name"),
|
||||||
|
("gemini-2.0-flash-exp", "should work with full Flash model name"),
|
||||||
|
]
|
||||||
|
|
||||||
|
success_count = 0
|
||||||
|
|
||||||
|
for model_name, description in test_cases:
|
||||||
|
try:
|
||||||
|
response, continuation_id = self.call_mcp_tool(
|
||||||
|
"chat",
|
||||||
|
{
|
||||||
|
"prompt": f"Test with {model_name}: What is 1 + 1?",
|
||||||
|
"model": model_name,
|
||||||
|
"thinking_mode": "medium",
|
||||||
|
},
|
||||||
|
)
|
||||||
|
|
||||||
|
if not response:
|
||||||
|
raise Exception(f"No response received for model {model_name}")
|
||||||
|
|
||||||
|
self.logger.info(f"✅ {model_name} {description}")
|
||||||
|
success_count += 1
|
||||||
|
|
||||||
|
except Exception as e:
|
||||||
|
self.logger.error(f"❌ {model_name} failed: {e}")
|
||||||
|
return False
|
||||||
|
|
||||||
|
return success_count == len(test_cases)
|
||||||
|
|
||||||
|
def test_default_model_behavior(self):
|
||||||
|
"""Test behavior with server default model (no explicit model specified)"""
|
||||||
|
self.logger.info("Testing default model behavior...")
|
||||||
|
|
||||||
|
try:
|
||||||
|
# Test without specifying model (should use server default)
|
||||||
|
response, continuation_id = self.call_mcp_tool(
|
||||||
|
"chat",
|
||||||
|
{
|
||||||
|
"prompt": "Test default model: What is 4 + 4?",
|
||||||
|
# No model specified - should use DEFAULT_MODEL from config
|
||||||
|
"thinking_mode": "medium",
|
||||||
|
},
|
||||||
|
)
|
||||||
|
|
||||||
|
if not response:
|
||||||
|
raise Exception("Default model test failed: No response received")
|
||||||
|
|
||||||
|
self.logger.info("✅ Default model behavior works correctly")
|
||||||
|
return True
|
||||||
|
|
||||||
|
except Exception as e:
|
||||||
|
self.logger.error(f"❌ Default model test failed: {e}")
|
||||||
|
return False
|
||||||
|
|
||||||
|
def run_test(self) -> bool:
|
||||||
|
"""Run all model thinking configuration tests"""
|
||||||
|
self.logger.info(f"📝 Test: {self.test_description}")
|
||||||
|
|
||||||
|
try:
|
||||||
|
# Test Pro model with thinking config
|
||||||
|
if not self.test_pro_model_with_thinking_config():
|
||||||
|
return False
|
||||||
|
|
||||||
|
# Test Flash model without thinking config
|
||||||
|
if not self.test_flash_model_without_thinking_config():
|
||||||
|
return False
|
||||||
|
|
||||||
|
# Test model resolution logic
|
||||||
|
if not self.test_model_resolution_logic():
|
||||||
|
return False
|
||||||
|
|
||||||
|
# Test default model behavior
|
||||||
|
if not self.test_default_model_behavior():
|
||||||
|
return False
|
||||||
|
|
||||||
|
self.logger.info(f"✅ All {self.test_name} tests passed!")
|
||||||
|
return True
|
||||||
|
|
||||||
|
except Exception as e:
|
||||||
|
self.logger.error(f"❌ {self.test_name} test failed: {e}")
|
||||||
|
return False
|
||||||
|
|
||||||
|
|
||||||
|
def main():
|
||||||
|
"""Run the model thinking configuration tests"""
|
||||||
|
import sys
|
||||||
|
|
||||||
|
verbose = "--verbose" in sys.argv or "-v" in sys.argv
|
||||||
|
test = TestModelThinkingConfig(verbose=verbose)
|
||||||
|
|
||||||
|
success = test.run_test()
|
||||||
|
sys.exit(0 if success else 1)
|
||||||
|
|
||||||
|
|
||||||
|
if __name__ == "__main__":
|
||||||
|
main()
|
||||||
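For reference, a minimal sketch of driving this test class directly from Python instead of through main(); the module path is assumed from the simulator_tests/ layout shown below, not confirmed by this diff:

# Sketch, assuming the module lives at simulator_tests/test_model_thinking_config.py
from simulator_tests.test_model_thinking_config import TestModelThinkingConfig

test = TestModelThinkingConfig(verbose=True)
raise SystemExit(0 if test.run_test() else 1)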
232
simulator_tests/test_per_tool_deduplication.py
Normal file
@@ -0,0 +1,232 @@
#!/usr/bin/env python3
"""
Per-Tool File Deduplication Test

Tests file deduplication for each individual MCP tool to ensure
that files are properly deduplicated within single-tool conversations.
Validates that:
1. Files are embedded only once in conversation history
2. Continuation calls don't re-read existing files
3. New files are still properly embedded
4. Docker logs show deduplication behavior
"""

import subprocess

from .base_test import BaseSimulatorTest


class PerToolDeduplicationTest(BaseSimulatorTest):
    """Test file deduplication for each individual tool"""

    @property
    def test_name(self) -> str:
        return "per_tool_deduplication"

    @property
    def test_description(self) -> str:
        return "File deduplication for individual tools"

    def get_docker_logs_since(self, since_time: str) -> str:
        """Get docker logs since a specific timestamp"""
        try:
            # Check both main server and log monitor for comprehensive logs
            cmd_server = ["docker", "logs", "--since", since_time, self.container_name]
            cmd_monitor = ["docker", "logs", "--since", since_time, "gemini-mcp-log-monitor"]

            result_server = subprocess.run(cmd_server, capture_output=True, text=True)
            result_monitor = subprocess.run(cmd_monitor, capture_output=True, text=True)

            # Combine logs from both containers
            combined_logs = result_server.stdout + "\n" + result_monitor.stdout
            return combined_logs
        except Exception as e:
            self.logger.error(f"Failed to get docker logs: {e}")
            return ""

    # create_additional_test_file method now inherited from base class

    def validate_file_deduplication_in_logs(self, logs: str, tool_name: str, test_file: str) -> bool:
        """Validate that logs show file deduplication behavior"""
        # Look for file embedding messages
        embedding_messages = [
            line for line in logs.split("\n") if "📁" in line and "embedding" in line and tool_name in line
        ]

        # Look for deduplication/filtering messages
        filtering_messages = [
            line for line in logs.split("\n") if "📁" in line and "Filtering" in line and tool_name in line
        ]
        skipping_messages = [
            line for line in logs.split("\n") if "📁" in line and "skipping" in line and tool_name in line
        ]

        deduplication_found = len(filtering_messages) > 0 or len(skipping_messages) > 0

        if deduplication_found:
            self.logger.info(f" ✅ {tool_name}: Found deduplication evidence in logs")
            for msg in filtering_messages + skipping_messages:
                self.logger.debug(f" 📁 {msg.strip()}")
        else:
            self.logger.warning(f" ⚠️ {tool_name}: No deduplication evidence found in logs")
            self.logger.debug(f" 📁 All embedding messages: {embedding_messages}")

        return deduplication_found

    def run_test(self) -> bool:
        """Test file deduplication with realistic precommit/codereview workflow"""
        try:
            self.logger.info("📄 Test: Simplified file deduplication with precommit/codereview workflow")

            # Setup test files
            self.setup_test_files()

            # Create a short dummy file for quick testing
            dummy_content = """def add(a, b):
    return a + b  # Missing type hints


def divide(x, y):
    return x / y  # No zero check
"""
            dummy_file_path = self.create_additional_test_file("dummy_code.py", dummy_content)

            # Get timestamp for log filtering
            import datetime

            start_time = datetime.datetime.now().strftime("%Y-%m-%dT%H:%M:%S")

            # Step 1: precommit tool with dummy file (low thinking mode)
            self.logger.info(" Step 1: precommit tool with dummy file")
            precommit_params = {
                "path": self.test_dir,  # Required path parameter
                "files": [dummy_file_path],
                "original_request": "Please give me a quick one line reply. Review this code for commit readiness",
                "thinking_mode": "low",
            }

            response1, continuation_id = self.call_mcp_tool("precommit", precommit_params)
            if not response1:
                self.logger.error(" ❌ Step 1: precommit tool failed")
                return False

            if not continuation_id:
                self.logger.error(" ❌ Step 1: precommit tool didn't provide continuation_id")
                return False

            # Validate continuation_id format (should be UUID)
            if len(continuation_id) < 32:
                self.logger.error(f" ❌ Step 1: Invalid continuation_id format: {continuation_id}")
                return False

            self.logger.info(f" ✅ Step 1: precommit completed with continuation_id: {continuation_id[:8]}...")

            # Step 2: codereview tool with same file (NO continuation - fresh conversation)
            self.logger.info(" Step 2: codereview tool with same file (fresh conversation)")
            codereview_params = {
                "files": [dummy_file_path],
                "context": "Please give me a quick one line reply. General code review for quality and best practices",
                "thinking_mode": "low",
            }

            response2, _ = self.call_mcp_tool("codereview", codereview_params)
            if not response2:
                self.logger.error(" ❌ Step 2: codereview tool failed")
                return False

            self.logger.info(" ✅ Step 2: codereview completed (fresh conversation)")

            # Step 3: Create new file and continue with precommit
            self.logger.info(" Step 3: precommit continuation with old + new file")
            new_file_content = """def multiply(x, y):
    return x * y


def subtract(a, b):
    return a - b
"""
            new_file_path = self.create_additional_test_file("new_feature.py", new_file_content)

            # Continue precommit with both files
            continue_params = {
                "continuation_id": continuation_id,
                "path": self.test_dir,  # Required path parameter
                "files": [dummy_file_path, new_file_path],  # Old + new file
                "original_request": "Please give me a quick one line reply. Now also review the new feature file along with the previous one",
                "thinking_mode": "low",
            }

            response3, _ = self.call_mcp_tool("precommit", continue_params)
            if not response3:
                self.logger.error(" ❌ Step 3: precommit continuation failed")
                return False

            self.logger.info(" ✅ Step 3: precommit continuation completed")

            # Validate results in docker logs
            self.logger.info(" 📋 Validating conversation history and file deduplication...")
            logs = self.get_docker_logs_since(start_time)

            # Check for conversation history building
            conversation_logs = [
                line for line in logs.split("\n") if "conversation" in line.lower() or "history" in line.lower()
            ]

            # Check for file embedding/deduplication
            embedding_logs = [
                line
                for line in logs.split("\n")
                if "📁" in line or "embedding" in line.lower() or "file" in line.lower()
            ]

            # Check for continuation evidence
            continuation_logs = [
                line for line in logs.split("\n") if "continuation" in line.lower() or continuation_id[:8] in line
            ]

            # Check for both files mentioned
            dummy_file_mentioned = any("dummy_code.py" in line for line in logs.split("\n"))
            new_file_mentioned = any("new_feature.py" in line for line in logs.split("\n"))

            # Print diagnostic information
            self.logger.info(f" 📊 Conversation logs found: {len(conversation_logs)}")
            self.logger.info(f" 📊 File embedding logs found: {len(embedding_logs)}")
            self.logger.info(f" 📊 Continuation logs found: {len(continuation_logs)}")
            self.logger.info(f" 📊 Dummy file mentioned: {dummy_file_mentioned}")
            self.logger.info(f" 📊 New file mentioned: {new_file_mentioned}")

            if self.verbose:
                self.logger.debug(" 📋 Sample embedding logs:")
                for log in embedding_logs[:5]:  # Show first 5
                    if log.strip():
                        self.logger.debug(f"   {log.strip()}")

                self.logger.debug(" 📋 Sample continuation logs:")
                for log in continuation_logs[:3]:  # Show first 3
                    if log.strip():
                        self.logger.debug(f"   {log.strip()}")

            # Determine success criteria
            success_criteria = [
                len(embedding_logs) > 0,  # File embedding occurred
                len(continuation_logs) > 0,  # Continuation worked
                dummy_file_mentioned,  # Original file processed
                new_file_mentioned,  # New file processed
            ]

            passed_criteria = sum(success_criteria)
            total_criteria = len(success_criteria)

            self.logger.info(f" 📊 Success criteria met: {passed_criteria}/{total_criteria}")

            if passed_criteria >= 3:  # At least 3 out of 4 criteria
                self.logger.info(" ✅ File deduplication workflow test: PASSED")
                return True
            else:
                self.logger.warning(" ⚠️ File deduplication workflow test: FAILED")
                self.logger.warning(" 💡 Check docker logs for detailed file embedding and continuation activity")
                return False

        except Exception as e:
            self.logger.error(f"File deduplication workflow test failed: {e}")
            return False
        finally:
            self.cleanup_test_files()
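The validator above keys off emoji-tagged log lines. A minimal sketch of the substring checks it performs; the sample strings are modeled on the logger.debug calls added to tools/base.py later in this diff, not on captured output:

# Sketch: the matching logic of validate_file_deduplication_in_logs in miniature
sample_logs = "\n".join([
    "📁 precommit tool embedding 1 new files: /tmp/dummy_code.py",
    "📁 precommit tool skipping 1 files already in conversation history: /tmp/dummy_code.py",
])
skipping = [ln for ln in sample_logs.split("\n") if "📁" in ln and "skipping" in ln and "precommit" in ln]
assert skipping  # continuation calls should leave deduplication evidence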
139
simulator_tests/test_redis_validation.py
Normal file
@@ -0,0 +1,139 @@
#!/usr/bin/env python3
"""
Redis Conversation Memory Validation Test

Validates that conversation memory is working via Redis by checking
for stored conversation threads and their content.
"""

import json

from .base_test import BaseSimulatorTest


class RedisValidationTest(BaseSimulatorTest):
    """Validate that conversation memory is working via Redis"""

    @property
    def test_name(self) -> str:
        return "redis_validation"

    @property
    def test_description(self) -> str:
        return "Redis conversation memory validation"

    def run_test(self) -> bool:
        """Validate that conversation memory is working via Redis"""
        try:
            self.logger.info("💾 Test: Validating conversation memory via Redis...")

            # First, test Redis connectivity
            ping_result = self.run_command(
                ["docker", "exec", self.redis_container, "redis-cli", "ping"], capture_output=True
            )

            if ping_result.returncode != 0:
                self.logger.error("Failed to connect to Redis")
                return False

            if "PONG" not in ping_result.stdout.decode():
                self.logger.error("Redis ping failed")
                return False

            self.logger.info("✅ Redis connectivity confirmed")

            # Check Redis for stored conversations
            result = self.run_command(
                ["docker", "exec", self.redis_container, "redis-cli", "KEYS", "thread:*"], capture_output=True
            )

            if result.returncode != 0:
                self.logger.error("Failed to query Redis")
                return False

            keys = result.stdout.decode().strip().split("\n")
            thread_keys = [k for k in keys if k.startswith("thread:") and k != "thread:*"]

            if thread_keys:
                self.logger.info(f"✅ Found {len(thread_keys)} conversation threads in Redis")

                # Get details of first thread
                thread_key = thread_keys[0]
                result = self.run_command(
                    ["docker", "exec", self.redis_container, "redis-cli", "GET", thread_key], capture_output=True
                )

                if result.returncode == 0:
                    thread_data = result.stdout.decode()
                    try:
                        parsed = json.loads(thread_data)
                        turns = parsed.get("turns", [])
                        self.logger.info(f"✅ Thread has {len(turns)} turns")
                        return True
                    except json.JSONDecodeError:
                        self.logger.warning("Could not parse thread data")

                return True
            else:
                # If no existing threads, create a test thread to validate Redis functionality
                self.logger.info("📝 No existing threads found, creating test thread to validate Redis...")

                test_thread_id = "test_thread_validation"
                test_data = {
                    "thread_id": test_thread_id,
                    "turns": [
                        {"tool": "chat", "timestamp": "2025-06-11T16:30:00Z", "prompt": "Test validation prompt"}
                    ],
                }

                # Store test data
                store_result = self.run_command(
                    [
                        "docker",
                        "exec",
                        self.redis_container,
                        "redis-cli",
                        "SET",
                        f"thread:{test_thread_id}",
                        json.dumps(test_data),
                    ],
                    capture_output=True,
                )

                if store_result.returncode != 0:
                    self.logger.error("Failed to store test data in Redis")
                    return False

                # Retrieve test data
                retrieve_result = self.run_command(
                    ["docker", "exec", self.redis_container, "redis-cli", "GET", f"thread:{test_thread_id}"],
                    capture_output=True,
                )

                if retrieve_result.returncode != 0:
                    self.logger.error("Failed to retrieve test data from Redis")
                    return False

                retrieved_data = retrieve_result.stdout.decode()
                try:
                    parsed = json.loads(retrieved_data)
                    if parsed.get("thread_id") == test_thread_id:
                        self.logger.info("✅ Redis read/write validation successful")

                        # Clean up test data
                        self.run_command(
                            ["docker", "exec", self.redis_container, "redis-cli", "DEL", f"thread:{test_thread_id}"],
                            capture_output=True,
                        )

                        return True
                    else:
                        self.logger.error("Retrieved data doesn't match stored data")
                        return False
                except json.JSONDecodeError:
                    self.logger.error("Could not parse retrieved test data")
                    return False

        except Exception as e:
            self.logger.error(f"Conversation memory validation failed: {e}")
            return False
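The same inspection can be done by hand against the running stack. A sketch using the subprocess pattern above; the container name is an assumption, use whatever self.redis_container resolves to in your compose setup:

# Sketch: manual equivalent of the thread-key check
import subprocess

redis_container = "gemini-mcp-redis"  # assumption - substitute your compose service name
keys = subprocess.run(
    ["docker", "exec", redis_container, "redis-cli", "KEYS", "thread:*"],
    capture_output=True, text=True,
).stdout.splitlines()
print(f"{len(keys)} conversation threads stored")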
@@ -196,8 +196,8 @@ class TestClaudeContinuationOffers:
         assert response_data.get("continuation_offer") is None
 
     @patch("utils.conversation_memory.get_redis_client")
-    async def test_threaded_conversation_no_continuation_offer(self, mock_redis):
-        """Test that threaded conversations don't get continuation offers"""
+    async def test_threaded_conversation_with_continuation_offer(self, mock_redis):
+        """Test that threaded conversations still get continuation offers when turns remain"""
         mock_client = Mock()
         mock_redis.return_value = mock_client
 
@@ -234,9 +234,10 @@ class TestClaudeContinuationOffers:
         # Parse response
         response_data = json.loads(response[0].text)
 
-        # Should be regular success, not continuation offer
-        assert response_data["status"] == "success"
-        assert response_data.get("continuation_offer") is None
+        # Should offer continuation since there are remaining turns (9 remaining: 10 max - 0 current - 1)
+        assert response_data["status"] == "continuation_available"
+        assert response_data.get("continuation_offer") is not None
+        assert response_data["continuation_offer"]["remaining_turns"] == 9
 
     def test_max_turns_reached_no_continuation_offer(self):
         """Test that no continuation is offered when max turns would be exceeded"""
@@ -404,9 +405,11 @@ class TestContinuationIntegration:
         # Step 3: Claude uses continuation_id
         request2 = ToolRequest(prompt="Now analyze the performance aspects", continuation_id=thread_id)
 
-        # This should NOT offer another continuation (already threaded)
+        # Should still offer continuation if there are remaining turns
        continuation_data2 = self.tool._check_continuation_opportunity(request2)
-        assert continuation_data2 is None
+        assert continuation_data2 is not None
+        assert continuation_data2["remaining_turns"] == 8  # MAX_CONVERSATION_TURNS(10) - current_turns(1) - 1
+        assert continuation_data2["tool_name"] == "test_continuation"
 
 
 if __name__ == "__main__":
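The new assertions encode a simple turn-budget rule. A sketch of the arithmetic, with the constant value (10) taken from the test comments above:

MAX_CONVERSATION_TURNS = 10  # value implied by the comments in the hunks above

def remaining_turns(current_turns: int) -> int:
    # one turn is consumed by the response being generated right now
    return MAX_CONVERSATION_TURNS - current_turns - 1

assert remaining_turns(0) == 9  # fresh thread, as asserted in the second hunk
assert remaining_turns(1) == 8  # one prior turn, as asserted in the third hunk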
@@ -3,7 +3,7 @@ Tests for configuration
 """
 
 from config import (
-    GEMINI_MODEL,
+    DEFAULT_MODEL,
     MAX_CONTEXT_TOKENS,
     TEMPERATURE_ANALYTICAL,
     TEMPERATURE_BALANCED,
@@ -31,7 +31,7 @@ class TestConfig:
 
     def test_model_config(self):
         """Test model configuration"""
-        assert GEMINI_MODEL == "gemini-2.5-pro-preview-06-05"
+        assert DEFAULT_MODEL == "gemini-2.5-pro-preview-06-05"
         assert MAX_CONTEXT_TOKENS == 1_000_000
 
     def test_temperature_defaults(self):
@@ -166,7 +166,7 @@ class TestConversationMemory:
             initial_context={},
         )
 
-        history = build_conversation_history(context)
+        history, tokens = build_conversation_history(context)
 
         # Test basic structure
         assert "CONVERSATION HISTORY" in history
@@ -207,8 +207,9 @@ class TestConversationMemory:
             initial_context={},
         )
 
-        history = build_conversation_history(context)
+        history, tokens = build_conversation_history(context)
         assert history == ""
+        assert tokens == 0
 
 
 class TestConversationFlow:
@@ -373,7 +374,7 @@ class TestConversationFlow:
             initial_context={},
         )
 
-        history = build_conversation_history(context)
+        history, tokens = build_conversation_history(context)
         expected_turn_text = f"Turn {test_max}/{MAX_CONVERSATION_TURNS}"
         assert expected_turn_text in history
 
@@ -595,7 +596,7 @@ class TestConversationFlow:
             initial_context={"prompt": "Analyze this codebase", "files": ["/project/src/"]},
         )
 
-        history = build_conversation_history(final_context)
+        history, tokens = build_conversation_history(final_context)
 
         # Verify chronological order and speaker identification
         assert "--- Turn 1 (Gemini using analyze) ---" in history
@@ -670,7 +671,7 @@ class TestConversationFlow:
         mock_client.get.return_value = context_with_followup.model_dump_json()
 
         # Build history to verify follow-up is preserved
-        history = build_conversation_history(context_with_followup)
+        history, tokens = build_conversation_history(context_with_followup)
         assert "Found potential issue in authentication" in history
         assert "[Gemini's Follow-up: Should I examine the authentication middleware?]" in history
 
@@ -762,7 +763,7 @@ class TestConversationFlow:
         )
 
         # Build conversation history (should handle token limits gracefully)
-        history = build_conversation_history(context)
+        history, tokens = build_conversation_history(context)
 
         # Verify the history was built successfully
         assert "=== CONVERSATION HISTORY ===" in history
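Every call site above migrates to the same two-value return. A self-contained sketch of the new contract, with a stand-in body for illustration only; the real builder lives in utils.conversation_memory and formats actual turns:

# Sketch of the new (history, tokens) contract, stand-in implementation
def build_conversation_history(context: dict) -> tuple[str, int]:
    if not context or not context.get("turns"):
        return "", 0  # empty context yields no history and zero tokens
    history = "=== CONVERSATION HISTORY ===\n..."  # real builder formats turns here
    return history, len(history) // 4  # crude token estimate, assumption

history, tokens = build_conversation_history({})
assert history == "" and tokens == 0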
@@ -186,8 +186,8 @@ class TestCrossToolContinuation:
         response = await self.review_tool.execute(arguments)
         response_data = json.loads(response[0].text)
 
-        # Should successfully continue the conversation
-        assert response_data["status"] == "success"
+        # Should offer continuation since there are remaining turns available
+        assert response_data["status"] == "continuation_available"
         assert "Critical security vulnerability confirmed" in response_data["content"]
 
         # Step 4: Verify the cross-tool continuation worked
@@ -247,7 +247,7 @@ class TestCrossToolContinuation:
         # Build conversation history
         from utils.conversation_memory import build_conversation_history
 
-        history = build_conversation_history(thread_context)
+        history, tokens = build_conversation_history(thread_context)
 
         # Verify tool names are included in the history
         assert "Turn 1 (Gemini using test_analysis)" in history
@@ -307,7 +307,7 @@ class TestCrossToolContinuation:
         response = await self.review_tool.execute(arguments)
         response_data = json.loads(response[0].text)
 
-        assert response_data["status"] == "success"
+        assert response_data["status"] == "continuation_available"
 
         # Verify files from both tools are tracked in Redis calls
         setex_calls = mock_client.setex.call_args_list
@@ -214,15 +214,15 @@ class TestLargePromptHandling:
         mock_model.generate_content.return_value = mock_response
         mock_create_model.return_value = mock_model
 
-        # Mock read_files to avoid file system access
-        with patch("tools.chat.read_files") as mock_read_files:
-            mock_read_files.return_value = "File content"
+        # Mock the centralized file preparation method to avoid file system access
+        with patch.object(tool, "_prepare_file_content_for_prompt") as mock_prepare_files:
+            mock_prepare_files.return_value = "File content"
 
             await tool.execute({"prompt": "", "files": [temp_prompt_file, other_file]})
 
             # Verify prompt.txt was removed from files list
-            mock_read_files.assert_called_once()
-            files_arg = mock_read_files.call_args[0][0]
+            mock_prepare_files.assert_called_once()
+            files_arg = mock_prepare_files.call_args[0][0]
             assert len(files_arg) == 1
             assert files_arg[0] == other_file
 
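The switch from patch("tools.chat.read_files") to patch.object recurs through the rest of these test updates. A self-contained sketch of why the object form is more robust; the Tool class here is a stand-in, not the real one:

from unittest.mock import patch

class Tool:  # stand-in for the real tool classes
    def _prepare_file_content_for_prompt(self, files, continuation_id, desc):
        raise RuntimeError("would hit the filesystem")

tool = Tool()
# Patching the attribute on the instance works no matter how the method was
# imported, unlike patching a module-level name such as tools.chat.read_files
with patch.object(tool, "_prepare_file_content_for_prompt") as mock_prepare:
    mock_prepare.return_value = "File content"
    assert tool._prepare_file_content_for_prompt([], None, "ctx") == "File content"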
@@ -228,10 +228,8 @@ class TestPrecommitTool:
     @patch("tools.precommit.find_git_repositories")
     @patch("tools.precommit.get_git_status")
     @patch("tools.precommit.run_git_command")
-    @patch("tools.precommit.read_files")
     async def test_files_parameter_with_context(
         self,
-        mock_read_files,
         mock_run_git,
         mock_status,
         mock_find_repos,
@@ -254,14 +252,15 @@ class TestPrecommitTool:
             (True, ""),  # unstaged files list (empty)
         ]
 
-        # Mock read_files
-        mock_read_files.return_value = "=== FILE: config.py ===\nCONFIG_VALUE = 42\n=== END FILE ==="
+        # Mock the centralized file preparation method
+        with patch.object(tool, "_prepare_file_content_for_prompt") as mock_prepare_files:
+            mock_prepare_files.return_value = "=== FILE: config.py ===\nCONFIG_VALUE = 42\n=== END FILE ==="
 
-        request = PrecommitRequest(
-            path="/absolute/repo/path",
-            files=["/absolute/repo/path/config.py"],
-        )
-        result = await tool.prepare_prompt(request)
+            request = PrecommitRequest(
+                path="/absolute/repo/path",
+                files=["/absolute/repo/path/config.py"],
+            )
+            result = await tool.prepare_prompt(request)
 
         # Verify context files are included
         assert "## Context Files Summary" in result
@@ -316,9 +315,9 @@ class TestPrecommitTool:
             (True, ""),  # unstaged files (empty)
         ]
 
-        # Mock read_files to return empty (file not found)
-        with patch("tools.precommit.read_files") as mock_read:
-            mock_read.return_value = ""
+        # Mock the centralized file preparation method to return empty (file not found)
+        with patch.object(tool, "_prepare_file_content_for_prompt") as mock_prepare_files:
+            mock_prepare_files.return_value = ""
             result_with_files = await tool.prepare_prompt(request_with_files)
 
         assert "If you need additional context files" not in result_with_files
269
tests/test_precommit_with_mock_store.py
Normal file
@@ -0,0 +1,269 @@
"""
Enhanced tests for precommit tool using mock storage to test real logic
"""

import os
import tempfile
from pathlib import Path
from typing import Optional
from unittest.mock import patch

import pytest

from tools.precommit import Precommit, PrecommitRequest


class MockRedisClient:
    """Mock Redis client that uses in-memory dictionary storage"""

    def __init__(self):
        self.data: dict[str, str] = {}
        self.ttl_data: dict[str, int] = {}

    def get(self, key: str) -> Optional[str]:
        return self.data.get(key)

    def set(self, key: str, value: str, ex: Optional[int] = None) -> bool:
        self.data[key] = value
        if ex:
            self.ttl_data[key] = ex
        return True

    def delete(self, key: str) -> int:
        if key in self.data:
            del self.data[key]
            self.ttl_data.pop(key, None)
            return 1
        return 0

    def exists(self, key: str) -> int:
        return 1 if key in self.data else 0

    def setex(self, key: str, time: int, value: str) -> bool:
        """Set key to hold string value and set key to timeout after given seconds"""
        self.data[key] = value
        self.ttl_data[key] = time
        return True


class TestPrecommitToolWithMockStore:
    """Test precommit tool with mock storage to validate actual logic"""

    @pytest.fixture
    def mock_redis(self):
        """Create mock Redis client"""
        return MockRedisClient()

    @pytest.fixture
    def tool(self, mock_redis, temp_repo):
        """Create tool instance with mocked Redis"""
        temp_dir, _ = temp_repo
        tool = Precommit()

        # Mock the Redis client getter and PROJECT_ROOT to allow access to temp files
        with (
            patch("utils.conversation_memory.get_redis_client", return_value=mock_redis),
            patch("utils.file_utils.PROJECT_ROOT", Path(temp_dir).resolve()),
        ):
            yield tool

    @pytest.fixture
    def temp_repo(self):
        """Create a temporary git repository with test files"""
        import subprocess

        temp_dir = tempfile.mkdtemp()

        # Initialize git repo
        subprocess.run(["git", "init"], cwd=temp_dir, capture_output=True)
        subprocess.run(["git", "config", "user.name", "Test"], cwd=temp_dir, capture_output=True)
        subprocess.run(["git", "config", "user.email", "test@example.com"], cwd=temp_dir, capture_output=True)

        # Create test config file
        config_content = '''"""Test configuration file"""

# Version and metadata
__version__ = "1.0.0"
__author__ = "Test"

# Configuration
MAX_CONTENT_TOKENS = 800_000  # 800K tokens for content
TEMPERATURE_ANALYTICAL = 0.2  # For code review, debugging
'''

        config_path = os.path.join(temp_dir, "config.py")
        with open(config_path, "w") as f:
            f.write(config_content)

        # Add and commit initial version
        subprocess.run(["git", "add", "."], cwd=temp_dir, capture_output=True)
        subprocess.run(["git", "commit", "-m", "Initial commit"], cwd=temp_dir, capture_output=True)

        # Modify config to create a diff
        modified_content = config_content + '\nNEW_SETTING = "test"  # Added setting\n'
        with open(config_path, "w") as f:
            f.write(modified_content)

        yield temp_dir, config_path

        # Cleanup
        import shutil

        shutil.rmtree(temp_dir)

    @pytest.mark.asyncio
    async def test_no_duplicate_file_content_in_prompt(self, tool, temp_repo, mock_redis):
        """Test that file content appears in expected locations

        This test validates our design decision that files can legitimately appear in both:
        1. Git Diffs section: Shows only changed lines + limited context (wrapped with BEGIN DIFF markers)
        2. Additional Context section: Shows complete file content (wrapped with BEGIN FILE markers)

        This is intentional, not a bug - the AI needs both perspectives for comprehensive analysis.
        """
        temp_dir, config_path = temp_repo

        # Create request with files parameter
        request = PrecommitRequest(path=temp_dir, files=[config_path], original_request="Test configuration changes")

        # Generate the prompt
        prompt = await tool.prepare_prompt(request)

        # Verify expected sections are present
        assert "## Original Request" in prompt
        assert "Test configuration changes" in prompt
        assert "## Additional Context Files" in prompt
        assert "## Git Diffs" in prompt

        # Verify the file appears in the git diff
        assert "config.py" in prompt
        assert "NEW_SETTING" in prompt

        # Note: Files can legitimately appear in both git diff AND additional context:
        # - Git diff shows only changed lines + limited context
        # - Additional context provides complete file content for full understanding
        # This is intentional and provides comprehensive context to the AI

    @pytest.mark.asyncio
    async def test_conversation_memory_integration(self, tool, temp_repo, mock_redis):
        """Test that conversation memory works with mock storage"""
        temp_dir, config_path = temp_repo

        # Mock conversation memory functions to use our mock redis
        with patch("utils.conversation_memory.get_redis_client", return_value=mock_redis):
            # First request - should embed file content
            PrecommitRequest(path=temp_dir, files=[config_path], original_request="First review")

            # Simulate conversation thread creation
            from utils.conversation_memory import add_turn, create_thread

            thread_id = create_thread("precommit", {"files": [config_path]})

            # Test that file embedding works
            files_to_embed = tool.filter_new_files([config_path], None)
            assert config_path in files_to_embed, "New conversation should embed all files"

            # Add a turn to the conversation
            add_turn(thread_id, "assistant", "First response", files=[config_path], tool_name="precommit")

            # Second request with continuation - should skip already embedded files
            PrecommitRequest(
                path=temp_dir, files=[config_path], continuation_id=thread_id, original_request="Follow-up review"
            )

            files_to_embed_2 = tool.filter_new_files([config_path], thread_id)
            assert len(files_to_embed_2) == 0, "Continuation should skip already embedded files"

    @pytest.mark.asyncio
    async def test_prompt_structure_integrity(self, tool, temp_repo, mock_redis):
        """Test that the prompt structure is well-formed and doesn't have content duplication"""
        temp_dir, config_path = temp_repo

        request = PrecommitRequest(
            path=temp_dir,
            files=[config_path],
            original_request="Validate prompt structure",
            review_type="full",
            severity_filter="high",
        )

        prompt = await tool.prepare_prompt(request)

        # Split prompt into sections
        sections = {
            "original_request": "## Original Request",
            "review_parameters": "## Review Parameters",
            "repo_summary": "## Repository Changes Summary",
            "context_files_summary": "## Context Files Summary",
            "git_diffs": "## Git Diffs",
            "additional_context": "## Additional Context Files",
            "review_instructions": "## Review Instructions",
        }

        section_indices = {}
        for name, header in sections.items():
            index = prompt.find(header)
            if index != -1:
                section_indices[name] = index

        # Verify sections appear in logical order
        assert section_indices["original_request"] < section_indices["review_parameters"]
        assert section_indices["review_parameters"] < section_indices["repo_summary"]
        assert section_indices["git_diffs"] < section_indices["additional_context"]
        assert section_indices["additional_context"] < section_indices["review_instructions"]

        # Test that file content only appears in Additional Context section
        file_content_start = section_indices["additional_context"]
        file_content_end = section_indices["review_instructions"]

        file_section = prompt[file_content_start:file_content_end]
        after_file_section = prompt[file_content_end:]

        # File content should appear in the file section
        assert "MAX_CONTENT_TOKENS = 800_000" in file_section
        # Check that configuration content appears in the file section
        assert "# Configuration" in file_section
        # The complete file content should not appear in the review instructions
        assert '__version__ = "1.0.0"' in file_section
        assert '__version__ = "1.0.0"' not in after_file_section

    @pytest.mark.asyncio
    async def test_file_content_formatting(self, tool, temp_repo, mock_redis):
        """Test that file content is properly formatted without duplication"""
        temp_dir, config_path = temp_repo

        # Test the centralized file preparation method directly
        file_content = tool._prepare_file_content_for_prompt(
            [config_path], None, "Test files", max_tokens=100000, reserve_tokens=1000  # No continuation
        )

        # Should contain file markers
        assert "--- BEGIN FILE:" in file_content
        assert "--- END FILE:" in file_content
        assert "config.py" in file_content

        # Should contain actual file content
        assert "MAX_CONTENT_TOKENS = 800_000" in file_content
        assert '__version__ = "1.0.0"' in file_content

        # Content should appear only once
        assert file_content.count("MAX_CONTENT_TOKENS = 800_000") == 1
        assert file_content.count('__version__ = "1.0.0"') == 1


def test_mock_redis_basic_operations():
    """Test that our mock Redis implementation works correctly"""
    mock_redis = MockRedisClient()

    # Test basic operations
    assert mock_redis.get("nonexistent") is None
    assert mock_redis.exists("nonexistent") == 0

    mock_redis.set("test_key", "test_value")
    assert mock_redis.get("test_key") == "test_value"
    assert mock_redis.exists("test_key") == 1

    assert mock_redis.delete("test_key") == 1
    assert mock_redis.get("test_key") is None
    assert mock_redis.delete("test_key") == 0  # Already deleted
@@ -67,16 +67,16 @@ class TestPromptRegression:
         mock_model.generate_content.return_value = mock_model_response()
         mock_create_model.return_value = mock_model
 
-        # Mock file reading
-        with patch("tools.chat.read_files") as mock_read_files:
-            mock_read_files.return_value = "File content here"
+        # Mock file reading through the centralized method
+        with patch.object(tool, "_prepare_file_content_for_prompt") as mock_prepare_files:
+            mock_prepare_files.return_value = "File content here"
 
             result = await tool.execute({"prompt": "Analyze this code", "files": ["/path/to/file.py"]})
 
             assert len(result) == 1
             output = json.loads(result[0].text)
             assert output["status"] == "success"
-            mock_read_files.assert_called_once_with(["/path/to/file.py"])
+            mock_prepare_files.assert_called_once_with(["/path/to/file.py"], None, "Context files")
 
     @pytest.mark.asyncio
     async def test_thinkdeep_normal_analysis(self, mock_model_response):
@@ -42,6 +42,8 @@ class AnalyzeTool(BaseTool):
         )
 
     def get_input_schema(self) -> dict[str, Any]:
+        from config import DEFAULT_MODEL
+
         return {
             "type": "object",
             "properties": {
@@ -50,6 +52,10 @@ class AnalyzeTool(BaseTool):
                     "items": {"type": "string"},
                     "description": "Files or directories to analyze (must be absolute paths)",
                 },
+                "model": {
+                    "type": "string",
+                    "description": f"Model to use: 'pro' (Gemini 2.5 Pro with extended thinking) or 'flash' (Gemini 2.0 Flash - faster). Defaults to '{DEFAULT_MODEL}' if not specified.",
+                },
                 "question": {
                     "type": "string",
                     "description": "What to analyze or look for",
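With the new "model" property in the schema, a caller can pick the model per request. A sketch of the argument shape, with values taken from the description string above:

# Sketch: per-request model selection against the analyze tool's schema
arguments = {
    "files": ["/absolute/path/to/src"],  # must be absolute paths
    "question": "Where is dead code?",
    "model": "flash",  # 'pro', 'flash', or a full model name; omit to use DEFAULT_MODEL
}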
183
tools/base.py
@@ -25,7 +25,7 @@ from google.genai import types
 from mcp.types import TextContent
 from pydantic import BaseModel, Field
 
-from config import GEMINI_MODEL, MAX_CONTEXT_TOKENS, MCP_PROMPT_SIZE_LIMIT
+from config import DEFAULT_MODEL, MAX_CONTEXT_TOKENS, MCP_PROMPT_SIZE_LIMIT
 from utils import check_token_limit
 from utils.conversation_memory import (
     MAX_CONVERSATION_TURNS,
@@ -50,7 +50,10 @@ class ToolRequest(BaseModel):
    these common fields.
    """
 
-    model: Optional[str] = Field(None, description="Model to use (defaults to Gemini 2.5 Pro)")
+    model: Optional[str] = Field(
+        None,
+        description=f"Model to use: 'pro' (Gemini 2.5 Pro with extended thinking) or 'flash' (Gemini 2.0 Flash - faster). Defaults to '{DEFAULT_MODEL}' if not specified.",
+    )
     temperature: Optional[float] = Field(None, description="Temperature for response (tool-specific defaults)")
     # Thinking mode controls how much computational budget the model uses for reasoning
     # Higher values allow for more complex reasoning but increase latency and cost
@@ -189,15 +192,18 @@ class BaseTool(ABC):
             # Thread not found, no files embedded
             return []
 
-        return get_conversation_file_list(thread_context)
+        embedded_files = get_conversation_file_list(thread_context)
+        logger.debug(f"[FILES] {self.name}: Found {len(embedded_files)} embedded files")
+        return embedded_files
 
     def filter_new_files(self, requested_files: list[str], continuation_id: Optional[str]) -> list[str]:
         """
         Filter out files that are already embedded in conversation history.
 
-        This method takes a list of requested files and removes any that have
-        already been embedded in the conversation history, preventing duplicate
-        file embeddings and optimizing token usage.
+        This method prevents duplicate file embeddings by filtering out files that have
+        already been embedded in the conversation history. This optimizes token usage
+        while ensuring tools still have logical access to all requested files through
+        conversation history references.
 
         Args:
             requested_files: List of files requested for current tool execution
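The filtering rule itself is a one-liner; reduced to a pure function for clarity, a sketch stripped of the logging and fallbacks the real method carries:

def filter_new(requested: list[str], embedded: set[str]) -> list[str]:
    # keep only files not already embedded in the conversation history
    return [f for f in requested if f not in embedded]

assert filter_new(["a.py", "b.py"], {"a.py"}) == ["b.py"]
assert filter_new(["a.py"], set()) == ["a.py"]  # nothing embedded yet: keep all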
@@ -206,19 +212,64 @@ class BaseTool(ABC):
         Returns:
             list[str]: List of files that need to be embedded (not already in history)
         """
+        logger.debug(f"[FILES] {self.name}: Filtering {len(requested_files)} requested files")
+
         if not continuation_id:
             # New conversation, all files are new
+            logger.debug(f"[FILES] {self.name}: New conversation, all {len(requested_files)} files are new")
             return requested_files
 
-        embedded_files = set(self.get_conversation_embedded_files(continuation_id))
+        try:
+            embedded_files = set(self.get_conversation_embedded_files(continuation_id))
+            logger.debug(f"[FILES] {self.name}: Found {len(embedded_files)} embedded files in conversation")
 
-        # Return only files that haven't been embedded yet
-        new_files = [f for f in requested_files if f not in embedded_files]
+            # Safety check: If no files are marked as embedded but we have a continuation_id,
+            # this might indicate an issue with conversation history. Be conservative.
+            if not embedded_files:
+                logger.debug(
+                    f"📁 {self.name} tool: No files found in conversation history for thread {continuation_id}"
+                )
+                logger.debug(
+                    f"[FILES] {self.name}: No embedded files found, returning all {len(requested_files)} requested files"
+                )
+                return requested_files
 
-        return new_files
+            # Return only files that haven't been embedded yet
+            new_files = [f for f in requested_files if f not in embedded_files]
+            logger.debug(
+                f"[FILES] {self.name}: After filtering: {len(new_files)} new files, {len(requested_files) - len(new_files)} already embedded"
+            )
+            logger.debug(f"[FILES] {self.name}: New files to embed: {new_files}")
+
+            # Log filtering results for debugging
+            if len(new_files) < len(requested_files):
+                skipped = [f for f in requested_files if f in embedded_files]
+                logger.debug(
+                    f"📁 {self.name} tool: Filtering {len(skipped)} files already in conversation history: {', '.join(skipped)}"
+                )
+                logger.debug(f"[FILES] {self.name}: Skipped (already embedded): {skipped}")
+
+            return new_files
+
+        except Exception as e:
+            # If there's any issue with conversation history lookup, be conservative
+            # and include all files rather than risk losing access to needed files
+            logger.warning(f"📁 {self.name} tool: Error checking conversation history for {continuation_id}: {e}")
+            logger.warning(f"📁 {self.name} tool: Including all requested files as fallback")
+            logger.debug(
+                f"[FILES] {self.name}: Exception in filter_new_files, returning all {len(requested_files)} files as fallback"
+            )
+            return requested_files
 
     def _prepare_file_content_for_prompt(
-        self, request_files: list[str], continuation_id: Optional[str], context_description: str = "New files"
+        self,
+        request_files: list[str],
+        continuation_id: Optional[str],
+        context_description: str = "New files",
+        max_tokens: Optional[int] = None,
+        reserve_tokens: int = 1_000,
+        remaining_budget: Optional[int] = None,
+        arguments: Optional[dict] = None,
     ) -> str:
         """
         Centralized file processing for tool prompts.
@@ -232,6 +283,10 @@ class BaseTool(ABC):
             request_files: List of files requested for current tool execution
             continuation_id: Thread continuation ID, or None for new conversations
             context_description: Description for token limit validation (e.g. "Code", "New files")
+            max_tokens: Maximum tokens to use (defaults to remaining budget or MAX_CONTENT_TOKENS)
+            reserve_tokens: Tokens to reserve for additional prompt content (default 1K)
+            remaining_budget: Remaining token budget after conversation history (from server.py)
+            arguments: Original tool arguments (used to extract _remaining_tokens if available)
 
         Returns:
             str: Formatted file content string ready for prompt inclusion
@@ -239,15 +294,40 @@ class BaseTool(ABC):
         if not request_files:
             return ""
 
+        # Extract remaining budget from arguments if available
+        if remaining_budget is None:
+            # Use provided arguments or fall back to stored arguments from execute()
+            args_to_use = arguments or getattr(self, "_current_arguments", {})
+            remaining_budget = args_to_use.get("_remaining_tokens")
+
+        # Use remaining budget if provided, otherwise fall back to max_tokens or default
+        if remaining_budget is not None:
+            effective_max_tokens = remaining_budget - reserve_tokens
+        elif max_tokens is not None:
+            effective_max_tokens = max_tokens - reserve_tokens
+        else:
+            from config import MAX_CONTENT_TOKENS
+
+            effective_max_tokens = MAX_CONTENT_TOKENS - reserve_tokens
+
+        # Ensure we have a reasonable minimum budget
+        effective_max_tokens = max(1000, effective_max_tokens)
+
         files_to_embed = self.filter_new_files(request_files, continuation_id)
+        logger.debug(f"[FILES] {self.name}: Will embed {len(files_to_embed)} files after filtering")
 
         content_parts = []
 
         # Read content of new files only
         if files_to_embed:
             logger.debug(f"📁 {self.name} tool embedding {len(files_to_embed)} new files: {', '.join(files_to_embed)}")
+            logger.debug(
+                f"[FILES] {self.name}: Starting file embedding with token budget {effective_max_tokens + reserve_tokens:,}"
+            )
             try:
-                file_content = read_files(files_to_embed)
+                file_content = read_files(
+                    files_to_embed, max_tokens=effective_max_tokens + reserve_tokens, reserve_tokens=reserve_tokens
+                )
                 self._validate_token_limit(file_content, context_description)
                 content_parts.append(file_content)
 
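The budget resolution above follows a strict precedence: server-provided remaining budget, then an explicit max_tokens, then the global cap. A pure-function sketch; the 800_000 default stands in for MAX_CONTENT_TOKENS and is an assumption:

def effective_budget(remaining_budget=None, max_tokens=None,
                     reserve_tokens=1_000, default_cap=800_000):
    if remaining_budget is not None:
        budget = remaining_budget - reserve_tokens  # server-provided budget wins
    elif max_tokens is not None:
        budget = max_tokens - reserve_tokens
    else:
        budget = default_cap - reserve_tokens  # fall back to the global cap
    return max(1000, budget)  # never drop below a usable floor

assert effective_budget(max_tokens=50_000) == 49_000
assert effective_budget(remaining_budget=20_000, max_tokens=50_000) == 19_000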
@@ -258,9 +338,13 @@ class BaseTool(ABC):
                 logger.debug(
                     f"📁 {self.name} tool successfully embedded {len(files_to_embed)} files ({content_tokens:,} tokens)"
                 )
+                logger.debug(f"[FILES] {self.name}: Successfully embedded files - {content_tokens:,} tokens used")
             except Exception as e:
                 logger.error(f"📁 {self.name} tool failed to embed files {files_to_embed}: {type(e).__name__}: {e}")
+                logger.debug(f"[FILES] {self.name}: File embedding failed - {type(e).__name__}: {e}")
                 raise
+        else:
+            logger.debug(f"[FILES] {self.name}: No files to embed after filtering")
 
         # Generate note about files already in conversation history
         if continuation_id and len(files_to_embed) < len(request_files):
@@ -270,6 +354,7 @@ class BaseTool(ABC):
             logger.debug(
                 f"📁 {self.name} tool skipping {len(skipped_files)} files already in conversation history: {', '.join(skipped_files)}"
             )
+            logger.debug(f"[FILES] {self.name}: Adding note about {len(skipped_files)} skipped files")
             if content_parts:
                 content_parts.append("\n\n")
             note_lines = [
@@ -279,8 +364,12 @@ class BaseTool(ABC):
                 "--- END NOTE ---",
             ]
             content_parts.append("\n".join(note_lines))
+        else:
+            logger.debug(f"[FILES] {self.name}: No skipped files to note")
 
-        return "".join(content_parts) if content_parts else ""
+        result = "".join(content_parts) if content_parts else ""
+        logger.debug(f"[FILES] {self.name}: _prepare_file_content_for_prompt returning {len(result)} chars")
+        return result
 
     def get_websearch_instruction(self, use_websearch: bool, tool_specific: Optional[str] = None) -> str:
         """
@@ -488,6 +577,9 @@ If any of these would strengthen your analysis, specify what Claude should searc
             List[TextContent]: Formatted response as MCP TextContent objects
         """
         try:
+            # Store arguments for access by helper methods (like _prepare_file_content_for_prompt)
+            self._current_arguments = arguments
+
             # Set up logger for this tool execution
             logger = logging.getLogger(f"tools.{self.name}")
             logger.info(f"Starting {self.name} tool execution with arguments: {list(arguments.keys())}")
@@ -536,7 +628,7 @@ If any of these would strengthen your analysis, specify what Claude should searc
             # No need to rebuild it here - prompt already contains conversation history
 
             # Extract model configuration from request or use defaults
-            model_name = getattr(request, "model", None) or GEMINI_MODEL
+            model_name = getattr(request, "model", None) or DEFAULT_MODEL
             temperature = getattr(request, "temperature", None)
             if temperature is None:
                 temperature = self.get_default_temperature()
@@ -580,11 +672,29 @@ If any of these would strengthen your analysis, specify what Claude should searc
             # Catch all exceptions to prevent server crashes
             # Return error information in standardized format
             logger = logging.getLogger(f"tools.{self.name}")
-            logger.error(f"Error in {self.name} tool execution: {str(e)}", exc_info=True)
+            error_msg = str(e)
+
+            # Check if this is a 500 INTERNAL error that asks for retry
+            if "500 INTERNAL" in error_msg and "Please retry" in error_msg:
+                logger.warning(f"500 INTERNAL error in {self.name} - attempting retry")
+                try:
+                    # Single retry attempt
+                    model = self._get_model_wrapper(request)
+                    raw_response = await model.generate_content(prompt)
+                    response = raw_response.text
+
+                    # If successful, process normally
+                    return [TextContent(type="text", text=self._process_response(response, request).model_dump_json())]
+
+                except Exception as retry_e:
+                    logger.error(f"Retry failed for {self.name} tool: {str(retry_e)}")
+                    error_msg = f"Tool failed after retry: {str(retry_e)}"
+
+            logger.error(f"Error in {self.name} tool execution: {error_msg}", exc_info=True)
 
             error_output = ToolOutput(
                 status="error",
-                content=f"Error in {self.name}: {str(e)}",
+                content=f"Error in {self.name}: {error_msg}",
                 content_type="text",
             )
             return [TextContent(type="text", text=error_output.model_dump_json())]
@@ -811,18 +921,24 @@ If any of these would strengthen your analysis, specify what Claude should searc
         Returns:
             Dict with continuation data if opportunity should be offered, None otherwise
         """
-        # Only offer continuation for new conversations (not already threaded)
         continuation_id = getattr(request, "continuation_id", None)
-        if continuation_id:
-            # This is already a threaded conversation, don't offer continuation
-            # (either Gemini will ask follow-up or conversation naturally ends)
-            return None
 
-        # Only offer if we haven't reached conversation limits
         try:
-            # For new conversations, we have MAX_CONVERSATION_TURNS - 1 remaining
-            # (since this response will be turn 1)
-            remaining_turns = MAX_CONVERSATION_TURNS - 1
+            if continuation_id:
+                # Check remaining turns in existing thread
+                from utils.conversation_memory import get_thread
+
+                context = get_thread(continuation_id)
+                if context:
+                    current_turns = len(context.turns)
+                    remaining_turns = MAX_CONVERSATION_TURNS - current_turns - 1  # -1 for this response
+                else:
+                    # Thread not found, don't offer continuation
+                    return None
+            else:
+                # New conversation, we have MAX_CONVERSATION_TURNS - 1 remaining
+                # (since this response will be turn 1)
+                remaining_turns = MAX_CONVERSATION_TURNS - 1
 
             if remaining_turns <= 0:
                 return None
@@ -951,13 +1067,22 @@ If any of these would strengthen your analysis, specify what Claude should searc
         temperature and thinking budget configuration for models that support it.
 
         Args:
-            model_name: Name of the Gemini model to use
+            model_name: Name of the Gemini model to use (or shorthand like 'flash', 'pro')
             temperature: Temperature setting for response generation
             thinking_mode: Thinking depth mode (affects computational budget)
 
         Returns:
             Model instance configured and ready for generation
         """
+        # Define model shorthands for user convenience
+        model_shorthands = {
+            "pro": "gemini-2.5-pro-preview-06-05",
+            "flash": "gemini-2.0-flash-exp",
+        }
+
+        # Resolve shorthand to full model name
+        resolved_model_name = model_shorthands.get(model_name.lower(), model_name)
+
         # Map thinking modes to computational budget values
         # Higher budgets allow for more complex reasoning but increase latency
         thinking_budgets = {

@@ -972,7 +1097,7 @@ If any of these would strengthen your analysis, specify what Claude should searc
 
         # Gemini 2.5 models support thinking configuration for enhanced reasoning
         # Skip special handling in test environment to allow mocking
-        if "2.5" in model_name and not os.environ.get("PYTEST_CURRENT_TEST"):
+        if "2.5" in resolved_model_name and not os.environ.get("PYTEST_CURRENT_TEST"):
             try:
                 # Retrieve API key for Gemini client creation
                 api_key = os.environ.get("GEMINI_API_KEY")

@@ -1031,7 +1156,7 @@ If any of these would strengthen your analysis, specify what Claude should searc
 
                     return ResponseWrapper(response.text)
 
-                return ModelWrapper(client, model_name, temperature, thinking_budget)
+                return ModelWrapper(client, resolved_model_name, temperature, thinking_budget)
 
             except Exception:
                 # Fall back to regular API if thinking configuration fails

@@ -1084,4 +1209,4 @@ If any of these would strengthen your analysis, specify what Claude should searc
 
             return ResponseWrapper(response.text)
 
-        return SimpleModelWrapper(client, model_name, temperature)
+        return SimpleModelWrapper(client, resolved_model_name, temperature)
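The shorthand resolution is a plain dict lookup with a pass-through default, so full model IDs keep working unchanged. Reduced to its core:

    # Standalone sketch of the shorthand resolution added above;
    # the mapping mirrors the diff, the fallback is the key point.
    MODEL_SHORTHANDS = {
        "pro": "gemini-2.5-pro-preview-06-05",
        "flash": "gemini-2.0-flash-exp",
    }

    def resolve_model_name(model_name: str) -> str:
        # Unknown names pass through unchanged.
        return MODEL_SHORTHANDS.get(model_name.lower(), model_name)

    assert resolve_model_name("Pro") == "gemini-2.5-pro-preview-06-05"
    assert resolve_model_name("gemini-2.0-flash-exp") == "gemini-2.0-flash-exp"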
@@ -9,7 +9,6 @@ from pydantic import Field
 
 from config import TEMPERATURE_BALANCED
 from prompts import CHAT_PROMPT
-from utils import read_files
 
 from .base import BaseTool, ToolRequest
 from .models import ToolOutput

@@ -45,6 +44,8 @@ class ChatTool(BaseTool):
         )
 
     def get_input_schema(self) -> dict[str, Any]:
+        from config import DEFAULT_MODEL
+
         return {
             "type": "object",
             "properties": {

@@ -57,6 +58,10 @@ class ChatTool(BaseTool):
                     "items": {"type": "string"},
                     "description": "Optional files for context (must be absolute paths)",
                 },
+                "model": {
+                    "type": "string",
+                    "description": f"Model to use: 'pro' (Gemini 2.5 Pro with extended thinking) or 'flash' (Gemini 2.0 Flash - faster). Defaults to '{DEFAULT_MODEL}' if not specified.",
+                },
                 "temperature": {
                     "type": "number",
                     "description": "Response creativity (0-1, default 0.5)",

@@ -116,10 +121,13 @@ class ChatTool(BaseTool):
         if updated_files is not None:
             request.files = updated_files
 
-        # Add context files if provided
+        # Add context files if provided (using centralized file handling with filtering)
         if request.files:
-            file_content = read_files(request.files)
-            user_content = f"{user_content}\n\n=== CONTEXT FILES ===\n{file_content}\n=== END CONTEXT ===="
+            file_content = self._prepare_file_content_for_prompt(
+                request.files, request.continuation_id, "Context files"
+            )
+            if file_content:
+                user_content = f"{user_content}\n\n=== CONTEXT FILES ===\n{file_content}\n=== END CONTEXT ===="
 
         # Check token limits
         self._validate_token_limit(user_content, "Content")
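The same `model` property is added to each tool schema below (codereview, debug, precommit, thinkdeep). A sketch of the generated property, with DEFAULT_MODEL given a placeholder value here:

    # Illustrative reconstruction of the schema property; the value of
    # DEFAULT_MODEL is assumed, the description text mirrors the diff.
    DEFAULT_MODEL = "gemini-2.5-pro-preview-06-05"

    def model_property() -> dict:
        return {
            "type": "string",
            "description": (
                "Model to use: 'pro' (Gemini 2.5 Pro with extended thinking) "
                f"or 'flash' (Gemini 2.0 Flash - faster). Defaults to '{DEFAULT_MODEL}' "
                "if not specified."
            ),
        }

    schema = {"type": "object", "properties": {"model": model_property()}}
    print(schema["properties"]["model"]["description"])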
@@ -79,6 +79,8 @@ class CodeReviewTool(BaseTool):
         )
 
     def get_input_schema(self) -> dict[str, Any]:
+        from config import DEFAULT_MODEL
+
         return {
             "type": "object",
             "properties": {

@@ -87,6 +89,10 @@ class CodeReviewTool(BaseTool):
                     "items": {"type": "string"},
                     "description": "Code files or directories to review (must be absolute paths)",
                 },
+                "model": {
+                    "type": "string",
+                    "description": f"Model to use: 'pro' (Gemini 2.5 Pro with extended thinking) or 'flash' (Gemini 2.0 Flash - faster). Defaults to '{DEFAULT_MODEL}' if not specified.",
+                },
                 "context": {
                     "type": "string",
                     "description": "User's summary of what the code does, expected behavior, constraints, and review objectives",
@@ -50,6 +50,8 @@ class DebugIssueTool(BaseTool):
         )
 
     def get_input_schema(self) -> dict[str, Any]:
+        from config import DEFAULT_MODEL
+
         return {
             "type": "object",
             "properties": {

@@ -57,6 +59,10 @@ class DebugIssueTool(BaseTool):
                     "type": "string",
                     "description": "Error message, symptoms, or issue description",
                 },
+                "model": {
+                    "type": "string",
+                    "description": f"Model to use: 'pro' (Gemini 2.5 Pro with extended thinking) or 'flash' (Gemini 2.0 Flash - faster). Defaults to '{DEFAULT_MODEL}' if not specified.",
+                },
                 "error_context": {
                     "type": "string",
                     "description": "Stack trace, logs, or additional error context",
@@ -1,5 +1,11 @@
 """
 Tool for pre-commit validation of git changes across multiple repositories.
+
+Design Note - File Content in Multiple Sections:
+Files may legitimately appear in both "Git Diffs" and "Additional Context Files" sections:
+- Git Diffs: Shows changed lines + limited context (marked with "BEGIN DIFF" / "END DIFF")
+- Additional Context: Shows complete file content (marked with "BEGIN FILE" / "END FILE")
+This provides comprehensive context for AI analysis - not a duplication bug.
 """
 
 import os

@@ -10,7 +16,7 @@ from pydantic import Field
 
 from config import MAX_CONTEXT_TOKENS
 from prompts.tool_prompts import PRECOMMIT_PROMPT
-from utils.file_utils import read_files, translate_file_paths, translate_path_for_environment
+from utils.file_utils import translate_file_paths, translate_path_for_environment
 from utils.git_utils import find_git_repositories, get_git_status, run_git_command
 from utils.token_utils import estimate_tokens

@@ -92,7 +98,15 @@ class Precommit(BaseTool):
         )
 
     def get_input_schema(self) -> dict[str, Any]:
+        from config import DEFAULT_MODEL
+
         schema = self.get_request_model().model_json_schema()
+        # Ensure model parameter has enhanced description
+        if "properties" in schema and "model" in schema["properties"]:
+            schema["properties"]["model"] = {
+                "type": "string",
+                "description": f"Model to use: 'pro' (Gemini 2.5 Pro with extended thinking) or 'flash' (Gemini 2.0 Flash - faster). Defaults to '{DEFAULT_MODEL}' if not specified.",
+            }
         # Ensure use_websearch is in the schema with proper description
         if "properties" in schema and "use_websearch" not in schema["properties"]:
             schema["properties"]["use_websearch"] = {

@@ -239,9 +253,12 @@ class Precommit(BaseTool):
                 staged_files = [f for f in files_output.strip().split("\n") if f]
 
                 # Generate per-file diffs for staged changes
+                # Each diff is wrapped with clear markers to distinguish from full file content
                 for file_path in staged_files:
                     success, diff = run_git_command(repo_path, ["diff", "--cached", "--", file_path])
                     if success and diff.strip():
+                        # Use "BEGIN DIFF" markers (distinct from "BEGIN FILE" markers in utils/file_utils.py)
+                        # This allows AI to distinguish between diff context vs complete file content
                         diff_header = f"\n--- BEGIN DIFF: {repo_name} / {file_path} (staged) ---\n"
                         diff_footer = f"\n--- END DIFF: {repo_name} / {file_path} ---\n"
                         formatted_diff = diff_header + diff + diff_footer
@@ -258,6 +275,7 @@ class Precommit(BaseTool):
                 unstaged_files = [f for f in files_output.strip().split("\n") if f]
 
                 # Generate per-file diffs for unstaged changes
+                # Same clear marker pattern as staged changes above
                 for file_path in unstaged_files:
                     success, diff = run_git_command(repo_path, ["diff", "--", file_path])
                     if success and diff.strip():

@@ -298,10 +316,12 @@ class Precommit(BaseTool):
         if translated_files:
             remaining_tokens = max_tokens - total_tokens
 
-            # Use standardized file reading with token budget
-            file_content = read_files(
+            # Use centralized file handling with filtering for duplicate prevention
+            file_content = self._prepare_file_content_for_prompt(
                 translated_files,
-                max_tokens=remaining_tokens,
+                request.continuation_id,
+                "Context files",
+                max_tokens=remaining_tokens + 1000,  # Add back the reserve that was calculated
                 reserve_tokens=1000,  # Small reserve for formatting
             )
 

@@ -370,7 +390,8 @@ class Precommit(BaseTool):
         if total_tokens > 0:
             prompt_parts.append(f"\nTotal context tokens used: ~{total_tokens:,}")
 
-        # Add the diff contents
+        # Add the diff contents with clear section markers
+        # Each diff is wrapped with "--- BEGIN DIFF: ... ---" and "--- END DIFF: ... ---"
         prompt_parts.append("\n## Git Diffs\n")
         if all_diffs:
             prompt_parts.extend(all_diffs)

@@ -378,6 +399,11 @@ class Precommit(BaseTool):
             prompt_parts.append("--- NO DIFFS FOUND ---")
 
         # Add context files content if provided
+        # IMPORTANT: Files may legitimately appear in BOTH sections:
+        # - Git Diffs: Show only changed lines + limited context (what changed)
+        # - Additional Context: Show complete file content (full understanding)
+        # This is intentional design for comprehensive AI analysis, not duplication bug.
+        # Each file in this section is wrapped with "--- BEGIN FILE: ... ---" and "--- END FILE: ... ---"
         if context_files_content:
             prompt_parts.append("\n## Additional Context Files")
             prompt_parts.append(
@@ -48,6 +48,8 @@ class ThinkDeepTool(BaseTool):
         )
 
     def get_input_schema(self) -> dict[str, Any]:
+        from config import DEFAULT_MODEL
+
         return {
             "type": "object",
             "properties": {

@@ -55,6 +57,10 @@ class ThinkDeepTool(BaseTool):
                     "type": "string",
                     "description": "Your current thinking/analysis to extend and validate",
                 },
+                "model": {
+                    "type": "string",
+                    "description": f"Model to use: 'pro' (Gemini 2.5 Pro with extended thinking) or 'flash' (Gemini 2.0 Flash - faster). Defaults to '{DEFAULT_MODEL}' if not specified.",
+                },
                 "problem_context": {
                     "type": "string",
                     "description": "Additional context about the problem or goal",

@@ -78,8 +84,7 @@ class ThinkDeepTool(BaseTool):
                 "thinking_mode": {
                     "type": "string",
                     "enum": ["minimal", "low", "medium", "high", "max"],
-                    "description": "Thinking depth: minimal (128), low (2048), medium (8192), high (16384), max (32768)",
-                    "default": "high",
+                    "description": f"Thinking depth: minimal (128), low (2048), medium (8192), high (16384), max (32768). Defaults to '{self.get_default_thinking_mode()}' if not specified.",
                 },
                 "use_websearch": {
                     "type": "boolean",

@@ -101,8 +106,10 @@ class ThinkDeepTool(BaseTool):
         return TEMPERATURE_CREATIVE
 
     def get_default_thinking_mode(self) -> str:
-        """ThinkDeep uses high thinking by default"""
-        return "high"
+        """ThinkDeep uses configurable thinking mode, defaults to high"""
+        from config import DEFAULT_THINKING_MODE_THINKDEEP
+
+        return DEFAULT_THINKING_MODE_THINKDEEP
 
     def get_request_model(self):
         return ThinkDeepRequest

@@ -250,12 +250,16 @@ def add_turn(
     - Turn limits prevent runaway conversations
     - File references are preserved for cross-tool access
     """
+    logger.debug(f"[FLOW] Adding {role} turn to {thread_id} ({tool_name})")
+
     context = get_thread(thread_id)
     if not context:
+        logger.debug(f"[FLOW] Thread {thread_id} not found for turn addition")
         return False
 
     # Check turn limit to prevent runaway conversations
     if len(context.turns) >= MAX_CONVERSATION_TURNS:
+        logger.debug(f"[FLOW] Thread {thread_id} at max turns ({MAX_CONVERSATION_TURNS})")
         return False
 
     # Create new turn with complete metadata

@@ -277,7 +281,8 @@ def add_turn(
         key = f"thread:{thread_id}"
         client.setex(key, 3600, context.model_dump_json())  # Refresh TTL to 1 hour
         return True
-    except Exception:
+    except Exception as e:
+        logger.debug(f"[FLOW] Failed to save turn to Redis: {type(e).__name__}")
         return False
 
 
@@ -296,23 +301,33 @@ def get_conversation_file_list(context: ThreadContext) -> list[str]:
         list[str]: Deduplicated list of file paths referenced in the conversation
     """
     if not context.turns:
+        logger.debug("[FILES] No turns found, returning empty file list")
         return []
 
     # Collect all unique files from all turns, preserving order of first appearance
     seen_files = set()
     unique_files = []
 
-    for turn in context.turns:
+    logger.debug(f"[FILES] Collecting files from {len(context.turns)} turns")
+
+    for i, turn in enumerate(context.turns):
         if turn.files:
+            logger.debug(f"[FILES] Turn {i+1} has {len(turn.files)} files: {turn.files}")
             for file_path in turn.files:
                 if file_path not in seen_files:
                     seen_files.add(file_path)
                     unique_files.append(file_path)
+                    logger.debug(f"[FILES] Added new file: {file_path}")
+                else:
+                    logger.debug(f"[FILES] Duplicate file skipped: {file_path}")
+        else:
+            logger.debug(f"[FILES] Turn {i+1} has no files")
 
+    logger.debug(f"[FILES] Final unique file list ({len(unique_files)}): {unique_files}")
     return unique_files
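The collection logic is an order-preserving dedup. Reduced to its core, with turns modeled as plain lists of paths:

    # Order-preserving dedup as in get_conversation_file_list above.
    def unique_files(turns: list[list[str]]) -> list[str]:
        seen: set[str] = set()
        ordered: list[str] = []
        for files in turns:
            for path in files:
                if path not in seen:  # keep first appearance only
                    seen.add(path)
                    ordered.append(path)
        return ordered

    print(unique_files([["a.py", "b.py"], ["b.py", "c.py"]]))  # ['a.py', 'b.py', 'c.py']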
 
 
-def build_conversation_history(context: ThreadContext, read_files_func=None) -> str:
+def build_conversation_history(context: ThreadContext, read_files_func=None) -> tuple[str, int]:
     """
     Build formatted conversation history for tool prompts with embedded file contents.
 
@@ -325,8 +340,8 @@ def build_conversation_history(context: ThreadContext, read_files_func=None) ->
         context: ThreadContext containing the complete conversation
 
     Returns:
-        str: Formatted conversation history with embedded files ready for inclusion in prompts
-        Empty string if no conversation turns exist
+        tuple[str, int]: (formatted_conversation_history, total_tokens_used)
+        Returns ("", 0) if no conversation turns exist
 
     Format:
         - Header with thread metadata and turn count

@@ -341,10 +356,11 @@ def build_conversation_history(context: ThreadContext, read_files_func=None) ->
     while preventing duplicate file embeddings.
     """
     if not context.turns:
-        return ""
+        return "", 0
 
     # Get all unique files referenced in this conversation
     all_files = get_conversation_file_list(context)
+    logger.debug(f"[FILES] Found {len(all_files)} unique files in conversation history")
 
     history_parts = [
         "=== CONVERSATION HISTORY ===",

@@ -356,6 +372,7 @@ def build_conversation_history(context: ThreadContext, read_files_func=None) ->
 
     # Embed all files referenced in this conversation once at the start
     if all_files:
+        logger.debug(f"[FILES] Starting embedding for {len(all_files)} files")
         history_parts.extend(
             [
                 "=== FILES REFERENCED IN THIS CONVERSATION ===",

@@ -366,7 +383,7 @@ def build_conversation_history(context: ThreadContext, read_files_func=None) ->
         )
 
         # Import required functions
-        from config import MAX_CONTEXT_TOKENS
+        from config import MAX_CONTENT_TOKENS
 
         if read_files_func is None:
             from utils.file_utils import read_file_content

@@ -379,32 +396,41 @@ def build_conversation_history(context: ThreadContext, read_files_func=None) ->
 
             for file_path in all_files:
                 try:
+                    logger.debug(f"[FILES] Processing file {file_path}")
                     # Correctly unpack the tuple returned by read_file_content
                     formatted_content, content_tokens = read_file_content(file_path)
                     if formatted_content:
                         # read_file_content already returns formatted content, use it directly
                         # Check if adding this file would exceed the limit
-                        if total_tokens + content_tokens <= MAX_CONTEXT_TOKENS:
+                        if total_tokens + content_tokens <= MAX_CONTENT_TOKENS:
                             file_contents.append(formatted_content)
                             total_tokens += content_tokens
                             files_included += 1
                             logger.debug(
                                 f"📄 File embedded in conversation history: {file_path} ({content_tokens:,} tokens)"
                             )
+                            logger.debug(
+                                f"[FILES] Successfully embedded {file_path} - {content_tokens:,} tokens (total: {total_tokens:,})"
+                            )
                         else:
                             files_truncated += 1
                             logger.debug(
-                                f"📄 File truncated due to token limit: {file_path} ({content_tokens:,} tokens, would exceed {MAX_CONTEXT_TOKENS:,} limit)"
+                                f"📄 File truncated due to token limit: {file_path} ({content_tokens:,} tokens, would exceed {MAX_CONTENT_TOKENS:,} limit)"
+                            )
+                            logger.debug(
+                                f"[FILES] File {file_path} would exceed token limit - skipping (would be {total_tokens + content_tokens:,} tokens)"
                             )
                             # Stop processing more files
                             break
                     else:
                         logger.debug(f"📄 File skipped (empty content): {file_path}")
+                        logger.debug(f"[FILES] File {file_path} has empty content - skipping")
                 except Exception as e:
                     # Skip files that can't be read but log the failure
                     logger.warning(
                         f"📄 Failed to embed file in conversation history: {file_path} - {type(e).__name__}: {e}"
                     )
+                    logger.debug(f"[FILES] Failed to read file {file_path} - {type(e).__name__}: {e}")
                     continue
 
             if file_contents:

@@ -417,11 +443,15 @@ def build_conversation_history(context: ThreadContext, read_files_func=None) ->
                 logger.debug(
                     f"📄 Conversation history file embedding complete: {files_included} files embedded, {files_truncated} truncated, {total_tokens:,} total tokens"
                 )
+                logger.debug(
+                    f"[FILES] File embedding summary - {files_included} embedded, {files_truncated} truncated, {total_tokens:,} tokens total"
+                )
             else:
                 history_parts.append("(No accessible files found)")
                 logger.debug(
                     f"📄 Conversation history file embedding: no accessible files found from {len(all_files)} requested"
                 )
+                logger.debug(f"[FILES] No accessible files found from {len(all_files)} requested files")
         else:
             # Fallback to original read_files function for backward compatibility
             files_content = read_files_func(all_files)

@@ -434,7 +464,7 @@ def build_conversation_history(context: ThreadContext, read_files_func=None) ->
                 history_parts.append(files_content)
             else:
                 # Handle token limit exceeded for conversation files
-                error_message = f"ERROR: The total size of files referenced in this conversation has exceeded the context limit and cannot be displayed.\nEstimated tokens: {estimated_tokens}, but limit is {MAX_CONTEXT_TOKENS}."
+                error_message = f"ERROR: The total size of files referenced in this conversation has exceeded the context limit and cannot be displayed.\nEstimated tokens: {estimated_tokens}, but limit is {MAX_CONTENT_TOKENS}."
                 history_parts.append(error_message)
     else:
         history_parts.append("(No accessible files found)")

@@ -476,7 +506,20 @@ def build_conversation_history(context: ThreadContext, read_files_func=None) ->
         ["", "=== END CONVERSATION HISTORY ===", "", "Continue this conversation by building on the previous context."]
     )
 
-    return "\n".join(history_parts)
+    # Calculate total tokens for the complete conversation history
+    complete_history = "\n".join(history_parts)
+    from utils.token_utils import estimate_tokens
+
+    total_conversation_tokens = estimate_tokens(complete_history)
+
+    # Summary log of what was built
+    user_turns = len([t for t in context.turns if t.role == "user"])
+    assistant_turns = len([t for t in context.turns if t.role == "assistant"])
+    logger.debug(
+        f"[FLOW] Built conversation history: {user_turns} user + {assistant_turns} assistant turns, {len(all_files)} files, {total_conversation_tokens:,} tokens"
+    )
+
+    return complete_history, total_conversation_tokens
 
 
 def _is_valid_uuid(val: str) -> bool:

@@ -422,11 +422,14 @@ def read_file_content(file_path: str, max_size: int = 1_000_000) -> tuple[str, i
         Tuple of (formatted_content, estimated_tokens)
         Content is wrapped with clear delimiters for AI parsing
     """
+    logger.debug(f"[FILES] read_file_content called for: {file_path}")
     try:
         # Validate path security before any file operations
         path = resolve_and_validate_path(file_path)
+        logger.debug(f"[FILES] Path validated and resolved: {path}")
     except (ValueError, PermissionError) as e:
         # Return error in a format that provides context to the AI
+        logger.debug(f"[FILES] Path validation failed for {file_path}: {type(e).__name__}: {e}")
         error_msg = str(e)
         # Add Docker-specific help if we're in Docker and path is inaccessible
         if WORKSPACE_ROOT and CONTAINER_WORKSPACE.exists():

@@ -438,37 +441,54 @@ def read_file_content(file_path: str, max_size: int = 1_000_000) -> tuple[str, i
                 f"To access files in a different directory, please run Claude from that directory."
             )
         content = f"\n--- ERROR ACCESSING FILE: {file_path} ---\nError: {error_msg}\n--- END FILE ---\n"
-        return content, estimate_tokens(content)
+        tokens = estimate_tokens(content)
+        logger.debug(f"[FILES] Returning error content for {file_path}: {tokens} tokens")
+        return content, tokens
 
     try:
         # Validate file existence and type
         if not path.exists():
+            logger.debug(f"[FILES] File does not exist: {file_path}")
             content = f"\n--- FILE NOT FOUND: {file_path} ---\nError: File does not exist\n--- END FILE ---\n"
             return content, estimate_tokens(content)
 
         if not path.is_file():
+            logger.debug(f"[FILES] Path is not a file: {file_path}")
            content = f"\n--- NOT A FILE: {file_path} ---\nError: Path is not a file\n--- END FILE ---\n"
             return content, estimate_tokens(content)
 
         # Check file size to prevent memory exhaustion
         file_size = path.stat().st_size
+        logger.debug(f"[FILES] File size for {file_path}: {file_size:,} bytes")
         if file_size > max_size:
+            logger.debug(f"[FILES] File too large: {file_path} ({file_size:,} > {max_size:,} bytes)")
             content = f"\n--- FILE TOO LARGE: {file_path} ---\nFile size: {file_size:,} bytes (max: {max_size:,})\n--- END FILE ---\n"
             return content, estimate_tokens(content)
 
         # Read the file with UTF-8 encoding, replacing invalid characters
         # This ensures we can handle files with mixed encodings
+        logger.debug(f"[FILES] Reading file content for {file_path}")
         with open(path, encoding="utf-8", errors="replace") as f:
             file_content = f.read()
 
+        logger.debug(f"[FILES] Successfully read {len(file_content)} characters from {file_path}")
+
         # Format with clear delimiters that help the AI understand file boundaries
         # Using consistent markers makes it easier for the model to parse
+        # NOTE: These markers ("--- BEGIN FILE: ... ---") are distinct from git diff markers
+        # ("--- BEGIN DIFF: ... ---") to allow AI to distinguish between complete file content
+        # vs. partial diff content when files appear in both sections
         formatted = f"\n--- BEGIN FILE: {file_path} ---\n{file_content}\n--- END FILE: {file_path} ---\n"
-        return formatted, estimate_tokens(formatted)
+        tokens = estimate_tokens(formatted)
+        logger.debug(f"[FILES] Formatted content for {file_path}: {len(formatted)} chars, {tokens} tokens")
+        return formatted, tokens
 
     except Exception as e:
+        logger.debug(f"[FILES] Exception reading file {file_path}: {type(e).__name__}: {e}")
         content = f"\n--- ERROR READING FILE: {file_path} ---\nError: {str(e)}\n--- END FILE ---\n"
-        return content, estimate_tokens(content)
+        tokens = estimate_tokens(content)
+        logger.debug(f"[FILES] Returning error content for {file_path}: {tokens} tokens")
+        return content, tokens
 
 
 def read_files(
@@ -497,6 +517,11 @@ def read_files(
     if max_tokens is None:
         max_tokens = MAX_CONTEXT_TOKENS
 
+    logger.debug(f"[FILES] read_files called with {len(file_paths)} paths")
+    logger.debug(
+        f"[FILES] Token budget: max={max_tokens:,}, reserve={reserve_tokens:,}, available={max_tokens - reserve_tokens:,}"
+    )
+
     content_parts = []
     total_tokens = 0
     available_tokens = max_tokens - reserve_tokens

@@ -517,31 +542,42 @@ def read_files(
     # Priority 2: Process file paths
     if file_paths:
         # Expand directories to get all individual files
+        logger.debug(f"[FILES] Expanding {len(file_paths)} file paths")
         all_files = expand_paths(file_paths)
+        logger.debug(f"[FILES] After expansion: {len(all_files)} individual files")
 
         if not all_files and file_paths:
             # No files found but paths were provided
+            logger.debug("[FILES] No files found from provided paths")
             content_parts.append(f"\n--- NO FILES FOUND ---\nProvided paths: {', '.join(file_paths)}\n--- END ---\n")
         else:
             # Read files sequentially until token limit is reached
-            for file_path in all_files:
+            logger.debug(f"[FILES] Reading {len(all_files)} files with token budget {available_tokens:,}")
+            for i, file_path in enumerate(all_files):
                 if total_tokens >= available_tokens:
-                    files_skipped.append(file_path)
-                    continue
+                    logger.debug(f"[FILES] Token budget exhausted, skipping remaining {len(all_files) - i} files")
+                    files_skipped.extend(all_files[i:])
+                    break
 
                 file_content, file_tokens = read_file_content(file_path)
+                logger.debug(f"[FILES] File {file_path}: {file_tokens:,} tokens")
 
                 # Check if adding this file would exceed limit
                 if total_tokens + file_tokens <= available_tokens:
                     content_parts.append(file_content)
                     total_tokens += file_tokens
+                    logger.debug(f"[FILES] Added file {file_path}, total tokens: {total_tokens:,}")
                 else:
                     # File too large for remaining budget
+                    logger.debug(
+                        f"[FILES] File {file_path} too large for remaining budget ({file_tokens:,} tokens, {available_tokens - total_tokens:,} remaining)"
+                    )
                     files_skipped.append(file_path)
 
     # Add informative note about skipped files to help users understand
     # what was omitted and why
     if files_skipped:
+        logger.debug(f"[FILES] {len(files_skipped)} files skipped due to token limits")
         skip_note = "\n\n--- SKIPPED FILES (TOKEN LIMIT) ---\n"
         skip_note += f"Total skipped: {len(files_skipped)}\n"
         # Show first 10 skipped files as examples

@@ -552,4 +588,6 @@ def read_files(
         skip_note += "--- END SKIPPED FILES ---\n"
         content_parts.append(skip_note)
 
-    return "\n\n".join(content_parts) if content_parts else ""
+    result = "\n\n".join(content_parts) if content_parts else ""
+    logger.debug(f"[FILES] read_files complete: {len(result)} chars, {total_tokens:,} tokens used")
+    return result