* Support for Custom URLs and custom models, including locally hosted models such as ollama
* Support for native + openrouter + local models (i.e. dozens of models) means you can start delegating sub-tasks to particular models or work to local models such as localizations or other boring work etc.
* Several tests added
* precommit to also include untracked (new) files
* Logfile auto rollover
* Improved logging
Simulation tests to confirm threading and history traversal
Chain of communication and branching validation tests from live simulation
Temperature enforcement per model
## Major Features Added
### 🎯 Dynamic Configuration System
- **Environment-aware model selection**: DEFAULT_MODEL with 'pro'/'flash' shortcuts
- **Configurable thinking modes**: DEFAULT_THINKING_MODE_THINKDEEP for extended reasoning
- **All tool schemas now dynamic**: Show actual current defaults instead of hardcoded values
- **Enhanced setup workflow**: Copy from .env.example with smart customization
### 🔧 Model & Thinking Configuration
- **Smart model resolution**: Support both shortcuts ('pro', 'flash') and full model names
- **Thinking mode optimization**: Only apply thinking budget to models that support it
- **Flash model compatibility**: Works without thinking config, still beneficial via system prompts
- **Dynamic schema descriptions**: Tool parameters show current environment values
### 🚀 Enhanced Developer Experience
- **Fail-fast Docker setup**: GEMINI_API_KEY required upfront in docker-compose
- **Comprehensive startup logging**: Shows current model and thinking mode defaults
- **Enhanced get_version tool**: Reports all dynamic configuration values
- **Better .env documentation**: Clear token consumption details and model options
### 🧪 Comprehensive Testing
- **Live model validation**: New simulator test validates Pro vs Flash thinking behavior
- **Dynamic configuration tests**: Verify environment variable overrides work correctly
- **Complete test coverage**: All 139 unit tests pass, including new model config tests
### 📋 Configuration Files Updated
- **docker-compose.yml**: Fail-fast API key validation, thinking mode support
- **setup-docker.sh**: Copy from .env.example instead of manual creation
- **.env.example**: Detailed documentation with token consumption per thinking mode
- **.gitignore**: Added test-setup/ for cleanup
### 🛠 Technical Improvements
- **Removed setup.py**: Fully Docker-based deployment (no longer needed)
- **REDIS_URL smart defaults**: Auto-configured for Docker, still configurable for dev
- **All tools updated**: Consistent dynamic model parameter descriptions
- **Enhanced error handling**: Better model resolution and validation
## Breaking Changes
- Removed setup.py (Docker-only deployment)
- Model parameter descriptions now show actual defaults (dynamic)
## Migration Guide
- Update .env files using new .env.example format
- Use 'pro'/'flash' shortcuts or full model names
- Set DEFAULT_THINKING_MODE_THINKDEEP for custom thinking depth
🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>
This major refactoring addresses critical bugs in conversation history management
and significantly improves token efficiency through intelligent file embedding:
**Key Improvements:**
• Fixed conversation history duplication bug by centralizing reconstruction in server.py
• Added intelligent file filtering to prevent re-embedding files already in conversation history
• Centralized file processing logic in BaseTool._prepare_file_content_for_prompt()
• Enhanced log monitoring with better categorization and file embedding visibility
• Updated comprehensive test suite to verify new architecture and edge cases
**Architecture Changes:**
• Removed duplicate conversation history reconstruction from tools/base.py
• Conversation history now handled exclusively by server.py:reconstruct_thread_context
• All tools now use centralized file processing with automatic deduplication
• Improved token efficiency by embedding unique files only once per conversation
**Performance Benefits:**
• Reduced token usage through smart file filtering
• Eliminated redundant file embeddings in continued conversations
• Better observability with detailed debug logging for file operations
🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>
This commit addresses several critical issues and improvements:
🔧 Critical Fixes:
- Fixed conversation history not being included when using continuation_id in AI-to-AI conversations
- Fixed test mock targeting issues preventing proper conversation memory validation
- Fixed Docker debug logging functionality with Gemini tools
🐛 Bug Fixes:
- Docker compose configuration for proper container command execution
- Test mock import targeting from utils.conversation_memory.* to tools.base.*
- Version bump to 3.1.0 reflecting significant improvements
🚀 Improvements:
- Enhanced Docker environment configuration with comprehensive logging setup
- Added cross-tool continuation documentation and examples in README
- Improved error handling and validation across all tools
- Better logging configuration with LOG_LEVEL environment variable support
- Enhanced conversation memory system documentation
🧪 Testing:
- Added comprehensive conversation history bug fix tests
- Added cross-tool continuation functionality tests
- All 132 tests now pass with proper conversation history validation
- Improved test coverage for AI-to-AI conversation threading
✨ Code Quality:
- Applied black, isort, and ruff formatting across entire codebase
- Enhanced inline documentation for conversation memory system
- Cleaned up temporary files and improved repository hygiene
- Better test descriptions and coverage for critical functionality
🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>
- Renamed `think_deeper` tool to `thinkdeep` for shorter, cleaner naming
- Updated all imports from ThinkDeeperTool to ThinkDeepTool
- Updated all references from THINK_DEEPER_PROMPT to THINKDEEP_PROMPT
- Updated tool registration in server.py
- Updated all test files to use new naming convention
- Updated README documentation to reflect new tool names
- All functionality remains the same, only naming has changed
This completes the tool renaming refactor for improved clarity and consistency.
🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>
- Renamed `review_code` tool to `codereview` for better naming convention
- Renamed `review_changes` tool to `precommit` to better reflect its purpose
- Updated all tool descriptions to remove "Triggers:" sections and improve clarity
- Updated all imports and references throughout the codebase
- Renamed test files to match new tool names
- Updated server.py tool registrations
- All existing functionality preserved with improved naming
This refactoring improves code organization and makes tool purposes clearer.
🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>
- Rename debug_issue.py to debug.py
- Update tool name from 'debug_issue' to 'debug' throughout codebase
- Update all references in server.py, tests, and README
- Keep DebugIssueTool class name for backward compatibility
- All tests pass with the renamed tool
This makes the tool name shorter and more consistent with other
tool names like 'chat' and 'analyze'. The functionality remains
exactly the same.
🤖 Generated with Claude Code
Co-Authored-By: Claude <noreply@anthropic.com>
Added check in translate_path_for_environment() to detect and skip
already-translated container paths (those starting with /workspace).
This prevents the function from attempting to translate paths like:
- /workspace/src/main.py -> /inaccessible/outside/mounted/volume/workspace/src/main.py
Now it correctly handles:
- Host path: /Users/.../src/main.py -> /workspace/src/main.py (translation)
- Container path: /workspace/src/main.py -> /workspace/src/main.py (no change)
Added comprehensive test to verify double-translation prevention.
🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>
- Fixed review_changes tool to properly translate host paths to container paths in Docker
- Prevents "No such file or directory" errors when running in Docker containers
- Added proper error handling with clear messages when paths are inaccessible
refactor: Centralized token limit validation across all tools
- Added _validate_token_limit method to BaseTool to eliminate code duplication
- Reduced ~25 lines of duplicated code across 5 tools (analyze, chat, debug_issue, review_code, think_deeper)
- Maintains exact same error messages and behavior
feat: Enhanced large prompt handling
- Added support for prompts >50K chars by requesting file-based input
- Preserves MCP's ~25K token capacity for responses
- All tools now check prompt size before processing
test: Added comprehensive Docker path integration tests
- Tests for path translation, security validation, and error handling
- Tests for review_changes tool specifically with Docker paths
- Fixed failing think_deeper test (updated default from "max" to "high")
chore: Code quality improvements
- Applied black formatting across all files
- Fixed import sorting with isort
- All tests passing (96 tests)
- Standardized error handling follows MCP TextContent format
The changes ensure consistent behavior across all environments while reducing code duplication and improving maintainability.
🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>
- Add instruction for Gemini to request files when needed
- Add comprehensive tests for files parameter functionality
- Test file request instruction presence/absence based on context
- Run all tests, ruff, and black formatting
Now review_changes can both accept context files and allow Gemini
to request additional files during review for better validation.
🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>
Major changes:
- Add comprehensive documentation to all modules with detailed docstrings
- Remove unused THINKING_MODEL config (use single GEMINI_MODEL with thinking_mode param)
- Remove list_models functionality (simplified to single model configuration)
- Rename DEFAULT_MODEL to GEMINI_MODEL for clarity
- Remove unused python-dotenv dependency
- Fix missing pydantic in setup.py dependencies
Documentation improvements:
- Document security measures in file_utils.py (path validation, sandboxing)
- Add detailed comments to critical logic sections
- Document tool creation process in BaseTool
- Explain configuration values and their impact
- Add comprehensive function-level documentation
Code quality:
- Apply black formatting to all files
- Fix all ruff linting issues
- Update tests to match refactored code
- All 63 tests passing
🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>
- Renamed tool from review_pending_changes to review_changes for brevity
- Enhanced tool descriptions for better MCP auto-discovery
- Updated all references throughout codebase including:
- Tool implementation (tools/review_changes.py)
- Test files (tests/test_review_changes.py)
- Server registration and imports
- Documentation in README.md
- Tool prompts in prompts/tool_prompts.py
- Enhanced review_changes description to emphasize pre-commit usage
- All tests pass, linting and formatting checks pass
🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>
- Remove unused TOOL_TRIGGERS dictionary from config.py
- Remove associated test_tool_triggers test case
- TOOL_TRIGGERS was not used anywhere in the codebase
- MCP automatically discovers tools through descriptions in list_tools handler
- All tests pass (64 tests), ruff clean, black formatted
- Move all imports to top of file including asyncio, tempfile, and pytest
- Maintain functionality while following proper import order
- All linting checks now pass
- Add new review_pending_changes tool for comprehensive pre-commit reviews
- Implement filesystem sandboxing with MCP_PROJECT_ROOT
- Enforce absolute paths for all file/directory operations
- Add comprehensive git utilities for repository management
- Update all tools to use centralized path validation
- Add extensive test coverage for new features and security model
- Update documentation with new tool and path requirements
- Remove obsolete demo and guide files
🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>
- Add collaboration demo showing dynamic context requests
- Implement chat tool for general conversations and brainstorming
- Add tool selection guide with clear boundaries
- Introduce models configuration system
- Update prompts for better tool descriptions
- Refactor server to remove redundant functionality
- Add comprehensive tests for collaboration features
- Enhance base tool with collaborative features
This enables Claude to request additional context from Gemini
during tool execution, improving analysis quality and accuracy.
🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>
- Run black formatter on all Python files
- Fix ruff linting issues:
- Remove unused imports
- Remove unused variables
- Fix f-string without placeholders
- All 37 tests still pass
- Code quality improved for CI/CD compliance
🧹 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>