- Fix consensus tool hardcoded temperature=0.2 bypassing model capabilities
- Add intelligent temperature inference for unknown custom models
- Support multi-model collaboration (O3, Gemini, Claude, Mistral, DeepSeek)
- Only OpenAI O-series and DeepSeek reasoner models reject temperature
- Most reasoning models (Gemini Pro, Claude, Mistral) DO support temperature
- Comprehensive logging for temperature decisions and user guidance
Resolves: https://github.com/BeehiveInnovations/zen-mcp-server/issues/245
* feat: Enhance run-server.sh with uv-first Python environment setup
- Implement uv-first approach for faster environment setup when available
- Add automatic WSL environment detection using wslvar for better Windows integration
- Improve system package installation by using bash -c for better command execution
- Add comprehensive environment validation and fallback mechanisms
- Optimize dependency installation with uv when environment supports it
- Enhance configuration display workflow for better user experience
- Add environment markers to track uv-created environments
- Improve error handling and user messaging throughout setup process
🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>
* feat: Add .claude/settings.local.json to .gitignore
Personal Claude Code settings should not be tracked in source control.
Reference: https://docs.anthropic.com/en/docs/claude-code/settings🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>
* feat: Add shared Claude Code settings.json for team defaults
- Add .claude/settings.json with default permissions for team use
- Remove .claude/settings.local.json from git tracking (now personal config)
Reference: https://docs.anthropic.com/en/docs/claude-code/settings🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>
* fix: Address Gemini Code Assist review issues in run-server.sh
- Fix hardcoded Unix paths (lines 563 & 568) with cross-platform detection
- Improve error handling by capturing uv stderr instead of /dev/null
- Fix uv environment detection logic to check uv_created marker file
Resolves three issues identified in PR review:
1. High Priority: Replace hardcoded $VENV_PATH/bin/python with detection
2. Medium Priority: Capture and display uv command errors for debugging
3. Medium Priority: Use marker file instead of path matching for uv detection
🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>
* fix: Address additional Gemini Code Assist feedback
- Enhance WSL warning message with more helpful guidance
- Add security comment explaining bash -c usage over eval
- Add get_venv_python_path helper function for cleaner cross-platform detection
- Improve path resolution error handling with proper error checking
Addresses 4 additional review points from gemini-code-assist to improve
user experience, code clarity, and error handling robustness.
🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>
---------
Co-authored-by: Claude <noreply@anthropic.com>
* Migration from docker to standalone server
Migration handling
Fixed tests
Use simpler in-memory storage
Support for concurrent logging to disk
Simplified direct connections to localhost
* Migration from docker / redis to standalone script
Updated tests
Updated run script
Fixed requirements
Use dotenv
Ask if user would like to install MCP in Claude Desktop once
Updated docs
* More cleanup and references to docker removed
* Cleanup
* Comments
* Fixed tests
* Fix GitHub Actions workflow for standalone Python architecture
- Install requirements-dev.txt for pytest and testing dependencies
- Remove Docker setup from simulation tests (now standalone)
- Simplify linting job to use requirements-dev.txt
- Update simulation tests to run directly without Docker
Fixes unit test failures in CI due to missing pytest dependency.
🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>
* Remove simulation tests from GitHub Actions
- Removed simulation-tests job that makes real API calls
- Keep only unit tests (mocked, no API costs) and linting
- Simulation tests should be run manually with real API keys
- Reduces CI costs and complexity
GitHub Actions now only runs:
- Unit tests (569 tests, all mocked)
- Code quality checks (ruff, black)
🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>
* Fixed tests
* Fixed tests
---------
Co-authored-by: Claude <noreply@anthropic.com>
Encourage Claude to pick the best model for the job automatically in auto mode
Lots of new tests to ensure automatic model picking works reliably based on user preference or when a matching model is not found or ambiguous
Improved error reporting when bogus model is requested and is not configured or available
Simulation tests to confirm threading and history traversal
Chain of communication and branching validation tests from live simulation
Temperature enforcement per model
## Major Features Added
### 🎯 Dynamic Configuration System
- **Environment-aware model selection**: DEFAULT_MODEL with 'pro'/'flash' shortcuts
- **Configurable thinking modes**: DEFAULT_THINKING_MODE_THINKDEEP for extended reasoning
- **All tool schemas now dynamic**: Show actual current defaults instead of hardcoded values
- **Enhanced setup workflow**: Copy from .env.example with smart customization
### 🔧 Model & Thinking Configuration
- **Smart model resolution**: Support both shortcuts ('pro', 'flash') and full model names
- **Thinking mode optimization**: Only apply thinking budget to models that support it
- **Flash model compatibility**: Works without thinking config, still beneficial via system prompts
- **Dynamic schema descriptions**: Tool parameters show current environment values
### 🚀 Enhanced Developer Experience
- **Fail-fast Docker setup**: GEMINI_API_KEY required upfront in docker-compose
- **Comprehensive startup logging**: Shows current model and thinking mode defaults
- **Enhanced get_version tool**: Reports all dynamic configuration values
- **Better .env documentation**: Clear token consumption details and model options
### 🧪 Comprehensive Testing
- **Live model validation**: New simulator test validates Pro vs Flash thinking behavior
- **Dynamic configuration tests**: Verify environment variable overrides work correctly
- **Complete test coverage**: All 139 unit tests pass, including new model config tests
### 📋 Configuration Files Updated
- **docker-compose.yml**: Fail-fast API key validation, thinking mode support
- **setup-docker.sh**: Copy from .env.example instead of manual creation
- **.env.example**: Detailed documentation with token consumption per thinking mode
- **.gitignore**: Added test-setup/ for cleanup
### 🛠 Technical Improvements
- **Removed setup.py**: Fully Docker-based deployment (no longer needed)
- **REDIS_URL smart defaults**: Auto-configured for Docker, still configurable for dev
- **All tools updated**: Consistent dynamic model parameter descriptions
- **Enhanced error handling**: Better model resolution and validation
## Breaking Changes
- Removed setup.py (Docker-only deployment)
- Model parameter descriptions now show actual defaults (dynamic)
## Migration Guide
- Update .env files using new .env.example format
- Use 'pro'/'flash' shortcuts or full model names
- Set DEFAULT_THINKING_MODE_THINKDEEP for custom thinking depth
🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>
Added professional testing infrastructure:
Unit Tests:
- Comprehensive test suite covering all major functionality
- Tests for models, file operations, tool handlers, and error cases
- Async test support for MCP handlers
- Mocking for external API calls
- 84% code coverage achieved
CI/CD Pipeline:
- GitHub Actions workflow for automated testing
- Matrix testing across Python 3.8-3.12
- Cross-platform testing (Ubuntu, macOS, Windows)
- Automated linting with flake8, black, isort, and mypy
- Code coverage reporting with 80% minimum threshold
Configuration:
- pytest.ini with proper test discovery and coverage settings
- .coveragerc for coverage configuration
- Updated .gitignore for test artifacts
- Development dependencies in requirements.txt
Documentation:
- Added testing section to README
- Instructions for running tests locally
- Contributing guidelines with test requirements
This ensures code quality and reliability for all contributions.
🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>
- MCP server implementation for Google Gemini models
- Support for multiple Gemini models including 1.5 Pro and 2.5 Pro preview
- Chat tool with configurable parameters (temperature, max_tokens, model)
- List models tool to view available Gemini models
- System prompt support
- Comprehensive error handling for blocked responses
- Test suite included
- Documentation and examples
🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>