Commit Graph

179 Commits

Author SHA1 Message Date
Devon Hillard
c29e7623ac fix: Remove duplicate OpenAI models from listmodels output
Fixed issue where OpenAI models appeared twice in listmodels output by:
- Removing self-referencing aliases from OpenAI model definitions (e.g., "gpt-5" no longer includes "gpt-5" in its aliases)
- Adding filter in listmodels.py to skip aliases that match the model name
- Cleaned up inconsistent alias naming (o3-pro -> o3pro)

This ensures each model appears only once in the listing while preserving all useful aliases.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-09-09 19:00:43 -06:00
Sven Lito
6bd9d6709a fix: address test failures and PR feedback
- Fix ModelContext constructor call in consensus tool (remove invalid parameters)
- Refactor temperature pattern matching for better readability per code review
- All tests now passing (799/799 passed)
2025-08-23 18:50:49 +07:00
Sven Lito
3b4fd88d7e fix: resolve temperature handling issues for O3/custom models (#245)
- Fix consensus tool hardcoded temperature=0.2 bypassing model capabilities
- Add intelligent temperature inference for unknown custom models
- Support multi-model collaboration (O3, Gemini, Claude, Mistral, DeepSeek)
- Only OpenAI O-series and DeepSeek reasoner models reject temperature
- Most reasoning models (Gemini Pro, Claude, Mistral) DO support temperature
- Comprehensive logging for temperature decisions and user guidance

Resolves: https://github.com/BeehiveInnovations/zen-mcp-server/issues/245
2025-08-23 18:43:51 +07:00
Fahad
f89afd1a72 fix: https://github.com/BeehiveInnovations/zen-mcp-server/issues/251 added handling for safety_feedback from Gemini. FinishReason.STOP can be a hidden safety block from gemini or issued when it chooses not to respond. 2025-08-23 14:03:46 +04:00
Fahad
4b202f5d1d feat: refactored and tweaked model descriptions / schema to use fewer tokens at launch (average reduction per field description: 60-80%) without sacrificing tool effectiveness
Disabled secondary tools by default (for new installations), updated README.md with instructions on how to enable these in .env
run-server.sh now displays disabled / enabled tools (when DISABLED_TOOLS is set)
2025-08-22 09:23:59 +04:00
Fahad
6921616db3 WIP: tool description / schema updates 2025-08-22 06:53:05 +04:00
Fahad
80d21e57c0 feat: refactored and improved codereview in line with precommit. Reviews are now either external (default) or internal. Takes away anxiety and loss of tokens when Claude incorrectly decides to be 'confident' about its own changes and bungle things up.
fix: Minor tweaks to prompts
fix: Improved support for smaller models that struggle with strict structured JSON output
Rearranged reasons to use the MCP above quick start (collapsed)
2025-08-21 14:04:32 +04:00
Fahad
d30c212029 refactor: minor prompt tweaks 2025-08-21 12:23:13 +04:00
Fahad
77e8ed1a9f Further improvements to precommit to ensure required steps are followed precisely 2025-08-20 16:08:22 +04:00
Fahad
57200a8a2e Precommit updated to always perform external analysis (via _other_ model) unless specified not to. This prevents Claude from being overconfident and inadequately performing subpar precommit checks.
Improved precommit continuations to be immediate
Workflow state restoration added between stateless calls
Fixed incorrect token limit check
2025-08-20 15:19:01 +04:00
Fahad
0af9202012 Precommit updated to take always prefer external analysis (via _other_ model) unless specified not to. This prevents Claude from being overconfident and inadequately performing subpar precommit checks. 2025-08-20 11:55:40 +04:00
google-labs-jules[bot]
0959d6f0fa feat: Update Claude models to Opus 4.1 and Sonnet 4.1
This commit updates all references to Claude Opus 4 and Sonnet 4 to their newer 4.1 versions throughout the codebase.

The changes include:
- Updating model names in `conf/custom_models.json` and `providers/dial.py`.
- Updating aliases and descriptions to match the new model versions.
- Updating `.env.example` to reflect the new model names.
- Updating all relevant test suites to use the new model names and ensure all tests pass.
2025-08-17 16:08:52 +00:00
Fahad
e29deb23db Improvements to consensus 2025-08-08 12:59:41 +05:00
Fahad
b212cae5de Fixed labeling 2025-08-08 12:35:30 +05:00
Fahad
1c3dea3a08 Fixed labeling 2025-08-08 12:33:41 +05:00
Beehive Innovations
bea021a021 Merge pull request #207 from GiGiDKR/fix/workflow-localization-signature-harmonization
fix: harmonize method signatures and add localization support for workflow tools
2025-08-07 22:35:46 -07:00
Fahad
2fa2d5a408 Fixed contamination in consensus https://github.com/BeehiveInnovations/zen-mcp-server/issues/162
Fixed broken test
2025-08-08 10:04:40 +05:00
Fahad
8203baa4ef Ensure continuation id is passed back 2025-08-08 09:01:44 +05:00
Fahad
1a8ec2e12f GPT-5, GPT-5-mini support
Improvements to model name resolution
Improved instructions for multi-step workflows when continuation is available
Improved instructions for chat tool
Improved preferred model resolution, moved code from registry -> each provider
Updated tests
2025-08-08 08:51:34 +05:00
Fahad
9a4791cb06 Updated description 2025-08-08 05:26:45 +05:00
GiGiDKR
d327c90d82 fix: use precise type hint Optional[dict[str, Any]] for arguments parameter
- Update arguments parameter type hint from Optional[dict] to Optional[dict[str, Any]] in workflow_mixin.py
- Ensures consistency with BaseTool and improves static analysis and code clarity
- No functional changes, only type annotation improvement

Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
2025-07-25 07:00:22 +02:00
OhMyApps
81551f8452 fix: harmonize method signatures and add localization support for workflow tools
- Add LOCALE-based localization support to all workflow tools
- Harmonize method signatures for prepare_prompt and _prepare_file_content_for_prompt
- Remove obsolete methods and clarify comments
- Ensure consistent behavior between SimpleTool and WorkflowTool
2025-07-25 06:48:44 +02:00
Fahad
268df43858 Improved auto-challenge invocation
Automatically determine MCP client's name
2025-06-30 13:31:04 +04:00
Fahad
a54343dc79 Improved challenge prompt and instructions 2025-06-29 17:52:00 +04:00
Fahad
6b495cea0b New tool! "challenge" with confidence and stop Claude from agreeing with you blindly and undoing the _correct_ strategy because you were wrong
Fixed run script to ensure pip is installed
2025-06-29 15:50:45 +04:00
Fahad
df5e8e6793 Merge remote-tracking branch 'origin/main' 2025-06-29 13:01:17 +04:00
OhMyApps
479f556535 Merge branch 'BeehiveInnovations:main' into feat-dockerisation 2025-06-29 02:07:06 +02:00
Fahad
26169ae827 Disable auto mode for consensus 2025-06-28 22:40:19 +04:00
Fahad
b9c2e4f5e6 Tweaks to prompts to prevent Claude from becoming overconfident 2025-06-28 22:30:58 +04:00
Fahad
adbc4af4a9 Update confidence enum values across workflow tools
Added new confidence values (very_high, almost_certain) to all workflow tools
to provide more granular confidence tracking. Updated enum declarations in:
- analyze.py, codereview.py, debug.py, precommit.py, secaudit.py, testgen.py
- Updated debug.py's get_required_actions to handle new confidence values
- All tools now use consistent 7-value confidence scale
- refactor.py kept its unique scale (exploring/incomplete/partial/complete)

Also fixed model thinking configuration:
- Added very_high and almost_certain to MODEL_THINKING_PREFERENCES
- Set medium thinking for very_high, high thinking for almost_certain
- Updated prompts to clarify certain means 100% local confidence

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-06-28 00:04:35 +04:00
Fahad
bc447d4bcd Generic naming to work with Gemini CLI / Claude Code 2025-06-27 23:41:20 +04:00
OhMyApps
62178aa073 Merge branch 'BeehiveInnovations:main' into feat-dockerisation 2025-06-27 18:45:41 +02:00
Beehive Innovations
7f6a37a7b9 Merge pull request #131 from GiGiDKR/feat-local_support_with_UTF-8_encoding-update
feat: local support with utf 8 encoding
2025-06-27 08:02:14 -07:00
Fahad
090931d7cf Fixed linebreaks
Cleanup
Pass excluded fields to the schema builder directly
2025-06-27 14:29:10 +04:00
Fahad
0237fb3419 Set read-only annotation hints on each tool for security 2025-06-26 13:16:00 +04:00
OhMyApps
453f921df6 Merge branch 'main' into feat-dockerisation 2025-06-25 18:10:26 +02:00
Fahad
3ce0f93e5b Lint 2025-06-25 19:37:25 +04:00
Fahad
6d0bafa81d Support for Gemini CLI (setup instructions) - WIP 2025-06-25 19:36:09 +04:00
OhMyApps
a46f8c2fad feat: add localization tests and improve locale handling in tools 2025-06-23 23:35:02 +02:00
OhMyApps
f8e559ebb2 style: format code for consistency and readability across multiple files 2025-06-23 23:17:56 +02:00
OhMyApps
1fd48f034f Merge branch 'feat-local_support_with_UTF-8_encoding-update' of https://github.com/GiGiDKR/zen-mcp-server into feat-local_support_with_UTF-8_encoding-update 2025-06-23 22:24:47 +02:00
Fahad
498ea88293 Use ModelCapabilities consistently instead of dictionaries
Moved aliases as part of SUPPORTED_MODELS instead of shorthand, more in line with how custom_models are declared
Further refactoring to cleanup some code
2025-06-23 16:58:59 +04:00
OhMyApps
7e5f95531b Merge branch 'BeehiveInnovations:main' into feat-local_support_with_UTF-8_encoding-update 2025-06-23 12:51:56 +02:00
Illya Havsiyevych
0623ce3546 feat: DIAL provider implementation (#112)
## Description

This PR implements a new [DIAL](https://dialx.ai/dial_api) (Data & AI Layer) provider for the Zen MCP Server, enabling unified access to multiple AI models through the DIAL API platform. DIAL provides enterprise-grade AI model access with deployment-specific routing similar to Azure OpenAI.

## Changes Made

- [x] Added support of atexit:
  - Ensures automatic cleanup of provider resources (HTTP clients, connection pools) on server shutdown
  - Fixed bug using ModelProviderRegistry.get_available_providers() instead of accessing private _providers
  - Works with SIGTERM/Ctrl+C for graceful shutdown in both development and containerized environments
- [x] Added new DIAL provider (`providers/dial.py`) inheriting from `OpenAICompatibleProvider`
- [x] Updated server.py to register DIAL provider during initialization
- [x] Updated provider registry to include DIAL provider type
- [x] Implemented deployment-specific routing for DIAL's Azure OpenAI-style endpoints
- [x] Implemented performance optimizations:
  - Connection pooling with httpx for better performance
  - Thread-safe client caching with double-check locking pattern
  - Proper resource cleanup with `close()` method
- [x] Added comprehensive unit tests with 16 test cases (`tests/test_dial_provider.py`)
- [x] Added DIAL configuration to `.env.example` with documentation
- [x] Added support for configurable API version via `DIAL_API_VERSION` environment variable
- [x] Added DIAL model restrictions support via `DIAL_ALLOWED_MODELS` environment variable

### Supported DIAL Models:
- OpenAI models: o3, o4-mini (and their dated versions)
- Google models: gemini-2.5-pro, gemini-2.5-flash (including search variant)
- Anthropic models: Claude 4 Opus/Sonnet (with and without thinking mode)

### Environment Variables:
- `DIAL_API_KEY`: Required API key for DIAL authentication
- `DIAL_API_HOST`: Optional base URL (defaults to https://core.dialx.ai)
- `DIAL_API_VERSION`: Optional API version header (defaults to 2025-01-01-preview)
- `DIAL_ALLOWED_MODELS`: Optional comma-separated list of allowed models

### Breaking Changes:
- None

  ### Dependencies:
  - No new dependencies added (uses existing OpenAI SDK with custom routing)
2025-06-23 14:07:10 +04:00
omryn-vera
4ae0344b14 feat: Update Claude model references from v3 to v4 (fixes issue #118) (#119)
* feat: Update Claude model references from v3 to v4

- Update model configurations from claude-3-opus to claude-4-opus
- Update model configurations from claude-3-sonnet to claude-4-sonnet
- Maintain backward compatibility through existing aliases (opus, sonnet, claude)
- Update provider registry preferred models list
- Update all test cases and assertions to reflect new model names
- Update documentation and examples consistently across all files
- Add Claude 4 model support while preserving existing functionality

Files modified: 15 (config, docs, providers, tests, tools)
Pattern: Systematic claude-3-* → claude-4-* model reference migration

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>

* PR feedback: changed anthropic/claude-4-opus -> anthropic/claude-opus-4 and anthropic/claude-4-haiku -> anthropic/claude-3.5-haiku

* changed anthropic/claude-4-sonnet -> anthropic/claude-sonnet-4

* PR feedback removed specific model from test mock

* PR feedback removed base.py

---------

Co-authored-by: Omry Nachman <omry@wix.com>
Co-authored-by: Claude <noreply@anthropic.com>
2025-06-23 13:57:13 +04:00
OhMyApps
e9c5662b3a feat: Add LOCAL variable support for responses with UTF-8 JSON encoding.
Description: This feature adds support for UTF-8 encoding in JSON responses, allowing for proper handling of special characters and emojis.

- Implement unit tests for UTF-8 encoding in various model providers including Gemini, OpenAI, and OpenAI Compatible.
- Validate UTF-8 support in token counting, content generation, and error handling.
- Introduce tests for JSON serialization ensuring proper handling of French characters and emojis.
- Create tests for language instruction generation based on locale settings.
- Validate UTF-8 handling in workflow tools including AnalyzeTool, CodereviewTool, and DebugIssueTool.
- Ensure that all tests check for correct UTF-8 character preservation and proper JSON formatting.
- Add integration tests to verify the interaction between locale settings and model responses.
2025-06-22 19:13:02 +02:00
Beehive Innovations
000d12dc3a Add secaudit tool for security auditing (#117)
* WIP - working version

* Implement required methods
2025-06-22 15:28:05 +04:00
Fahad
521c6c0e61 Improved consensus to treat a step properly as both a request + response, and initial step includes Claude's assessment.
Improved prompt to not request for code when it's a general business decision
2025-06-22 13:37:32 +04:00
Fahad
18f6f16ac6 Improved consensus to treat a step properly as both a request + response, and initial step includes Claude's assessment.
Improved prompt to not request for code when it's a general business decision
2025-06-22 13:21:09 +04:00
Fahad
355331d141 Exclude 'model' parameter for consensus as it uses its own 2025-06-22 12:22:04 +04:00