Commit Graph

31 Commits

Author SHA1 Message Date
Fahad
3e27319e60 fix: reduced token usage, removed parameters from schema that CLIs never seem to use 2025-10-22 13:31:08 +04:00
Fahad
8759edc817 fix: consensus now advertises a short list of models to avoid the CLI getting the names wrong 2025-10-03 22:41:28 +04:00
Fahad
d090d5a88e fix: continuation_id warning when using consensus 2025-10-02 22:16:23 +04:00
Fahad
6cab9e56fc feat: added intelligence_score to the model capabilities schema; a 1-20 number that can be specified to influence the sort order of models presented to the CLI in auto selection mode
fix: model definition re-introduced into the schema but intelligently and only a summary is generated per tool. Required to ensure CLI calls and uses the correct model
fix: removed `model` param from some tools where this wasn't needed
fix: fixed adherence to `*_ALLOWED_MODELS` by advertising only the allowed models to the CLI
fix: removed duplicates across providers when passing canonical names back to the CLI; the first enabled provider wins
2025-10-02 21:43:44 +04:00
Fahad
cc8a4dfd21 Overall savings should now be 50%+ tokens used
perf: tweaks to schema descriptions, aiming to reduce token usage without performance degradation
2025-10-01 22:39:12 +04:00
Beehive Innovations
77caef6f54 Merge pull request #260 from DragonFSKY/fix/consensus-model-context-issue
fix: resolve consensus tool model_context parameter missing issue
2025-10-01 19:27:47 +04:00
Fahad
cff6d8998f fix: removed use_websearch; this parameter was confusing Codex. It started using this to prompt the external model to perform searches! web-search is enabled by Claude / Codex etc by default and the external agent can ask claude to search on its behalf. 2025-10-01 18:44:11 +04:00
谢栋梁
0760b31f8a style: fix trailing whitespace in consensus.py
Remove trailing whitespace to pass CI formatting checks
2025-09-03 11:10:02 +08:00
谢栋梁
30a8952fbc refactor: optimize ModelContext creation in consensus tool
Address code review feedback by creating ModelContext instance once
at the beginning of _consult_model method instead of creating it twice.

- Move ModelContext import to method beginning for better practice
- Create single ModelContext instance and reuse for both file processing
  and temperature validation
- Remove redundant ModelContext creation on line 558
- Improve code clarity and efficiency as suggested by code review
2025-09-03 11:03:04 +08:00
谢栋梁
9044b63809 fix: resolve consensus tool model_context parameter missing issue
Fixed runtime bug where _prepare_file_content_for_prompt was called
without required model_context parameter, causing RuntimeError when
processing requests with relevant_files.

- Create ModelContext instance with model_name in _consult_model method
- Pass model_context parameter to _prepare_file_content_for_prompt call
- Add comprehensive regression test to prevent future occurrences
- Maintain consensus tool's blinded design with independent model contexts
2025-09-03 10:55:22 +08:00
Sven Lito
6bd9d6709a fix: address test failures and PR feedback
- Fix ModelContext constructor call in consensus tool (remove invalid parameters)
- Refactor temperature pattern matching for better readability per code review
- All tests now passing (799/799 passed)
2025-08-23 18:50:49 +07:00
Sven Lito
3b4fd88d7e fix: resolve temperature handling issues for O3/custom models (#245)
- Fix consensus tool hardcoded temperature=0.2 bypassing model capabilities
- Add intelligent temperature inference for unknown custom models
- Support multi-model collaboration (O3, Gemini, Claude, Mistral, DeepSeek)
- Only OpenAI O-series and DeepSeek reasoner models reject temperature
- Most reasoning models (Gemini Pro, Claude, Mistral) DO support temperature
- Comprehensive logging for temperature decisions and user guidance

Resolves: https://github.com/BeehiveInnovations/zen-mcp-server/issues/245
2025-08-23 18:43:51 +07:00
Fahad
4b202f5d1d feat: refactored and tweaked model descriptions / schema to use fewer tokens at launch (average reduction per field description: 60-80%) without sacrificing tool effectiveness
Disabled secondary tools by default (for new installations), updated README.md with instructions on how to enable these in .env
run-server.sh now displays disabled / enabled tools (when DISABLED_TOOLS is set)
2025-08-22 09:23:59 +04:00
Fahad
6921616db3 WIP: tool description / schema updates 2025-08-22 06:53:05 +04:00
Fahad
57200a8a2e Precommit updated to always perform external analysis (via _other_ model) unless specified not to. This prevents Claude from being overconfident and inadequately performing subpar precommit checks.
Improved precommit continuations to be immediate
Workflow state restoration added between stateless calls
Fixed incorrect token limit check
2025-08-20 15:19:01 +04:00
Fahad
e29deb23db Improvements to consensus 2025-08-08 12:59:41 +05:00
Fahad
2fa2d5a408 Fixed contamination in consensus https://github.com/BeehiveInnovations/zen-mcp-server/issues/162
Fixed broken test
2025-08-08 10:04:40 +05:00
Fahad
26169ae827 Disable auto mode for consensus 2025-06-28 22:40:19 +04:00
Fahad
bc447d4bcd Generic naming to work with Gemini CLI / Claude Code 2025-06-27 23:41:20 +04:00
Beehive Innovations
7f6a37a7b9 Merge pull request #131 from GiGiDKR/feat-local_support_with_UTF-8_encoding-update
feat: local support with utf 8 encoding
2025-06-27 08:02:14 -07:00
Fahad
090931d7cf Fixed linebreaks
Cleanup
Pass excluded fields to the schema builder directly
2025-06-27 14:29:10 +04:00
OhMyApps
f8e559ebb2 style: format code for consistency and readability across multiple files 2025-06-23 23:17:56 +02:00
OhMyApps
e9c5662b3a feat: Add LOCAL variable support for responses with UTF-8 JSON encoding.
Description: This feature adds support for UTF-8 encoding in JSON responses, allowing for proper handling of special characters and emojis.

- Implement unit tests for UTF-8 encoding in various model providers including Gemini, OpenAI, and OpenAI Compatible.
- Validate UTF-8 support in token counting, content generation, and error handling.
- Introduce tests for JSON serialization ensuring proper handling of French characters and emojis.
- Create tests for language instruction generation based on locale settings.
- Validate UTF-8 handling in workflow tools including AnalyzeTool, CodereviewTool, and DebugIssueTool.
- Ensure that all tests check for correct UTF-8 character preservation and proper JSON formatting.
- Add integration tests to verify the interaction between locale settings and model responses.
2025-06-22 19:13:02 +02:00
Fahad
521c6c0e61 Improved consensus to treat a step properly as both a request + response, and initial step includes Claude's assessment.
Improved prompt to not request for code when it's a general business decision
2025-06-22 13:37:32 +04:00
Fahad
18f6f16ac6 Improved consensus to treat a step properly as both a request + response, and initial step includes Claude's assessment.
Improved prompt to not request for code when it's a general business decision
2025-06-22 13:21:09 +04:00
Fahad
355331d141 Exclude 'model' parameter for consensus as it uses its own 2025-06-22 12:22:04 +04:00
Beehive Innovations
c960bcb720 Add DocGen tool with comprehensive documentation generation capabilities (#109)
* WIP: new workflow architecture

* WIP: further improvements and cleanup

* WIP: cleanup and docks, replace old tool with new

* WIP: cleanup and docks, replace old tool with new

* WIP: new planner implementation using workflow

* WIP: precommit tool working as a workflow instead of a basic tool
Support for passing False to use_assistant_model to skip external models completely and use Claude only

* WIP: precommit workflow version swapped with old

* WIP: codereview

* WIP: replaced codereview

* WIP: replaced codereview

* WIP: replaced refactor

* WIP: workflow for thinkdeep

* WIP: ensure files get embedded correctly

* WIP: thinkdeep replaced with workflow version

* WIP: improved messaging when an external model's response is received

* WIP: analyze tool swapped

* WIP: updated tests
* Extract only the content when building history
* Use "relevant_files" for workflow tools only

* WIP: updated tests
* Extract only the content when building history
* Use "relevant_files" for workflow tools only

* WIP: fixed get_completion_next_steps_message missing param

* Fixed tests
Request for files consistently

* Fixed tests
Request for files consistently

* Fixed tests

* New testgen workflow tool
Updated docs

* Swap testgen workflow

* Fix CI test failures by excluding API-dependent tests

- Update GitHub Actions workflow to exclude simulation tests that require API keys
- Fix collaboration tests to properly mock workflow tool expert analysis calls
- Update test assertions to handle new workflow tool response format
- Ensure unit tests run without external API dependencies in CI

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>

* WIP - Update tests to match new tools

* WIP - Update tests to match new tools

* WIP - Update tests to match new tools

* Should help with https://github.com/BeehiveInnovations/zen-mcp-server/issues/97
Clear python cache when running script: https://github.com/BeehiveInnovations/zen-mcp-server/issues/96
Improved retry error logging
Cleanup

* WIP - chat tool using new architecture and improved code sharing

* Removed todo

* Removed todo

* Cleanup old name

* Tweak wordings

* Tweak wordings
Migrate old tests

* Support for Flash 2.0 and Flash Lite 2.0

* Support for Flash 2.0 and Flash Lite 2.0

* Support for Flash 2.0 and Flash Lite 2.0
Fixed test

* Improved consensus to use the workflow base class

* Improved consensus to use the workflow base class

* Allow images

* Allow images

* Replaced old consensus tool

* Cleanup tests

* Tests for prompt size

* New tool: docgen
Tests for prompt size
Fixes: https://github.com/BeehiveInnovations/zen-mcp-server/issues/107
Use available token size limits: https://github.com/BeehiveInnovations/zen-mcp-server/issues/105

* Improved docgen prompt
Exclude TestGen from pytest inclusion

* Updated errors

* Lint

* DocGen instructed not to fix bugs, surface them and stick to d

* WIP

* Stop claude from being lazy and only documenting a small handful

* More style rules

---------

Co-authored-by: Claude <noreply@anthropic.com>
2025-06-22 10:21:19 +04:00
Fahad
b56993a42f Refactor 2025-06-18 06:24:24 +04:00
Fahad
c3276595f9 Updated instructions 2025-06-18 05:58:50 +04:00
Fahad
be7d80d7aa Advertise prompts, fixes https://github.com/BeehiveInnovations/zen-mcp-server/issues/63 2025-06-17 11:17:19 +04:00
Beehive Innovations
95556ba9ea Add Consensus Tool for Multi-Model Perspective Gathering (#67)
* WIP
Refactor resolving mode_names, should be done once at MCP call boundary
Pass around model context instead
Consensus tool allows one to get a consensus from multiple models, optionally assigning one a 'for' or 'against' stance to find nuanced responses.

* Deduplication of model resolution, model_context should be available before reaching deeper parts of the code
Improved abstraction when building conversations
Throw programmer errors early

* Guardrails
Support for `model:option` format at MCP boundary so future tools can use additional options if needed instead of handling this only for consensus
Model name now supports an optional ":option" for future use

* Simplified async flow

* Improved model for request to support natural language
Simplified async flow

* Improved model for request to support natural language
Simplified async flow

* Fix consensus tool async/sync patterns to match codebase standards

CRITICAL FIXES:
- Converted _get_consensus_responses from async to sync (matches other tools)
- Converted store_conversation_turn from async to sync (add_turn is synchronous)
- Removed unnecessary asyncio imports and sleep calls
- Fixed ClosedResourceError in MCP protocol during long consensus operations

PATTERN ALIGNMENT:
- Consensus tool now follows same sync patterns as all other tools
- Only execute() and prepare_prompt() are async (base class requirement)
- All internal operations are synchronous like analyze, chat, debug, etc.

TESTING:
- MCP simulation test now passes: consensus_stance 
- Two-model consensus works correctly in ~35 seconds
- Unknown stance handling defaults to neutral with warnings
- All 9 unit tests pass (100% success rate)

The consensus tool async patterns were anomalous in the codebase.
This fix aligns it with the established synchronous patterns used
by all other tools while maintaining full functionality.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>

* Fixed call order and added new test

* Cleanup dead comments
Docs for the new tool
Improved tests

---------

Co-authored-by: Claude <noreply@anthropic.com>
2025-06-17 10:53:17 +04:00