Commit Graph

542 Commits

Author SHA1 Message Date
Beehive Innovations
be907a7b29 Merge pull request #39 from NikolaiUgelvik/fix-openrouter-auto-mode
Fix auto mode when only OpenRouter is configured
2025-06-14 21:37:17 +04:00
Nikolai Ugelvik
0eeea3dd67 Apply black formatting to test_openrouter_provider.py 2025-06-14 19:33:20 +02:00
Nikolai Ugelvik
be2612752a Fix auto mode when only OpenRouter is configured
The get_available_models method in ModelProviderRegistry was only checking
for providers with SUPPORTED_MODELS attribute, which OpenRouter doesn't have.
This caused auto mode to fail with "No models available" error when only
OpenRouter API key was configured.

Added special handling for OpenRouter provider to check its _registry
for available models, ensuring auto mode works correctly with OpenRouter.

Added comprehensive tests to verify:
- Auto mode works with only OpenRouter configured
- Model restrictions are respected
- Graceful handling when no providers are available
- No crashes when OpenRouter lacks _registry attribute
2025-06-14 19:21:14 +02:00
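A minimal sketch of the fallback this commit message describes, not the repository's actual code. Only `get_available_models`, `SUPPORTED_MODELS`, and `_registry` come from the message; the provider classes and the `list_models` method are hypothetical stand-ins used for illustration.

```python
from typing import Dict, List


class StaticProvider:
    """Hypothetical provider that declares its models in SUPPORTED_MODELS."""

    SUPPORTED_MODELS = {"model-a": {}, "model-b": {}}


class FakeRegistry:
    """Hypothetical stand-in for the registry held by an OpenRouter provider."""

    def list_models(self) -> List[str]:  # assumed method name
        return ["vendor/model-x"]


class OpenRouterLikeProvider:
    """Hypothetical provider with no SUPPORTED_MODELS, only a _registry."""

    def __init__(self, with_registry: bool = True) -> None:
        if with_registry:
            self._registry = FakeRegistry()


def get_available_models(providers: Dict[str, object]) -> Dict[str, str]:
    """Map each available model name to the key of the provider offering it."""
    models: Dict[str, str] = {}
    for key, provider in providers.items():
        supported = getattr(provider, "SUPPORTED_MODELS", None)
        if supported is not None:
            for name in supported:
                models[name] = key
            continue
        # Fallback for providers that only expose a registry. A missing
        # _registry is skipped rather than raising.
        registry = getattr(provider, "_registry", None)
        if registry is not None:
            for name in registry.list_models():
                models[name] = key
    return models
```

With only the OpenRouter-like provider registered, the function still returns its registry models, and a provider lacking `_registry` yields an empty result instead of an error.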
Fahad
70f1356e3e Improved trigger words to ensure large prompts are passed in as a file reference 2025-06-14 21:03:17 +04:00
Fahad
4cacd2dad9 Fixed trigger word 2025-06-14 20:14:40 +04:00
Fahad
6b05096ba0 testgen grounding 2025-06-14 20:10:59 +04:00
Fahad
b2489409eb Move o3-pro test into its own 2025-06-14 19:53:33 +04:00
Beehive Innovations
9f973b90e5 Merge pull request #36 from lox/add-o3-pro-support
feat: Add o3-pro model support
2025-06-14 19:44:14 +04:00
Fahad
68a75a7791 Updated lint instructions for PRs 2025-06-14 19:37:53 +04:00
Fahad
a8fd7f3d24 Bump 2025-06-14 19:32:11 +04:00
Fahad
f1ad06c529 Fixed lint, tests after recent fix
Updated readme
2025-06-14 19:31:31 +04:00
Fahad
b41b874e31 Fixed model name mapping for openrouter 2025-06-14 19:19:59 +04:00
Fahad
b405aaf8bd some justifications 2025-06-14 19:01:53 +04:00
Fahad
a4f9e22256 Renamed version tool 2025-06-14 18:54:53 +04:00
Fahad
442decba70 Improved model response handling to handle additional response statuses in future
Improved testgen; encourages follow-ups with less work in between and less token generation to avoid surpassing the 25K barrier
Improved codereview tool to request a focused code review instead when a single-pass code review would be too large or complex
2025-06-14 18:43:56 +04:00
Fahad
ec5fee4409 Bump 2025-06-14 16:48:21 +04:00
Fahad
698c1611d2 Lint fix 2025-06-14 16:47:59 +04:00
Fahad
d0d0a171dc Ensure duplicate file references are gracefully handled
Improved prompt to encourage immediate action
2025-06-14 16:37:02 +04:00
Fahad
ec707e021a Fix for path translation within docker 2025-06-14 16:00:54 +04:00
Fahad
acbfa1c94e Improved prompt for next steps 2025-06-14 15:51:04 +04:00
Fahad
4086306c58 New tool: testgen
Generates unit tests and encourages model to auto-detect framework and testing style from existing sample (if available)
2025-06-14 15:41:47 +04:00
Lachlan Donald
40aa1eaeb6 Format test_auto_mode.py with black
Fix code formatting to comply with black style requirements.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-06-14 21:09:47 +10:00
Fahad
7d33aafcab Configurable conversation limit now set to 10 exchanges. This helps when you want to manually continue a thread of thought across different models. 2025-06-14 14:00:13 +04:00
Fahad
bc3f98a291 Make conversation timeout configurable (so that you can manually resume a discussion with another model after a gap of several hours in case you stepped away) 2025-06-14 13:27:19 +04:00
Beehive Innovations
a569e316af Update README.md 2025-06-14 12:41:15 +04:00
Fahad
710a2ab0eb Surface important public notice 2025-06-14 11:48:57 +04:00
Fahad
7481af5c8f Surface important public notice 2025-06-14 11:47:29 +04:00
Fahad
4a3767921a Surface important public notice 2025-06-14 11:46:32 +04:00
Fahad
002203f0da Surface important public notice 2025-06-14 11:45:45 +04:00
Fahad
b17fe06d27 bump 2025-06-14 11:34:28 +04:00
Fahad
e0a05b86f1 Add encouraging message about powerful models to schema in case it's not on Opus 4 or above
Added OPENROUTER_ALLOWED_MODELS environment variable support to further limit which models may be used from within Claude. This restriction applies on top of the models listed in custom_models.json
2025-06-14 11:34:17 +04:00
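A rough sketch of how an allowlist such as OPENROUTER_ALLOWED_MODELS could narrow a configured model list. The commit message does not state the variable's format; the comma-separated, case-insensitive parsing and all function names here are assumptions made for illustration.

```python
import os
from typing import Iterable, List, Optional, Set


def parse_allowed_models(raw: Optional[str]) -> Optional[Set[str]]:
    """Parse an assumed comma-separated allowlist; None means no restriction."""
    if raw is None or not raw.strip():
        return None
    return {name.strip().lower() for name in raw.split(",") if name.strip()}


def filter_models(configured: Iterable[str], allowed: Optional[Set[str]]) -> List[str]:
    """Keep only the configured models that also appear in the allowlist."""
    if allowed is None:
        return list(configured)
    return [model for model in configured if model.lower() in allowed]


def load_allowed_models() -> Optional[Set[str]]:
    """Read the allowlist from the environment (format is an assumption)."""
    return parse_allowed_models(os.environ.get("OPENROUTER_ALLOWED_MODELS"))
```

Under these assumptions the allowlist can only remove entries from the configured list, which matches the commit's description of a limit applied on top of custom_models.json.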
Fahad
21037c2d81 Refactored prompts for better maintainability 2025-06-14 11:09:13 +04:00
Fahad
94b2a4d407 Bump 2025-06-14 10:57:08 +04:00
Fahad
23353734cd Support for allowed model restrictions per provider
Tool escalation added to `analyze` so that a graceful switch-over to codereview is made when absolutely necessary
2025-06-14 10:56:53 +04:00
Lachlan Donald
c12dc1d765 Fix syntax error from incomplete merge conflict resolution
- Remove merge conflict markers from providers/openai.py
- Include o3-pro in temperature constraint check for O3/O4 models

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-06-14 15:50:40 +10:00
Lachlan Donald
a3aaf6f79b Enhance o3-pro test coverage with comprehensive codereview testing
- Added o3-pro codereview tests for both direct OpenAI and OpenRouter paths
- Updated validation criteria to account for additional test cases (5 total calls)
- Addresses Gemini Code Assist feedback about incomplete test coverage
- Ensures o3-pro functionality is thoroughly validated across all tools

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-06-14 15:49:19 +10:00
Lachlan Donald
69ec38d1af Add o3-pro model support and extend test coverage
- Added o3-pro model configuration to custom_models.json with 200K context
- Updated OpenAI provider to support o3-pro with fixed temperature constraint
- Extended simulator tests to include o3-pro validation scenarios

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-06-14 15:49:19 +10:00
Fahad
ac9c58ce61 Use flash for the comprehensive simulation test to run quicker 2025-06-14 09:42:10 +04:00
Fahad
2c805d6637 Fixed mock comparison error 2025-06-14 09:34:56 +04:00
Fahad
746380eb7f Renamed setup script to avoid confusion (https://github.com/BeehiveInnovations/zen-mcp-server/issues/35)
Further fixes to tests
Pass O3 simulation test when keys are not set, along with a notice
Updated docs on testing, simulation tests / contributing
Support for OpenAI o4-mini and o4-mini-high
2025-06-14 09:28:20 +04:00
Fahad
c5f682c7b0 Fix tests to work with effective auto mode changes
- Added autouse fixture to mock provider availability in tests
- Updated test expectations to match new auto mode behavior
- Fixed mock provider capabilities to return proper values
- Updated claude continuation tests to set default model
- All 256 tests now passing

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-06-14 02:43:29 +04:00
Fahad
eb388ab2f2 Categorize tools into 'model capabilities categories' to help determine which type of model to pick when in auto mode
Encourage Claude to pick the best model for the job automatically in auto mode
Lots of new tests to ensure automatic model picking works reliably based on user preference, or when a matching model is not found or is ambiguous
Improved error reporting when a bogus model is requested that is not configured or available
2025-06-14 02:17:06 +04:00
Fahad
7fc1186a7c Fixes for auto mode 2025-06-14 01:16:15 +04:00
Fahad
14c266a162 Cleanup 2025-06-14 00:30:50 +04:00
Fahad
ca606be67e Bump 2025-06-14 00:27:17 +04:00
Fahad
8ac5bbb5af Fixed workspace path mapping
Refactoring
Improved system prompts, more generalized
Home folder protection and detection
Retry logic for gemini
2025-06-14 00:26:59 +04:00
Fahad
26b22a1d53 Simplified /workspace to map to a project scoped WORKSPACE_ROOT 2025-06-13 20:49:37 +04:00
Fahad
ebf5cfaa9e Use debug logging for now by default 2025-06-13 19:34:35 +04:00
Fahad
dda4f4bc7f Updated template 2025-06-13 19:24:16 +04:00
Fahad
8554fa083a Fixed broken doc links 2025-06-13 16:29:05 +04:00