Commit Graph

465 Commits

Author SHA1 Message Date
Fahad
5f69ad4049 Updated instructions. 2025-06-16 19:22:29 +04:00
Fahad
b528598360 Add regression tests for Gemini parameter order bug
Adds two comprehensive tests to prevent future regression of the parameter
order bug in `restriction_service.is_allowed()` calls:

1. `test_gemini_parameter_order_regression_protection` - Tests edge case
   where only alias is allowed, ensuring correct parameter order
2. `test_gemini_parameter_order_edge_case_full_name_only` - Tests reverse
   scenario where only full model name is allowed

These tests specifically catch the subtle bug where parameters were
incorrectly passed as (provider, user_input, resolved_name) instead of
(provider, resolved_name, user_input). The bug was masked by OR logic
in most cases but could cause issues in edge scenarios.

All 498 tests pass, including the new regression protection tests.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-06-16 19:13:47 +04:00
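A regression test for argument order can assert on the exact positional arguments a caller passes. The sketch below is illustrative only: `resolve_alias`, `check_model`, the alias table, and the model name are hypothetical stand-ins rather than the project's code. The expected order `(provider, resolved_name, user_input)` is the one stated in commit b528598360.

```python
from unittest.mock import MagicMock

# Hypothetical alias table; the real project resolves names elsewhere.
ALIASES = {"pro": "gemini-2.5-pro"}


def resolve_alias(user_input):
    """Map a shorthand name to a full model name (hypothetical helper)."""
    return ALIASES.get(user_input.lower(), user_input)


def check_model(restriction_service, provider, user_input):
    """Hypothetical caller that forwards both names to the service."""
    resolved_name = resolve_alias(user_input)
    # Order per the commit message: (provider, resolved_name, user_input).
    return restriction_service.is_allowed(provider, resolved_name, user_input)


def test_is_allowed_receives_resolved_name_first():
    service = MagicMock()
    service.is_allowed.return_value = True
    assert check_model(service, "google", "pro") is True
    # Fails if the two name arguments are ever swapped again.
    service.is_allowed.assert_called_once_with("google", "gemini-2.5-pro", "pro")


test_is_allowed_receives_resolved_name_first()
```

Asserting on the recorded call catches a swap even when the service's own OR logic would return the same result for both orders.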
Ming
f55f2b0a0f Fix Google model restriction parameter order regression (#62)
- Fixed swapped parameters in restriction_service.is_allowed() calls
- Parameter order should be (provider_type, model_name, original_name)
- Regression introduced in merge commit 39c50a1, breaking Gemini model access
- Added comments to prevent future parameter order confusion
- Resolves Gemini model is not allowed by restriction policy errors

🤖 Generated with Claude Code

Co-authored-by: Ming <ming@mail.ooo>
Co-authored-by: Claude <noreply@anthropic.com>
2025-06-16 19:12:16 +04:00
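Commit f55f2b0a0f guards against the swap with comments. A different technique, shown here only as a sketch, is to make the two name parameters keyword-only so that a positional swap cannot compile into a silent bug. `RestrictionService` below is a hypothetical stand-in, not the project's restriction service, and the model name is made up for illustration.

```python
class RestrictionService:
    """Hypothetical stand-in for an allow-list check."""

    def __init__(self, allowed):
        self._allowed = {name.lower() for name in allowed}

    def is_allowed(self, provider_type, *, model_name, original_name):
        # provider_type is unused in this sketch; a real service would
        # presumably select a per-provider allow-list with it.
        return (
            model_name.lower() in self._allowed
            or original_name.lower() in self._allowed
        )


service = RestrictionService(["pro"])
# Callers must name both arguments, so the order no longer matters.
ok = service.is_allowed("google", model_name="gemini-2.5-pro", original_name="pro")
```

A positional call such as `service.is_allowed("google", "pro", "gemini-2.5-pro")` raises `TypeError` instead of running with swapped values.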
Fahad
70b64adff3 Schema now lists all models including locally available models
New tool to list all models `listmodels`
Integration test for all the different combinations of API keys
Tweaks to codereview prompt for a better quality input from Claude
Fixed missing 'low' severity in codereview
2025-06-16 19:07:35 +04:00
Fahad
cb17582d8f Optimize OpenRouter registry loading with class-level caching
Instead of creating new OpenRouterModelRegistry instances multiple times
per tool (4x per tool during schema generation), we now use a shared
class-level cache in BaseTool. This reduces registry loading from 40+ times
to just once during MCP server initialization.

The optimization:
- Adds _openrouter_registry_cache as a class variable in BaseTool
- Implements _get_openrouter_registry() classmethod for lazy loading
- Ensures cache is shared across all tool subclasses
- Maintains identical functionality with improved performance

This significantly reduces startup time and resource usage when OpenRouter
is configured, especially noticeable with many custom models.

🤖 Generated with Claude Code

Co-Authored-By: Claude <noreply@anthropic.com>
2025-06-16 18:54:15 +04:00
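The class-level cache described in commit cb17582d8f could look roughly like the sketch below. The names `_openrouter_registry_cache`, `_get_openrouter_registry`, `BaseTool`, and `OpenRouterModelRegistry` come from the commit message; the registry body, the `load_count` counter, and the two subclasses are assumptions added for illustration.

```python
class OpenRouterModelRegistry:
    """Stand-in registry that counts how often it is constructed."""

    load_count = 0  # hypothetical counter, for demonstration only

    def __init__(self):
        OpenRouterModelRegistry.load_count += 1
        self.models = {}  # a real registry would load its config here


class BaseTool:
    # Shared by BaseTool and every subclass.
    _openrouter_registry_cache = None

    @classmethod
    def _get_openrouter_registry(cls):
        # Assign on BaseTool, not cls: writing to cls would create a
        # separate attribute on each subclass and defeat the sharing.
        if BaseTool._openrouter_registry_cache is None:
            BaseTool._openrouter_registry_cache = OpenRouterModelRegistry()
        return BaseTool._openrouter_registry_cache


class ChatTool(BaseTool):  # hypothetical subclass
    pass


class AnalyzeTool(BaseTool):  # hypothetical subclass
    pass


first = ChatTool._get_openrouter_registry()
second = AnalyzeTool._get_openrouter_registry()
```

Both calls return the same object, and the registry constructor runs once no matter how many tool subclasses ask for it.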
Fahad
93399b6d10 Merge remote-tracking branch 'origin/main' 2025-06-16 18:20:17 +04:00
Beehive Innovations
c3d44525b6 Merge pull request #60 from ming86/fix/google-allowed-models-restriction
WIP: Fix GOOGLE_ALLOWED_MODELS shorthand restriction validation
2025-06-16 18:20:08 +04:00
Fahad
357452b7ba Prompt support 2025-06-16 18:10:21 +04:00
Fahad
ebfda1862e Retry a few times with progressive delays before giving up 2025-06-16 17:47:42 +04:00
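Commit ebfda1862e gives no details beyond its subject line. A minimal sketch of retrying with progressive delays, assuming a fixed delay schedule, might look like this. The function name, the delay values, and the injectable `sleep` parameter are all hypothetical.

```python
import time


def retry_with_delays(operation, delays=(1.0, 3.0, 5.0), sleep=time.sleep):
    """Call operation(); after each failure wait the next delay, then retry.

    Makes len(delays) + 1 attempts in total and re-raises the last error.
    """
    last_error = None
    for attempt in range(len(delays) + 1):
        try:
            return operation()
        except Exception as exc:  # a real caller would narrow this
            last_error = exc
            if attempt < len(delays):
                sleep(delays[attempt])
    raise last_error
```

Passing `sleep` as a parameter lets tests substitute a recorder for `time.sleep`, so the schedule can be checked without waiting.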
Ming
39c50a1e93 Merge branch 'BeehiveInnovations:main' into fix/google-allowed-models-restriction 2025-06-16 21:17:19 +08:00
Ming
4a95197846 Fix remaining validate_model_name parameter order inconsistency
Address code review feedback from Gemini Code Assist bot:
- Fix parameter order in validate_model_name method (line 256)
- Ensure consistent use of original model name for restriction validation
- All is_allowed() calls now properly use (provider, original_name, resolved_name)

This completes the fix for GOOGLE_ALLOWED_MODELS shorthand restriction validation.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-06-16 21:16:01 +08:00
Ming
3ba22d8336 Fix GOOGLE_ALLOWED_MODELS shorthand restriction validation
- Fixed parameter order in is_allowed() calls to check original model name first
- Fixed validate_parameters() to use original model name instead of resolved name
- Fixed thinking capabilities check to use original model name
- Enables GOOGLE_ALLOWED_MODELS=pro,flash to work correctly with shorthand names

🤖 Generated with Claude Code

Co-Authored-By: Claude <noreply@anthropic.com>
2025-06-16 21:02:24 +08:00
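For context on `GOOGLE_ALLOWED_MODELS=pro,flash`, the sketch below shows one way a comma-separated allow-list could be parsed and matched against either form of a model name. Both function names are hypothetical, and treating an empty list as "no restriction" is an assumption, not something the commit states.

```python
import os


def parse_allowed_models(raw):
    """Split a comma-separated allow-list into a set of lowercase names."""
    return {part.strip().lower() for part in raw.split(",") if part.strip()}


def name_is_allowed(allowed, original_name, resolved_name):
    """Accept a model if either its shorthand or its full name is listed."""
    if not allowed:
        # Assumption: an empty allow-list means no restriction applies.
        return True
    return original_name.lower() in allowed or resolved_name.lower() in allowed


# Falls back to the value from the commit message if the variable is unset.
allowed = parse_allowed_models(os.environ.get("GOOGLE_ALLOWED_MODELS", "pro,flash"))
```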
Beehive Innovations
6b09f1468f Merge pull request #55 from BeehiveInnovations/feature/images
feat: Add comprehensive vision support, GPT-4.1 integration, and enhanced chat prompts
2025-06-16 16:58:28 +04:00
Fahad
65c3840f7e Fix image support integration tests to use real provider resolution pattern
Following the established testing patterns from other tool tests:
- Removed mocking of providers and capabilities
- Use real provider resolution with dummy API keys
- Expect proper validation behavior or provider-not-found errors
- Applied proper Redis mocking for conversation memory tests
- Simplified validation tests to focus on core functionality
- All 473 tests now pass 100% including 13 image support tests

This ensures CI/CD compatibility and follows the proven testing approach
used throughout the codebase for tool integration testing.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-06-16 16:37:34 +04:00
Fahad
ed386375be Complete Redis mocking fixes for image support integration tests
- Properly mock Redis client operations to support add_turn functionality
- Set up initial thread contexts so add_turn can find existing threads
- Mock Redis set operations to return success
- Ensure all Redis-dependent tests use proper mock patterns
- All 473 unit tests now pass 100% with proper isolation

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-06-16 16:26:23 +04:00
Fahad
a65c63c8da Fix Redis mocking in image support integration tests
- Add proper Redis client mocking to prevent connection attempts during CI
- Apply @patch("utils.conversation_memory.get_redis_client") decorators to all methods using Redis
- Mock thread contexts for get_thread calls to ensure tests work without Redis
- Fixes GitHub Actions failures: ConnectionRefusedError when connecting to localhost:6379
- Maintains test isolation and proper mock patterns used throughout test suite

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-06-16 16:20:14 +04:00
Beehive Innovations
3049c85e3c Update base.py
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
2025-06-16 14:51:34 +04:00
Beehive Innovations
d7982b55f8 Update advanced-usage.md
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
2025-06-16 14:50:13 +04:00
Beehive Innovations
ff063cf247 Update CLAUDE.md
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
2025-06-16 14:50:00 +04:00
Fahad
0143140c34 Fix line length violations and code quality improvements
- Fixed worst flake8 violations (300-600+ character lines) in tools directory
- Applied consistent multi-line string formatting for better readability
- Removed incompatible test files from main branch merge
- All 473 tests passing, all code quality checks pass

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-06-16 13:21:17 +04:00
Fahad
061fb8691d Merge main into feature/images - resolve conflicts favoring our approach
- Kept version 4.8.0 for new features
- Preserved our _is_builtin_custom_models_config approach over main's ALLOWED_INTERNAL_PATHS
- Our targeted solution is cleaner than the general whitelist approach
2025-06-16 13:19:08 +04:00
Fahad
97fa6781cf Vision support via images, PDFs, etc., which can be passed on to other models as part of analysis or as additional context.
Image processing pipeline added
OpenAI GPT-4.1 support
Chat tool prompt enhancement
Lint and code quality improvements
2025-06-16 13:14:53 +04:00
Fahad
d6d7bf8cac Fixed internal file path translation into docker 2025-06-16 11:30:02 +04:00
Fahad
d498e9854b Updated readme with an amazing new discovery 2025-06-16 10:35:39 +04:00
Fahad
8307e32541 Updated readme with an amazing new discovery
Lint
2025-06-16 10:28:39 +04:00
Fahad
8bbadc6505 Updated readme with an amazing new discovery 2025-06-16 10:25:17 +04:00
Fahad
8e2b53b90d Updated readme with an amazing new discovery
Improved prompt
2025-06-16 09:55:40 +04:00
Beehive Innovations
eebe67170d Merge pull request #51 from BeehiveInnovations/improve/file-loading
Fix file prioritization and improve test quality
2025-06-16 07:19:07 +04:00
Fahad
0b94dd8cdd Lint 2025-06-16 07:18:45 +04:00
Fahad
4c0bd3b86d Improved documentation for conversation / file collection strategy, context budget allocation, etc. 2025-06-16 07:17:35 +04:00
Fahad
5a49d196c8 More integration tests 2025-06-16 07:07:38 +04:00
Fahad
35f37fb92e Fixed integration test for auto mode 2025-06-16 07:00:27 +04:00
Fahad
c643970ffb Fixed integration test for auto mode 2025-06-16 06:57:06 +04:00
Fahad
903aabd311 Fixed imports and lint 2025-06-16 06:24:33 +04:00
Fahad
b43b30b49d Fixed regex 2025-06-16 06:22:10 +04:00
Fahad
e183e1bfff Refactor log monitor to eliminate code duplication
Addressed Gemini code review feedback by refactoring repetitive log processing:

**Added _process_log_stream helper function**:
- Encapsulates common pattern of reading, filtering, formatting, and printing log lines
- Takes tailer, filter_func, and format_func as parameters
- Eliminates repetitive timestamp and formatting logic

**Simplified main monitoring loop**:
- Reduced from ~35 lines of repetitive code to 4 clean function calls
- Each log stream now uses: _process_log_stream(tailer, filter, formatter)
- Eliminated duplicate timestamp creation (reduced from 4 to 1 occurrence)

**Improved maintainability**:
- Changes to log processing logic now only need to be made in one place
- Cleaner, more readable main loop
- Better separation of concerns

**Verified functionality**:
- All containers rebuild and start successfully
- Log monitor functions correctly with refactored code
- No functional changes, only code organization improvements

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-06-16 06:19:53 +04:00
Fahad
be157ab771 Remove unused create_line_handler function
Fixed code quality issues identified by Gemini code review:
- Removed dead code: create_line_handler function was defined but never used
- Eliminated unused parameter warning
- Cleaned up unnecessary complexity in log_monitor.py
- The monitor_mcp_activity function implements all needed logic inline

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-06-16 06:16:21 +04:00
Fahad
c9798325c7 Extra logging 2025-06-16 06:09:58 +04:00
Fahad
805e8d6d01 Fix remaining TestGenRequest reference in format_response method
Fixed NameError that was causing Docker container crashes:
- Updated type annotation in format_response method from TestGenRequest to TestGenerationRequest
- This was the last missing reference from the class rename

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-06-16 06:08:58 +04:00
Fahad
2cfe0b163a Fix all failing tests and pytest collection warnings
Fixed MagicMock comparison errors across multiple test suites by:
- Adding proper ModelCapabilities mocks with real values instead of MagicMock objects
- Updating test_auto_mode.py with correct provider mocking for model availability tests
- Updating test_thinking_modes.py with proper capabilities mocking in all thinking mode tests
- Updating test_tools.py with proper capabilities mocking for CodeReview and Analyze tools
- Fixing test_large_prompt_handling.py by adding proper provider mocking to prevent errors before large prompt detection

Fixed pytest collection warnings by:
- Renaming TestGenRequest to TestGenerationRequest to avoid pytest collecting it as a test class
- Renaming TestGenTool to TestGenerationTool to avoid pytest collecting it as a test class
- Updated all imports and references across server.py, tools/__init__.py, and test files

All 459 tests now pass without warnings or MagicMock comparison errors.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-06-16 06:02:12 +04:00
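The "MagicMock comparison errors" fixed in commit 2cfe0b163a arise when code compares a number against an attribute of an unconfigured mock. The sketch below reproduces that failure and the fix of returning real values. `ModelCapabilities` here is a hypothetical stand-in with a made-up `context_window` field, and `fits_in_context` and `get_capabilities` are illustrative names only.

```python
from dataclasses import dataclass
from unittest.mock import MagicMock


@dataclass
class ModelCapabilities:
    """Hypothetical stand-in; the real class has its own fields."""

    context_window: int


def fits_in_context(provider, model_name, token_count):
    """Hypothetical check that compares a token count against a limit."""
    capabilities = provider.get_capabilities(model_name)
    return token_count <= capabilities.context_window


# An unconfigured mock returns another MagicMock for context_window,
# and comparing an int against it raises TypeError.
bare_provider = MagicMock()

# Fix: have the mock return capabilities that carry real values.
configured_provider = MagicMock()
configured_provider.get_capabilities.return_value = ModelCapabilities(
    context_window=1000
)
result = fits_in_context(configured_provider, "some-model", 500)
```

Calling `fits_in_context(bare_provider, "some-model", 500)` raises `TypeError`, which is why the tests needed capabilities with real numbers.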
Fahad
8c3efd5676 Bump 2025-06-16 05:52:17 +04:00
Fahad
91077e3810 Performance improvements when embedding files:
- Exit early at MCP boundary if files won't fit within given context of chosen model
- Encourage Claude to re-run with better context
- Check file sizes before embedding
- Drop files from older conversations when building continuations and give priority to newer files
- List and mention excluded files to Claude on return
- Improved tests
- Improved precommit prompt
- Added a new Low severity to precommit
- Improved documentation of file embedding strategy
- Refactor
2025-06-16 05:51:52 +04:00
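The file-embedding strategy in commit 91077e3810 (check sizes first, prefer newer files, report what was excluded) can be illustrated with a greedy selection. This is a sketch under stated assumptions: `select_files`, the token estimates, and the greedy rule are hypothetical and may differ from what the project does.

```python
def select_files(files_newest_first, token_budget):
    """Pick files that fit the budget, visiting the newest references first.

    files_newest_first: list of (path, estimated_tokens) tuples.
    Returns (included_paths, excluded_paths).
    """
    included, excluded, seen = [], [], set()
    remaining = token_budget
    for path, tokens in files_newest_first:
        if path in seen:
            continue  # an older reference to a file already considered
        seen.add(path)
        if tokens <= remaining:
            included.append(path)
            remaining -= tokens
        else:
            # Kept in a list so the caller can report it back to Claude.
            excluded.append(path)
    return included, excluded


files = [("new.py", 60), ("mid.py", 50), ("new.py", 60), ("old.py", 30)]
included, excluded = select_files(files, token_budget=100)
```

With a budget of 100, `new.py` and `old.py` are included and `mid.py` is excluded, because `mid.py` no longer fits once `new.py` has been counted.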
Fahad
56333cbd86 Fixed numbering 2025-06-15 19:23:45 +04:00
Fahad
1d070e43fd Improved 2025-06-15 19:23:08 +04:00
Fahad
d8f4eb99f5 Improved 2025-06-15 19:21:57 +04:00
Fahad
978b6ef155 Improved 2025-06-15 19:20:15 +04:00
Fahad
0bb54d721a Moved API editing instructions above run-server 2025-06-15 19:16:01 +04:00
Fahad
7ef99fb9ce Improved schema description for precommit 2025-06-15 19:11:37 +04:00
Fahad
6dcf095c3d Improved schema description to allow Claude to pre-think harder before invoking thinkdeep 2025-06-15 19:05:23 +04:00
Fahad
ad6cff4498 Lint, bump 2025-06-15 18:43:37 +04:00