Commit Graph

28 Commits

Author SHA1 Message Date
Fahad
ba08308a23 fix(security): handle macOS symlinked system dirs
Follow-up on PR #353 to keep dangerous-path blocking correct on macOS (/etc -> /private/etc) while avoiding overblocking Windows workspaces (C:\).
2025-12-15 17:02:24 +00:00
Fahad
b2dc84992d fix: rebranding, see [docs/name-change.md](docs/name-change.md) for details 2025-12-04 18:15:14 +04:00
Bjorn Melin
f713d8a354 feat: enhance model support by adding GPT-5.1 to .gitignore and updating cassette maintenance documentation for dual-model testing 2025-11-14 01:40:49 -07:00
Fahad
7c36b9255a refactor: moved registries into a separate module and code cleanup
fix: refactored dial provider to follow the same pattern
2025-10-07 12:59:09 +04:00
Fahad
2c534ac06e feat: centralized environment handling, ensures ZEN_MCP_FORCE_ENV_OVERRIDE is honored correctly
fix: updated tests to override env variables they need instead of relying on the current values from .env
2025-10-04 14:28:56 +04:00
Fahad
d55130a430 fix: external model name now recorded properly in responses
test: http cassettes added for improved integration tests
refactor: generic name for the CLI agent
2025-10-03 11:29:06 +04:00
Fahad
182aa627df refactor: code cleanup 2025-10-02 08:09:44 +04:00
Josh Vera
6fa7cbcf0d fix: Ensure dummy API keys are set for tests with no_mock_provider marker
The test failures in CI were caused by tests with @pytest.mark.no_mock_provider
that prevented dummy API keys from being set. In CI with no real API keys,
this led to 'Model not available' errors.

Changed pytest_collection_modifyitems to always set dummy keys if missing,
regardless of markers. This ensures tests work in CI while still allowing
real API keys to be used when present.

Fixes test_conversation_field_mapping.py failures in CI across Python 3.10-3.12.
2025-07-13 11:29:02 -06:00
Josh Vera
3db49413ff fix: Resolve o3-pro response parsing and test execution issues
- Fix lint errors: trailing whitespace and deprecated typing imports
- Update test mock for o3-pro response format (output.content[] → output_text)
- Implement robust test isolation with monkeypatch fixture
- Clear provider registry cache to prevent test interference
- Ensure o3-pro tests pass in both individual and full suite execution

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-07-12 20:24:34 -06:00
Josh Vera
7f92085c70 feat: Fix o3-pro response parsing and implement HTTP transport recorder
- Fix o3-pro response parsing to use output_text convenience field
- Replace respx with custom httpx transport solution for better reliability
- Implement comprehensive PII sanitization to prevent secret exposure
- Add HTTP request/response recording with cassette format for testing
- Sanitize all existing cassettes to remove exposed API keys
- Update documentation to reflect new HTTP transport recorder
- Add test suite for PII sanitization and HTTP recording

This change:
1. Fixes timeout issues with o3-pro API calls (was 2+ minutes, now ~15-22 seconds)
2. Properly captures response content without httpx.ResponseNotRead exceptions
3. Preserves original HTTP response format including gzip compression
4. Prevents future secret exposure with automatic PII sanitization
5. Enables reliable replay testing for o3-pro interactions

Co-Authored-By: Claude <noreply@anthropic.com>
2025-07-12 18:47:17 -06:00
Beehive Innovations
c960bcb720 Add DocGen tool with comprehensive documentation generation capabilities (#109)
* WIP: new workflow architecture

* WIP: further improvements and cleanup

* WIP: cleanup and docks, replace old tool with new

* WIP: cleanup and docks, replace old tool with new

* WIP: new planner implementation using workflow

* WIP: precommit tool working as a workflow instead of a basic tool
Support for passing False to use_assistant_model to skip external models completely and use Claude only

* WIP: precommit workflow version swapped with old

* WIP: codereview

* WIP: replaced codereview

* WIP: replaced codereview

* WIP: replaced refactor

* WIP: workflow for thinkdeep

* WIP: ensure files get embedded correctly

* WIP: thinkdeep replaced with workflow version

* WIP: improved messaging when an external model's response is received

* WIP: analyze tool swapped

* WIP: updated tests
* Extract only the content when building history
* Use "relevant_files" for workflow tools only

* WIP: updated tests
* Extract only the content when building history
* Use "relevant_files" for workflow tools only

* WIP: fixed get_completion_next_steps_message missing param

* Fixed tests
Request for files consistently

* Fixed tests
Request for files consistently

* Fixed tests

* New testgen workflow tool
Updated docs

* Swap testgen workflow

* Fix CI test failures by excluding API-dependent tests

- Update GitHub Actions workflow to exclude simulation tests that require API keys
- Fix collaboration tests to properly mock workflow tool expert analysis calls
- Update test assertions to handle new workflow tool response format
- Ensure unit tests run without external API dependencies in CI

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>

* WIP - Update tests to match new tools

* WIP - Update tests to match new tools

* WIP - Update tests to match new tools

* Should help with https://github.com/BeehiveInnovations/zen-mcp-server/issues/97
Clear python cache when running script: https://github.com/BeehiveInnovations/zen-mcp-server/issues/96
Improved retry error logging
Cleanup

* WIP - chat tool using new architecture and improved code sharing

* Removed todo

* Removed todo

* Cleanup old name

* Tweak wordings

* Tweak wordings
Migrate old tests

* Support for Flash 2.0 and Flash Lite 2.0

* Support for Flash 2.0 and Flash Lite 2.0

* Support for Flash 2.0 and Flash Lite 2.0
Fixed test

* Improved consensus to use the workflow base class

* Improved consensus to use the workflow base class

* Allow images

* Allow images

* Replaced old consensus tool

* Cleanup tests

* Tests for prompt size

* New tool: docgen
Tests for prompt size
Fixes: https://github.com/BeehiveInnovations/zen-mcp-server/issues/107
Use available token size limits: https://github.com/BeehiveInnovations/zen-mcp-server/issues/105

* Improved docgen prompt
Exclude TestGen from pytest inclusion

* Updated errors

* Lint

* DocGen instructed not to fix bugs, surface them and stick to d

* WIP

* Stop claude from being lazy and only documenting a small handful

* More style rules

---------

Co-authored-by: Claude <noreply@anthropic.com>
2025-06-22 10:21:19 +04:00
Fahad
4394ca1061 Updated docs for the new debug tool 2025-06-19 13:29:42 +04:00
Fahad
d0da6ce9e4 Gemini model rename 2025-06-19 05:37:40 +04:00
Beehive Innovations
4151c3c3a5 Migration from Docker to Standalone Python Server (#73)
* Migration from docker to standalone server
Migration handling
Fixed tests
Use simpler in-memory storage
Support for concurrent logging to disk
Simplified direct connections to localhost

* Migration from docker / redis to standalone script
Updated tests
Updated run script
Fixed requirements
Use dotenv
Ask if user would like to install MCP in Claude Desktop once
Updated docs

* More cleanup and references to docker removed

* Cleanup

* Comments

* Fixed tests

* Fix GitHub Actions workflow for standalone Python architecture

- Install requirements-dev.txt for pytest and testing dependencies
- Remove Docker setup from simulation tests (now standalone)
- Simplify linting job to use requirements-dev.txt
- Update simulation tests to run directly without Docker

Fixes unit test failures in CI due to missing pytest dependency.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>

* Remove simulation tests from GitHub Actions

- Removed simulation-tests job that makes real API calls
- Keep only unit tests (mocked, no API costs) and linting
- Simulation tests should be run manually with real API keys
- Reduces CI costs and complexity

GitHub Actions now only runs:
- Unit tests (569 tests, all mocked)
- Code quality checks (ruff, black)

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>

* Fixed tests

* Fixed tests

---------

Co-authored-by: Claude <noreply@anthropic.com>
2025-06-18 23:41:22 +04:00
Fahad
6304b7af6b Native support for xAI Grok3
Model shorthand mapping related fixes
Comprehensive auto-mode related tests
2025-06-15 12:21:44 +04:00
Fahad
746380eb7f Renamed setup script to avoid confusion (https://github.com/BeehiveInnovations/zen-mcp-server/issues/35)
Further fixes to tests
Pass O3 simulation test when keys are not set, along with a notice
Updated docs on testing, simulation tests / contributing
Support for OpenAI o4-mini and o4-mini-high
2025-06-14 09:28:20 +04:00
Fahad
c5f682c7b0 Fix tests to work with effective auto mode changes
- Added autouse fixture to mock provider availability in tests
- Updated test expectations to match new auto mode behavior
- Fixed mock provider capabilities to return proper values
- Updated claude continuation tests to set default model
- All 256 tests now passing

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-06-14 02:43:29 +04:00
Fahad
26b22a1d53 Simplified /workspace to map to a project scoped WORKSPACE_ROOT 2025-06-13 20:49:37 +04:00
Fahad
3aedb16101 Use the new Gemini 2.5 Flash
Updated to support Thinking Tokens as a ratio of the max allowed
Updated tests
Updated README
2025-06-12 20:46:54 +04:00
Fahad
354a0fae0b Fixed tests 2025-06-12 13:51:22 +04:00
Fahad
79af2654b9 Use the new flash model
Updated tests
2025-06-12 13:44:09 +04:00
Fahad
fb66825bf6 Rebranding, refactoring, renaming, cleanup, updated docs 2025-06-12 10:40:43 +04:00
Fahad
2a067a7f4e WIP major refactor and features 2025-06-12 07:14:59 +04:00
Fahad
854dbe16cf fix: resolve E402 linting errors in conftest.py
- Move all imports to top of file including asyncio, tempfile, and pytest
- Maintain functionality while following proper import order
- All linting checks now pass
2025-06-09 14:02:13 +04:00
Fahad
7ee610938b feat: add review_pending_changes tool and enforce absolute path security
- Add new review_pending_changes tool for comprehensive pre-commit reviews
- Implement filesystem sandboxing with MCP_PROJECT_ROOT
- Enforce absolute paths for all file/directory operations
- Add comprehensive git utilities for repository management
- Update all tools to use centralized path validation
- Add extensive test coverage for new features and security model
- Update documentation with new tool and path requirements
- Remove obsolete demo and guide files

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-06-09 12:42:40 +04:00
Fahad
c03af1629f fix: apply isort formatting to fix CI linting
Applied isort to properly sort all imports according to PEP8:
- Standard library imports first
- Third-party imports second
- Local imports last
- Alphabetical ordering within each group

All tests still passing after import reordering.
2025-06-08 22:32:27 +04:00
Fahad
b6f8c43b81 refactor: remove emojis and apply code formatting
- Remove all emoji characters from output strings for better compatibility
- Update all tests to match non-emoji output
- Apply black formatting to all Python files
- Ensure all tests pass and linting succeeds
- Remove htmlcov directory (already in .gitignore)

This change improves cross-platform compatibility and ensures
consistent code formatting across the project.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-06-08 21:32:38 +04:00
Fahad
970b73c175 docs: Improve Quick Start instructions for better clarity
- Add explicit step to clone the repository first
- Include instructions for accessing config via Claude Desktop UI
- Clarify that users must replace placeholder paths with actual paths
- Add note to remember the cloned directory path

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-06-08 20:39:27 +04:00