🚀 Major Enhancement: Workflow-Based Tool Architecture v5.5.0 (#95)

* WIP: new workflow architecture
* WIP: further improvements and cleanup
* WIP: cleanup and docks, replace old tool with new
* WIP: cleanup and docks, replace old tool with new
* WIP: new planner implementation using workflow
* WIP: precommit tool working as a workflow instead of a basic tool
  Support for passing False to use_assistant_model to skip external models completely and use Claude only
* WIP: precommit workflow version swapped with old
* WIP: codereview
* WIP: replaced codereview
* WIP: replaced codereview
* WIP: replaced refactor
* WIP: workflow for thinkdeep
* WIP: ensure files get embedded correctly
* WIP: thinkdeep replaced with workflow version
* WIP: improved messaging when an external model's response is received
* WIP: analyze tool swapped
* WIP: updated tests
* Extract only the content when building history
* Use "relevant_files" for workflow tools only
* WIP: updated tests
* Extract only the content when building history
* Use "relevant_files" for workflow tools only
* WIP: fixed get_completion_next_steps_message missing param
* Fixed tests
  Request for files consistently
* Fixed tests
  Request for files consistently
* Fixed tests
* New testgen workflow tool
  Updated docs
* Swap testgen workflow
* Fix CI test failures by excluding API-dependent tests
  - Update GitHub Actions workflow to exclude simulation tests that require API keys
  - Fix collaboration tests to properly mock workflow tool expert analysis calls
  - Update test assertions to handle new workflow tool response format
  - Ensure unit tests run without external API dependencies in CI

  🤖 Generated with [Claude Code](https://claude.ai/code)

  Co-Authored-By: Claude <noreply@anthropic.com>
* WIP - Update tests to match new tools
* WIP - Update tests to match new tools

---------

Co-authored-by: Claude <noreply@anthropic.com>
2025-06-21 00:08:11 +04:00
parent 4dae6e457e
commit 69a3121452
76 changed files with 17111 additions and 7725 deletions
--- a/docs/tools/testgen.md
+++ b/docs/tools/testgen.md
@@ -1,13 +1,32 @@
 # TestGen Tool - Comprehensive Test Generation

-**Generates thorough test suites with edge case coverage based on existing code and test framework used**
+**Generates thorough test suites with edge case coverage through workflow-driven investigation**

-The `testgen` tool creates comprehensive test suites by analyzing your code paths, understanding intricate dependencies, and identifying realistic edge cases and failure scenarios that need test coverage.
+The `testgen` tool creates comprehensive test suites by analyzing your code paths, understanding intricate dependencies, and identifying realistic edge cases and failure scenarios that need test coverage. This workflow tool guides Claude through systematic investigation of code functionality, critical paths, edge cases, and integration points across multiple steps before generating comprehensive tests with realistic failure mode analysis.

 ## Thinking Mode

 **Default is `medium` (8,192 tokens) for extended thinking models.** Use `high` for complex systems with many interactions or `max` for critical systems requiring exhaustive test coverage.

+## How the Workflow Works
+
+The testgen tool implements a **structured workflow** for comprehensive test generation:
+
+**Investigation Phase (Claude-Led):**
+1. **Step 1**: Claude describes the test generation plan and begins analyzing code functionality
+2. **Step 2+**: Claude examines critical paths, edge cases, error handling, and integration points
+3. **Throughout**: Claude tracks findings, test scenarios, and coverage gaps
+4. **Completion**: Once investigation is thorough, Claude signals completion
+
+**Test Generation Phase:**
+After Claude completes the investigation, the tool produces:
+- Complete test scenario catalog with all edge cases
+- Framework-specific test generation
+- Realistic failure mode coverage
+- Final test suite with comprehensive coverage
+
+This workflow ensures methodical analysis before test generation, resulting in more thorough and valuable test suites.
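The investigation phase described above can be pictured as a loop that accumulates findings until a step reports that no further step is needed. The sketch below is illustrative only and is not the tool's actual implementation: the `Investigation` class, the `run_investigation` function, and the file paths are hypothetical, and only the field names (`findings`, `files_checked`, `next_step_required`, `step_number`) are taken from the parameter list in this document.

```python
from dataclasses import dataclass, field


@dataclass
class Investigation:
    """Accumulates state across investigation steps (hypothetical helper)."""

    findings: list = field(default_factory=list)
    files_checked: set = field(default_factory=set)
    steps_taken: int = 0

    def record(self, step: dict) -> bool:
        """Store one step's results; return True if another step is needed."""
        self.steps_taken += 1
        self.findings.append(step["findings"])
        self.files_checked.update(step.get("files_checked", []))
        return step["next_step_required"]


def run_investigation(steps: list) -> Investigation:
    """Consume steps in order and stop at the first one that signals completion."""
    state = Investigation()
    for step in steps:
        if not state.record(step):
            break  # investigation is complete; test generation can begin
    return state


# Hypothetical step data; the paths are placeholders, not real project files.
steps = [
    {"step_number": 1, "findings": "Parser has three entry points",
     "files_checked": ["/abs/path/parser.py"], "next_step_required": True},
    {"step_number": 2, "findings": "Empty input is unhandled",
     "files_checked": ["/abs/path/parser.py", "/abs/path/lexer.py"],
     "next_step_required": False},
    {"step_number": 3, "findings": "never reached",
     "next_step_required": False},
]
result = run_investigation(steps)
```

Stopping on `next_step_required` rather than on `total_steps` reflects that the step total is documented as an adjustable estimate.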
+
 ## Model Recommendation

 Test generation excels with extended reasoning models like Gemini Pro or O3, which can analyze complex code paths, understand intricate dependencies, and identify comprehensive edge cases. The combination of large context windows and advanced reasoning enables generation of thorough test suites that cover realistic failure scenarios and integration points that shorter-context models might overlook.
@@ -37,11 +56,24 @@ Test generation excels with extended reasoning models like Gemini Pro or O3, whi

 ## Tool Parameters

- `files`: Code files or directories to generate tests for (required, absolute paths)
+**Workflow Investigation Parameters (used during step-by-step process):**
+- `step`: Current investigation step description (required for each step)
+- `step_number`: Current step number in test generation sequence (required)
+- `total_steps`: Estimated total investigation steps (adjustable)
+- `next_step_required`: Whether another investigation step is needed
+- `findings`: Discoveries about functionality and test scenarios (required)
+- `files_checked`: All files examined during investigation
+- `relevant_files`: Files directly needing tests (required in step 1)
+- `relevant_context`: Methods/functions/classes requiring test coverage
+- `confidence`: Confidence level in test plan completeness (exploring/low/medium/high/certain)
+- `backtrack_from_step`: Step number to backtrack from (for revisions)
+
+**Initial Configuration (used in step 1):**
 - `prompt`: Description of what to test, testing objectives, and specific scope/focus areas (required)
 - `model`: auto|pro|flash|o3|o3-mini|o4-mini|o4-mini-high|gpt4.1 (default: server default)
 - `test_examples`: Optional existing test files or directories to use as style/pattern reference (absolute paths)
 - `thinking_mode`: minimal|low|medium|high|max (default: medium, Gemini only)
+- `use_assistant_model`: Whether to run the expert test generation phase (default: true; set to false to use Claude only)
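As a rough illustration of how these parameters fit together, the sketch below builds a step 1 request as a plain dictionary and checks it against the requirements stated in the list above. The `validate_step` helper, the example values, and the file path are assumptions made for illustration; the server's real validation logic is not shown in this diff and may differ.

```python
# Confidence levels as listed for the `confidence` parameter.
CONFIDENCE_LEVELS = ("exploring", "low", "medium", "high", "certain")

# Fields the parameter list marks as required on every step.
REQUIRED_EVERY_STEP = ("step", "step_number", "findings")


def validate_step(request: dict) -> list:
    """Return a list of problems found in a step request (hypothetical helper)."""
    errors = [
        f"missing required field: {name}"
        for name in REQUIRED_EVERY_STEP
        if name not in request
    ]
    # `relevant_files` is documented as required in step 1 only.
    if request.get("step_number") == 1 and not request.get("relevant_files"):
        errors.append("relevant_files is required in step 1")
    confidence = request.get("confidence")
    if confidence is not None and confidence not in CONFIDENCE_LEVELS:
        errors.append(f"unknown confidence level: {confidence}")
    return errors


# Example step 1 request; all values are placeholders.
step_one = {
    "step": "Plan test generation for the payment module",
    "step_number": 1,
    "total_steps": 3,
    "next_step_required": True,
    "findings": "Starting analysis; no findings yet",
    "relevant_files": ["/abs/path/to/payments.py"],
    "confidence": "exploring",
    "prompt": "Generate tests for refund handling",
    "use_assistant_model": False,  # skip the expert phase, use Claude only
}

problems = validate_step(step_one)
```

Here `problems` is empty because every field the list marks as required is present. Removing `relevant_files` from the step 1 request would produce a single error.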

 ## Usage Examples