🚀 Major Enhancement: Workflow-Based Tool Architecture v5.5.0 (#95)
* WIP: new workflow architecture
* WIP: further improvements and cleanup
* WIP: cleanup and docs, replace old tool with new
* WIP: new planner implementation using workflow
* WIP: precommit tool working as a workflow instead of a basic tool. Support for passing False to use_assistant_model to skip external models completely and use Claude only
* WIP: precommit workflow version swapped with old
* WIP: replaced codereview
* WIP: replaced refactor
* WIP: workflow for thinkdeep
* WIP: ensure files get embedded correctly
* WIP: thinkdeep replaced with workflow version
* WIP: improved messaging when an external model's response is received
* WIP: analyze tool swapped
* WIP: updated tests
* Extract only the content when building history
* Use "relevant_files" for workflow tools only
* WIP: fixed get_completion_next_steps_message missing param
* Fixed tests; request files consistently
* New testgen workflow tool; updated docs
* Swap testgen workflow
* Fix CI test failures by excluding API-dependent tests
  - Update GitHub Actions workflow to exclude simulation tests that require API keys
  - Fix collaboration tests to properly mock workflow tool expert analysis calls
  - Update test assertions to handle new workflow tool response format
  - Ensure unit tests run without external API dependencies in CI
* WIP: update tests to match new tools

🤖 Generated with [Claude Code](https://claude.ai/code)

---------

Co-authored-by: Claude <noreply@anthropic.com>
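As a rough sketch of the workflow shape this PR introduces (the argument values below are hypothetical; the actual request fields are defined in the diff of tools/testgen.py further down), a first testgen step that skips the external model entirely might look like this:

```python
# Hypothetical arguments for the first step of the new testgen workflow tool.
# Field names follow the TestGenRequest model added in this PR; values are illustrative.
testgen_step_one = {
    "step": "Map the public API of the auth module and outline testable behaviors.",
    "step_number": 1,
    "total_steps": 3,
    "next_step_required": True,
    "findings": "Initial survey only; token refresh and lockout paths look under-tested.",
    "relevant_files": ["/absolute/path/to/auth/service.py"],  # step 1 must name the code to test
    "use_assistant_model": False,  # skip external models completely and use Claude only
}
```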
committed by GitHub
parent 4dae6e457e
commit 69a3121452
890 tools/testgen.py
@@ -1,67 +1,155 @@
|
||||
"""
|
||||
TestGen tool - Comprehensive test suite generation with edge case coverage
|
||||
TestGen Workflow tool - Step-by-step test generation with expert validation
|
||||
|
||||
This tool generates comprehensive test suites by analyzing code paths,
|
||||
identifying edge cases, and producing test scaffolding that follows
|
||||
project conventions when test examples are provided.
|
||||
This tool provides a structured workflow for comprehensive test generation.
|
||||
It guides Claude through systematic investigation steps with forced pauses between each step
|
||||
to ensure thorough code examination, test planning, and pattern identification before proceeding.
|
||||
The tool supports backtracking, finding updates, and expert analysis integration for
|
||||
comprehensive test suite generation.
|
||||
|
||||
Key Features:
|
||||
- Multi-file and directory support
|
||||
- Framework detection from existing tests
|
||||
- Edge case identification (nulls, boundaries, async issues, etc.)
|
||||
- Test pattern following when examples provided
|
||||
- Deterministic test example sampling for large test suites
|
||||
Key features:
|
||||
- Step-by-step test generation workflow with progress tracking
|
||||
- Context-aware file embedding (references during investigation, full content for analysis)
|
||||
- Automatic test pattern detection and framework identification
|
||||
- Expert analysis integration with external models for additional test suggestions
|
||||
- Support for edge case identification and comprehensive coverage
|
||||
- Confidence-based workflow optimization
|
||||
"""
|
||||
|
||||
import logging
|
||||
import os
|
||||
from typing import Any, Optional
|
||||
from typing import TYPE_CHECKING, Any, Optional
|
||||
|
||||
from pydantic import Field
|
||||
from pydantic import Field, model_validator
|
||||
|
||||
if TYPE_CHECKING:
|
||||
from tools.models import ToolModelCategory
|
||||
|
||||
from config import TEMPERATURE_ANALYTICAL
|
||||
from systemprompts import TESTGEN_PROMPT
|
||||
from tools.shared.base_models import WorkflowRequest
|
||||
|
||||
from .base import BaseTool, ToolRequest
|
||||
from .workflow.base import WorkflowTool
|
||||
|
||||
logger = logging.getLogger(__name__)
|
||||
|
||||
# Field descriptions to avoid duplication between Pydantic and JSON schema
|
||||
TESTGEN_FIELD_DESCRIPTIONS = {
|
||||
"files": "Code files or directories to generate tests for (must be FULL absolute paths to real files / folders - DO NOT SHORTEN)",
|
||||
"prompt": "Description of what to test, testing objectives, and specific scope/focus areas. Be specific about any "
|
||||
"particular component, module, class of function you would like to generate tests for.",
|
||||
"test_examples": (
|
||||
"Optional existing test files or directories to use as style/pattern reference (must be FULL absolute paths to real files / folders - DO NOT SHORTEN). "
|
||||
"If not provided, the tool will determine the best testing approach based on the code structure. "
|
||||
"For large test directories, only the smallest representative tests should be included to determine testing patterns. "
|
||||
"If similar tests exist for the code being tested, include those for the most relevant patterns."
|
||||
# Tool-specific field descriptions for test generation workflow
|
||||
TESTGEN_WORKFLOW_FIELD_DESCRIPTIONS = {
|
||||
"step": (
|
||||
"What to analyze or look for in this step. In step 1, describe what you want to test and begin forming an "
|
||||
"analytical approach after thinking carefully about what needs to be examined. Consider code structure, "
|
||||
"business logic, critical paths, edge cases, and potential failure modes. Map out the codebase structure, "
|
||||
"understand the functionality, and identify areas requiring test coverage. In later steps, continue exploring "
|
||||
"with precision and adapt your understanding as you uncover more insights about testable behaviors."
|
||||
),
|
||||
"step_number": (
|
||||
"The index of the current step in the test generation sequence, beginning at 1. Each step should build upon or "
|
||||
"revise the previous one."
|
||||
),
|
||||
"total_steps": (
|
||||
"Your current estimate for how many steps will be needed to complete the test generation analysis. "
|
||||
"Adjust as new findings emerge."
|
||||
),
|
||||
"next_step_required": (
|
||||
"Set to true if you plan to continue the investigation with another step. False means you believe the "
|
||||
"test generation analysis is complete and ready for expert validation."
|
||||
),
|
||||
"findings": (
|
||||
"Summarize everything discovered in this step about the code being tested. Include analysis of functionality, "
|
||||
"critical paths, edge cases, boundary conditions, error handling, async behavior, state management, and "
|
||||
"integration points. Be specific and avoid vague language—document what you now know about the code and "
|
||||
"what test scenarios are needed. IMPORTANT: Document both the happy paths and potential failure modes. "
|
||||
"Identify existing test patterns if examples were provided. In later steps, confirm or update past findings "
|
||||
"with additional evidence."
|
||||
),
|
||||
"files_checked": (
|
||||
"List all files (as absolute paths, do not clip or shrink file names) examined during the test generation "
|
||||
"investigation so far. Include even files ruled out or found to be unrelated, as this tracks your "
|
||||
"exploration path."
|
||||
),
|
||||
"relevant_files": (
|
||||
"Subset of files_checked (as full absolute paths) that contain code directly needing tests or are essential "
|
||||
"for understanding test requirements. Only list those that are directly tied to the functionality being tested. "
|
||||
"This could include implementation files, interfaces, dependencies, or existing test examples."
|
||||
),
|
||||
"relevant_context": (
|
||||
"List methods, functions, classes, or modules that need test coverage, in the format "
|
||||
"'ClassName.methodName', 'functionName', or 'module.ClassName'. Prioritize critical business logic, "
|
||||
"public APIs, complex algorithms, and error-prone code paths."
|
||||
),
|
||||
"confidence": (
|
||||
"Indicate your current confidence in the test generation assessment. Use: 'exploring' (starting analysis), "
|
||||
"'low' (early investigation), 'medium' (some patterns identified), 'high' (strong understanding), 'certain' "
|
||||
"(only when the test plan is thoroughly complete and all test scenarios are identified). Do NOT use 'certain' "
|
||||
"unless the test generation analysis is comprehensively complete; if not 100% sure, use 'high' instead. Using "
|
||||
"'certain' prevents additional expert analysis."
|
||||
),
|
||||
"backtrack_from_step": (
|
||||
"If an earlier finding or assessment needs to be revised or discarded, specify the step number from which to "
|
||||
"start over. Use this to acknowledge investigative dead ends and correct the course."
|
||||
),
|
||||
"images": (
|
||||
"Optional list of absolute paths to architecture diagrams, flow charts, or visual documentation that help "
|
||||
"understand the code structure and test requirements. Only include if they materially assist test planning."
|
||||
),
|
||||
}
|
||||
|
||||
|
||||
class TestGenerationRequest(ToolRequest):
|
||||
"""
|
||||
Request model for the test generation tool.
|
||||
class TestGenRequest(WorkflowRequest):
|
||||
"""Request model for test generation workflow investigation steps"""
|
||||
|
||||
This model defines all parameters that can be used to customize
|
||||
the test generation process, from selecting code files to providing
|
||||
test examples for style consistency.
|
||||
# Required fields for each investigation step
|
||||
step: str = Field(..., description=TESTGEN_WORKFLOW_FIELD_DESCRIPTIONS["step"])
|
||||
step_number: int = Field(..., description=TESTGEN_WORKFLOW_FIELD_DESCRIPTIONS["step_number"])
|
||||
total_steps: int = Field(..., description=TESTGEN_WORKFLOW_FIELD_DESCRIPTIONS["total_steps"])
|
||||
next_step_required: bool = Field(..., description=TESTGEN_WORKFLOW_FIELD_DESCRIPTIONS["next_step_required"])
|
||||
|
||||
# Investigation tracking fields
|
||||
findings: str = Field(..., description=TESTGEN_WORKFLOW_FIELD_DESCRIPTIONS["findings"])
|
||||
files_checked: list[str] = Field(
|
||||
default_factory=list, description=TESTGEN_WORKFLOW_FIELD_DESCRIPTIONS["files_checked"]
|
||||
)
|
||||
relevant_files: list[str] = Field(
|
||||
default_factory=list, description=TESTGEN_WORKFLOW_FIELD_DESCRIPTIONS["relevant_files"]
|
||||
)
|
||||
relevant_context: list[str] = Field(
|
||||
default_factory=list, description=TESTGEN_WORKFLOW_FIELD_DESCRIPTIONS["relevant_context"]
|
||||
)
|
||||
confidence: Optional[str] = Field("low", description=TESTGEN_WORKFLOW_FIELD_DESCRIPTIONS["confidence"])
|
||||
|
||||
# Optional backtracking field
|
||||
backtrack_from_step: Optional[int] = Field(
|
||||
None, description=TESTGEN_WORKFLOW_FIELD_DESCRIPTIONS["backtrack_from_step"]
|
||||
)
|
||||
|
||||
# Optional images for visual context
|
||||
images: Optional[list[str]] = Field(default=None, description=TESTGEN_WORKFLOW_FIELD_DESCRIPTIONS["images"])
|
||||
|
||||
# Override inherited fields to exclude them from schema (except model which needs to be available)
|
||||
temperature: Optional[float] = Field(default=None, exclude=True)
|
||||
thinking_mode: Optional[str] = Field(default=None, exclude=True)
|
||||
use_websearch: Optional[bool] = Field(default=None, exclude=True)
|
||||
|
||||
@model_validator(mode="after")
|
||||
def validate_step_one_requirements(self):
|
||||
"""Ensure step 1 has required relevant_files field."""
|
||||
if self.step_number == 1 and not self.relevant_files:
|
||||
raise ValueError("Step 1 requires 'relevant_files' field to specify code files to generate tests for")
|
||||
return self
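For illustration only (not part of this diff), a minimal sketch of a request that satisfies the step-1 validator above; the import path and all values are assumptions:

```python
# Sketch: omitting relevant_files at step 1 would raise ValueError in
# validate_step_one_requirements above. The import path is assumed.
from tools.testgen import TestGenRequest

request = TestGenRequest(
    step="Examine the parser entry points and outline testable behaviors.",
    step_number=1,
    total_steps=2,
    next_step_required=True,
    findings="Initial pass over parser.py; error-handling paths need coverage.",
    relevant_files=["/absolute/path/to/parser.py"],
)
```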
|
||||
|
||||
|
||||
class TestGenTool(WorkflowTool):
|
||||
"""
|
||||
Test Generation workflow tool for step-by-step test planning and expert validation.
|
||||
|
||||
This tool implements a structured test generation workflow that guides users through
|
||||
methodical investigation steps, ensuring thorough code examination, pattern identification,
|
||||
and test scenario planning before reaching conclusions. It supports complex testing scenarios
|
||||
including edge case identification, framework detection, and comprehensive coverage planning.
|
||||
"""
|
||||
|
||||
files: list[str] = Field(..., description=TESTGEN_FIELD_DESCRIPTIONS["files"])
|
||||
prompt: str = Field(..., description=TESTGEN_FIELD_DESCRIPTIONS["prompt"])
|
||||
test_examples: Optional[list[str]] = Field(None, description=TESTGEN_FIELD_DESCRIPTIONS["test_examples"])
|
||||
|
||||
|
||||
class TestGenerationTool(BaseTool):
|
||||
"""
|
||||
Test generation tool implementation.
|
||||
|
||||
This tool analyzes code to generate comprehensive test suites with
|
||||
edge case coverage, following existing test patterns when examples
|
||||
are provided.
|
||||
"""
|
||||
def __init__(self):
|
||||
super().__init__()
|
||||
self.initial_request = None
|
||||
|
||||
def get_name(self) -> str:
|
||||
return "testgen"
|
||||
@@ -75,390 +163,406 @@ class TestGenerationTool(BaseTool):
|
||||
"'Create tests for authentication error handling'. If user request is vague, either ask for "
|
||||
"clarification about specific components to test, or make focused scope decisions and explain them. "
|
||||
"Analyzes code paths, identifies realistic failure modes, and generates framework-specific tests. "
|
||||
"Supports test pattern following when examples are provided. "
|
||||
"Choose thinking_mode based on code complexity: 'low' for simple functions, "
|
||||
"'medium' for standard modules (default), 'high' for complex systems with many interactions, "
|
||||
"'max' for critical systems requiring exhaustive test coverage. "
|
||||
"Note: If you're not currently using a top-tier model such as Opus 4 or above, these tools can provide enhanced capabilities."
|
||||
"Supports test pattern following when examples are provided. Choose thinking_mode based on "
|
||||
"code complexity: 'low' for simple functions, 'medium' for standard modules (default), "
|
||||
"'high' for complex systems with many interactions, 'max' for critical systems requiring "
|
||||
"exhaustive test coverage. Note: If you're not currently using a top-tier model such as "
|
||||
"Opus 4 or above, these tools can provide enhanced capabilities."
|
||||
)
|
||||
|
||||
def get_input_schema(self) -> dict[str, Any]:
|
||||
schema = {
|
||||
"type": "object",
|
||||
"properties": {
|
||||
"files": {
|
||||
"type": "array",
|
||||
"items": {"type": "string"},
|
||||
"description": TESTGEN_FIELD_DESCRIPTIONS["files"],
|
||||
},
|
||||
"model": self.get_model_field_schema(),
|
||||
"prompt": {
|
||||
"type": "string",
|
||||
"description": TESTGEN_FIELD_DESCRIPTIONS["prompt"],
|
||||
},
|
||||
"test_examples": {
|
||||
"type": "array",
|
||||
"items": {"type": "string"},
|
||||
"description": TESTGEN_FIELD_DESCRIPTIONS["test_examples"],
|
||||
},
|
||||
"thinking_mode": {
|
||||
"type": "string",
|
||||
"enum": ["minimal", "low", "medium", "high", "max"],
|
||||
"description": "Thinking depth: minimal (0.5% of model max), low (8%), medium (33%), high (67%), max (100% of model max)",
|
||||
},
|
||||
"continuation_id": {
|
||||
"type": "string",
|
||||
"description": (
|
||||
"Thread continuation ID for multi-turn conversations. Can be used to continue conversations "
|
||||
"across different tools. Only provide this if continuing a previous conversation thread."
|
||||
),
|
||||
},
|
||||
},
|
||||
"required": ["files", "prompt"] + (["model"] if self.is_effective_auto_mode() else []),
|
||||
}
|
||||
|
||||
return schema
|
||||
|
||||
def get_system_prompt(self) -> str:
|
||||
return TESTGEN_PROMPT
|
||||
|
||||
def get_default_temperature(self) -> float:
|
||||
return TEMPERATURE_ANALYTICAL
|
||||
|
||||
# Line numbers are enabled by default from base class for precise targeting
|
||||
|
||||
def get_model_category(self):
|
||||
"""TestGen requires extended reasoning for comprehensive test analysis"""
|
||||
def get_model_category(self) -> "ToolModelCategory":
|
||||
"""Test generation requires thorough analysis and reasoning"""
|
||||
from tools.models import ToolModelCategory
|
||||
|
||||
return ToolModelCategory.EXTENDED_REASONING
|
||||
|
||||
def get_request_model(self):
|
||||
return TestGenerationRequest
|
||||
def get_workflow_request_model(self):
|
||||
"""Return the test generation workflow-specific request model."""
|
||||
return TestGenRequest
|
||||
|
||||
def _process_test_examples(
|
||||
self, test_examples: list[str], continuation_id: Optional[str], available_tokens: int = None
|
||||
) -> tuple[str, str]:
|
||||
"""
|
||||
Process test example files using available token budget for optimal sampling.
|
||||
def get_input_schema(self) -> dict[str, Any]:
|
||||
"""Generate input schema using WorkflowSchemaBuilder with test generation-specific overrides."""
|
||||
from .workflow.schema_builders import WorkflowSchemaBuilder
|
||||
|
||||
Args:
|
||||
test_examples: List of test file paths
|
||||
continuation_id: Continuation ID for filtering already embedded files
|
||||
available_tokens: Available token budget for test examples
|
||||
# Test generation workflow-specific field overrides
|
||||
testgen_field_overrides = {
|
||||
"step": {
|
||||
"type": "string",
|
||||
"description": TESTGEN_WORKFLOW_FIELD_DESCRIPTIONS["step"],
|
||||
},
|
||||
"step_number": {
|
||||
"type": "integer",
|
||||
"minimum": 1,
|
||||
"description": TESTGEN_WORKFLOW_FIELD_DESCRIPTIONS["step_number"],
|
||||
},
|
||||
"total_steps": {
|
||||
"type": "integer",
|
||||
"minimum": 1,
|
||||
"description": TESTGEN_WORKFLOW_FIELD_DESCRIPTIONS["total_steps"],
|
||||
},
|
||||
"next_step_required": {
|
||||
"type": "boolean",
|
||||
"description": TESTGEN_WORKFLOW_FIELD_DESCRIPTIONS["next_step_required"],
|
||||
},
|
||||
"findings": {
|
||||
"type": "string",
|
||||
"description": TESTGEN_WORKFLOW_FIELD_DESCRIPTIONS["findings"],
|
||||
},
|
||||
"files_checked": {
|
||||
"type": "array",
|
||||
"items": {"type": "string"},
|
||||
"description": TESTGEN_WORKFLOW_FIELD_DESCRIPTIONS["files_checked"],
|
||||
},
|
||||
"relevant_files": {
|
||||
"type": "array",
|
||||
"items": {"type": "string"},
|
||||
"description": TESTGEN_WORKFLOW_FIELD_DESCRIPTIONS["relevant_files"],
|
||||
},
|
||||
"confidence": {
|
||||
"type": "string",
|
||||
"enum": ["exploring", "low", "medium", "high", "certain"],
|
||||
"description": TESTGEN_WORKFLOW_FIELD_DESCRIPTIONS["confidence"],
|
||||
},
|
||||
"backtrack_from_step": {
|
||||
"type": "integer",
|
||||
"minimum": 1,
|
||||
"description": TESTGEN_WORKFLOW_FIELD_DESCRIPTIONS["backtrack_from_step"],
|
||||
},
|
||||
"images": {
|
||||
"type": "array",
|
||||
"items": {"type": "string"},
|
||||
"description": TESTGEN_WORKFLOW_FIELD_DESCRIPTIONS["images"],
|
||||
},
|
||||
}
|
||||
|
||||
Returns:
|
||||
tuple: (formatted_content, summary_note)
|
||||
"""
|
||||
logger.debug(f"[TESTGEN] Processing {len(test_examples)} test examples")
|
||||
|
||||
if not test_examples:
|
||||
logger.debug("[TESTGEN] No test examples provided")
|
||||
return "", ""
|
||||
|
||||
# Use existing file filtering to avoid duplicates in continuation
|
||||
examples_to_process = self.filter_new_files(test_examples, continuation_id)
|
||||
logger.debug(f"[TESTGEN] After filtering: {len(examples_to_process)} new test examples to process")
|
||||
|
||||
if not examples_to_process:
|
||||
logger.info(f"[TESTGEN] All {len(test_examples)} test examples already in conversation history")
|
||||
return "", ""
|
||||
|
||||
logger.debug(f"[TESTGEN] Processing {len(examples_to_process)} file paths")
|
||||
|
||||
# Calculate token budget for test examples (25% of available tokens, or fallback)
|
||||
if available_tokens:
|
||||
test_examples_budget = int(available_tokens * 0.25) # 25% for test examples
|
||||
logger.debug(
|
||||
f"[TESTGEN] Allocating {test_examples_budget:,} tokens (25% of {available_tokens:,}) for test examples"
|
||||
)
|
||||
else:
|
||||
test_examples_budget = 30000 # Fallback if no budget provided
|
||||
logger.debug(f"[TESTGEN] Using fallback budget of {test_examples_budget:,} tokens for test examples")
|
||||
|
||||
original_count = len(examples_to_process)
|
||||
logger.debug(
|
||||
f"[TESTGEN] Processing {original_count} test example files with {test_examples_budget:,} token budget"
|
||||
# Use WorkflowSchemaBuilder with test generation-specific tool fields
|
||||
return WorkflowSchemaBuilder.build_schema(
|
||||
tool_specific_fields=testgen_field_overrides,
|
||||
model_field_schema=self.get_model_field_schema(),
|
||||
auto_mode=self.is_effective_auto_mode(),
|
||||
tool_name=self.get_name(),
|
||||
)
|
||||
|
||||
# Sort by file size (smallest first) for pattern-focused selection
|
||||
file_sizes = []
|
||||
for file_path in examples_to_process:
|
||||
try:
|
||||
size = os.path.getsize(file_path)
|
||||
file_sizes.append((file_path, size))
|
||||
logger.debug(f"[TESTGEN] Test example {os.path.basename(file_path)}: {size:,} bytes")
|
||||
except (OSError, FileNotFoundError) as e:
|
||||
# If we can't get size, put it at the end
|
||||
logger.warning(f"[TESTGEN] Could not get size for {file_path}: {e}")
|
||||
file_sizes.append((file_path, float("inf")))
|
||||
|
||||
# Sort by size and take smallest files for pattern reference
|
||||
file_sizes.sort(key=lambda x: x[1])
|
||||
examples_to_process = [f[0] for f in file_sizes] # All files, sorted by size
|
||||
logger.debug(
|
||||
f"[TESTGEN] Sorted test examples by size (smallest first): {[os.path.basename(f) for f in examples_to_process]}"
|
||||
)
|
||||
|
||||
# Use standard file content preparation with dynamic token budget
|
||||
try:
|
||||
logger.debug(f"[TESTGEN] Preparing file content for {len(examples_to_process)} test examples")
|
||||
content, processed_files = self._prepare_file_content_for_prompt(
|
||||
examples_to_process,
|
||||
continuation_id,
|
||||
"Test examples",
|
||||
max_tokens=test_examples_budget,
|
||||
reserve_tokens=1000,
|
||||
)
|
||||
# Store processed files for tracking - test examples are tracked separately from main code files
|
||||
|
||||
# Determine how many files were actually included
|
||||
if content:
|
||||
from utils.token_utils import estimate_tokens
|
||||
|
||||
used_tokens = estimate_tokens(content)
|
||||
logger.info(
|
||||
f"[TESTGEN] Successfully embedded test examples: {used_tokens:,} tokens used ({test_examples_budget:,} available)"
|
||||
)
|
||||
if original_count > 1:
|
||||
truncation_note = f"Note: Used {used_tokens:,} tokens ({test_examples_budget:,} available) for test examples from {original_count} files to determine testing patterns."
|
||||
else:
|
||||
truncation_note = ""
|
||||
else:
|
||||
logger.warning("[TESTGEN] No content generated for test examples")
|
||||
truncation_note = ""
|
||||
|
||||
return content, truncation_note
|
||||
|
||||
except Exception as e:
|
||||
# If test example processing fails, continue without examples rather than failing
|
||||
logger.error(f"[TESTGEN] Failed to process test examples: {type(e).__name__}: {e}")
|
||||
return "", f"Warning: Could not process test examples: {str(e)}"
|
||||
|
||||
async def prepare_prompt(self, request: TestGenerationRequest) -> str:
|
||||
"""
|
||||
Prepare the test generation prompt with code analysis and optional test examples.
|
||||
|
||||
This method reads the requested files, processes any test examples,
|
||||
and constructs a detailed prompt for comprehensive test generation.
|
||||
|
||||
Args:
|
||||
request: The validated test generation request
|
||||
|
||||
Returns:
|
||||
str: Complete prompt for the model
|
||||
|
||||
Raises:
|
||||
ValueError: If the code exceeds token limits
|
||||
"""
|
||||
logger.debug(f"[TESTGEN] Preparing prompt for {len(request.files)} code files")
|
||||
if request.test_examples:
|
||||
logger.debug(f"[TESTGEN] Including {len(request.test_examples)} test examples for pattern reference")
|
||||
# Check for prompt.txt in files
|
||||
prompt_content, updated_files = self.handle_prompt_file(request.files)
|
||||
|
||||
# If prompt.txt was found, incorporate it into the prompt
|
||||
if prompt_content:
|
||||
logger.debug("[TESTGEN] Found prompt.txt file, incorporating content")
|
||||
request.prompt = prompt_content + "\n\n" + request.prompt
|
||||
|
||||
# Update request files list
|
||||
if updated_files is not None:
|
||||
logger.debug(f"[TESTGEN] Updated files list after prompt.txt processing: {len(updated_files)} files")
|
||||
request.files = updated_files
|
||||
|
||||
# Check user input size at MCP transport boundary (before adding internal content)
|
||||
user_content = request.prompt
|
||||
size_check = self.check_prompt_size(user_content)
|
||||
if size_check:
|
||||
from tools.models import ToolOutput
|
||||
|
||||
raise ValueError(f"MCP_SIZE_CHECK:{ToolOutput(**size_check).model_dump_json()}")
|
||||
|
||||
# Calculate available token budget for dynamic allocation
|
||||
continuation_id = getattr(request, "continuation_id", None)
|
||||
|
||||
# Get model context for token budget calculation
|
||||
available_tokens = None
|
||||
|
||||
if hasattr(self, "_model_context") and self._model_context:
|
||||
try:
|
||||
capabilities = self._model_context.capabilities
|
||||
# Use 75% of context for content (code + test examples), 25% for response
|
||||
available_tokens = int(capabilities.context_window * 0.75)
|
||||
logger.debug(
|
||||
f"[TESTGEN] Token budget calculation: {available_tokens:,} tokens (75% of {capabilities.context_window:,}) for model {self._model_context.model_name}"
|
||||
)
|
||||
except Exception as e:
|
||||
# Fallback to conservative estimate
|
||||
logger.warning(f"[TESTGEN] Could not get model capabilities: {e}")
|
||||
available_tokens = 120000 # Conservative fallback
|
||||
logger.debug(f"[TESTGEN] Using fallback token budget: {available_tokens:,} tokens")
|
||||
def get_required_actions(self, step_number: int, confidence: str, findings: str, total_steps: int) -> list[str]:
|
||||
"""Define required actions for each investigation phase."""
|
||||
if step_number == 1:
|
||||
# Initial test generation investigation tasks
|
||||
return [
|
||||
"Read and understand the code files specified for test generation",
|
||||
"Analyze the overall structure, public APIs, and main functionality",
|
||||
"Identify critical business logic and complex algorithms that need testing",
|
||||
"Look for existing test patterns or examples if provided",
|
||||
"Understand dependencies, external interactions, and integration points",
|
||||
"Note any potential testability issues or areas that might be hard to test",
|
||||
]
|
||||
elif confidence in ["exploring", "low"]:
|
||||
# Need deeper investigation
|
||||
return [
|
||||
"Examine specific functions and methods to understand their behavior",
|
||||
"Trace through code paths to identify all possible execution flows",
|
||||
"Identify edge cases, boundary conditions, and error scenarios",
|
||||
"Check for async operations, state management, and side effects",
|
||||
"Look for non-deterministic behavior or external dependencies",
|
||||
"Analyze error handling and exception cases that need testing",
|
||||
]
|
||||
elif confidence in ["medium", "high"]:
|
||||
# Close to completion - need final verification
|
||||
return [
|
||||
"Verify all critical paths have been identified for testing",
|
||||
"Confirm edge cases and boundary conditions are comprehensive",
|
||||
"Check that test scenarios cover both success and failure cases",
|
||||
"Ensure async behavior and concurrency issues are addressed",
|
||||
"Validate that the testing strategy aligns with code complexity",
|
||||
"Double-check that findings include actionable test scenarios",
|
||||
]
|
||||
else:
|
||||
# No model context available (shouldn't happen in normal flow)
|
||||
available_tokens = 120000 # Conservative fallback
|
||||
logger.debug(f"[TESTGEN] No model context, using fallback token budget: {available_tokens:,} tokens")
|
||||
|
||||
# Process test examples first to determine token allocation
|
||||
test_examples_content = ""
|
||||
test_examples_note = ""
|
||||
|
||||
if request.test_examples:
|
||||
logger.debug(f"[TESTGEN] Processing {len(request.test_examples)} test examples")
|
||||
test_examples_content, test_examples_note = self._process_test_examples(
|
||||
request.test_examples, continuation_id, available_tokens
|
||||
)
|
||||
if test_examples_content:
|
||||
logger.info("[TESTGEN] Test examples processed successfully for pattern reference")
|
||||
else:
|
||||
logger.info("[TESTGEN] No test examples content after processing")
|
||||
|
||||
# Remove files that appear in both 'files' and 'test_examples' to avoid duplicate embedding
|
||||
# Files in test_examples take precedence as they're used for pattern reference
|
||||
code_files_to_process = request.files.copy()
|
||||
if request.test_examples:
|
||||
# Normalize paths for comparison (resolve any relative paths, handle case sensitivity)
|
||||
test_example_set = {os.path.normpath(os.path.abspath(f)) for f in request.test_examples}
|
||||
original_count = len(code_files_to_process)
|
||||
|
||||
code_files_to_process = [
|
||||
f for f in code_files_to_process if os.path.normpath(os.path.abspath(f)) not in test_example_set
|
||||
# General investigation needed
|
||||
return [
|
||||
"Continue examining the codebase for additional test scenarios",
|
||||
"Gather more evidence about code behavior and dependencies",
|
||||
"Test your assumptions about how the code should be tested",
|
||||
"Look for patterns that confirm your testing strategy",
|
||||
"Focus on areas that haven't been thoroughly examined yet",
|
||||
]
|
||||
|
||||
duplicates_removed = original_count - len(code_files_to_process)
|
||||
if duplicates_removed > 0:
|
||||
logger.info(
|
||||
f"[TESTGEN] Removed {duplicates_removed} duplicate files from code files list "
|
||||
f"(already included in test examples for pattern reference)"
|
||||
)
|
||||
def should_call_expert_analysis(self, consolidated_findings, request=None) -> bool:
|
||||
"""
|
||||
Decide when to call external model based on investigation completeness.
|
||||
|
||||
# Calculate remaining tokens for main code after test examples
|
||||
if test_examples_content and available_tokens:
|
||||
from utils.token_utils import estimate_tokens
|
||||
Always call expert analysis for test generation to get additional test ideas.
|
||||
"""
|
||||
# Check if user requested to skip assistant model
|
||||
if request and not self.get_request_use_assistant_model(request):
|
||||
return False
|
||||
|
||||
test_tokens = estimate_tokens(test_examples_content)
|
||||
remaining_tokens = available_tokens - test_tokens - 5000 # Reserve for prompt structure
|
||||
logger.debug(
|
||||
f"[TESTGEN] Token allocation: {test_tokens:,} for examples, {remaining_tokens:,} remaining for code files"
|
||||
# Always benefit from expert analysis for comprehensive test coverage
|
||||
return len(consolidated_findings.relevant_files) > 0 or len(consolidated_findings.findings) >= 1
|
||||
|
||||
def prepare_expert_analysis_context(self, consolidated_findings) -> str:
|
||||
"""Prepare context for external model call for test generation validation."""
|
||||
context_parts = [
|
||||
f"=== TEST GENERATION REQUEST ===\n{self.initial_request or 'Test generation workflow initiated'}\n=== END REQUEST ==="
|
||||
]
|
||||
|
||||
# Add investigation summary
|
||||
investigation_summary = self._build_test_generation_summary(consolidated_findings)
|
||||
context_parts.append(
|
||||
f"\n=== CLAUDE'S TEST PLANNING INVESTIGATION ===\n{investigation_summary}\n=== END INVESTIGATION ==="
|
||||
)
|
||||
|
||||
# Add relevant code elements if available
|
||||
if consolidated_findings.relevant_context:
|
||||
methods_text = "\n".join(f"- {method}" for method in consolidated_findings.relevant_context)
|
||||
context_parts.append(f"\n=== CODE ELEMENTS TO TEST ===\n{methods_text}\n=== END CODE ELEMENTS ===")
|
||||
|
||||
# Add images if available
|
||||
if consolidated_findings.images:
|
||||
images_text = "\n".join(f"- {img}" for img in consolidated_findings.images)
|
||||
context_parts.append(f"\n=== VISUAL DOCUMENTATION ===\n{images_text}\n=== END VISUAL DOCUMENTATION ===")
|
||||
|
||||
return "\n".join(context_parts)
|
||||
|
||||
def _build_test_generation_summary(self, consolidated_findings) -> str:
|
||||
"""Prepare a comprehensive summary of the test generation investigation."""
|
||||
summary_parts = [
|
||||
"=== SYSTEMATIC TEST GENERATION INVESTIGATION SUMMARY ===",
|
||||
f"Total steps: {len(consolidated_findings.findings)}",
|
||||
f"Files examined: {len(consolidated_findings.files_checked)}",
|
||||
f"Relevant files identified: {len(consolidated_findings.relevant_files)}",
|
||||
f"Code elements to test: {len(consolidated_findings.relevant_context)}",
|
||||
"",
|
||||
"=== INVESTIGATION PROGRESSION ===",
|
||||
]
|
||||
|
||||
for finding in consolidated_findings.findings:
|
||||
summary_parts.append(finding)
|
||||
|
||||
return "\n".join(summary_parts)
|
||||
|
||||
def should_include_files_in_expert_prompt(self) -> bool:
|
||||
"""Include files in expert analysis for comprehensive test generation."""
|
||||
return True
|
||||
|
||||
def should_embed_system_prompt(self) -> bool:
|
||||
"""Embed system prompt in expert analysis for proper context."""
|
||||
return True
|
||||
|
||||
def get_expert_thinking_mode(self) -> str:
|
||||
"""Use high thinking mode for thorough test generation analysis."""
|
||||
return "high"
|
||||
|
||||
def get_expert_analysis_instruction(self) -> str:
|
||||
"""Get specific instruction for test generation expert analysis."""
|
||||
return (
|
||||
"Please provide comprehensive test generation guidance based on the investigation findings. "
|
||||
"Focus on identifying additional test scenarios, edge cases not yet covered, framework-specific "
|
||||
"best practices, and providing concrete test implementation examples following the multi-agent "
|
||||
"workflow specified in the system prompt."
|
||||
)
|
||||
|
||||
# Hook method overrides for test generation-specific behavior
|
||||
|
||||
def prepare_step_data(self, request) -> dict:
|
||||
"""
|
||||
Map test generation-specific fields for internal processing.
|
||||
"""
|
||||
step_data = {
|
||||
"step": request.step,
|
||||
"step_number": request.step_number,
|
||||
"findings": request.findings,
|
||||
"files_checked": request.files_checked,
|
||||
"relevant_files": request.relevant_files,
|
||||
"relevant_context": request.relevant_context,
|
||||
"confidence": request.confidence,
|
||||
"images": request.images or [],
|
||||
}
|
||||
return step_data
|
||||
|
||||
def should_skip_expert_analysis(self, request, consolidated_findings) -> bool:
|
||||
"""
|
||||
Test generation workflow skips expert analysis when Claude has "certain" confidence.
|
||||
"""
|
||||
return request.confidence == "certain" and not request.next_step_required
|
||||
|
||||
def store_initial_issue(self, step_description: str):
|
||||
"""Store initial request for expert analysis."""
|
||||
self.initial_request = step_description
|
||||
|
||||
# Override inheritance hooks for test generation-specific behavior
|
||||
|
||||
def get_completion_status(self) -> str:
|
||||
"""Test generation tools use test-specific status."""
|
||||
return "test_generation_complete_ready_for_implementation"
|
||||
|
||||
def get_completion_data_key(self) -> str:
|
||||
"""Test generation uses 'complete_test_generation' key."""
|
||||
return "complete_test_generation"
|
||||
|
||||
def get_final_analysis_from_request(self, request):
|
||||
"""Test generation tools use findings for final analysis."""
|
||||
return request.findings
|
||||
|
||||
def get_confidence_level(self, request) -> str:
|
||||
"""Test generation tools use 'certain' for high confidence."""
|
||||
return "certain"
|
||||
|
||||
def get_completion_message(self) -> str:
|
||||
"""Test generation-specific completion message."""
|
||||
return (
|
||||
"Test generation analysis complete with CERTAIN confidence. You have identified all test scenarios "
|
||||
"and provided comprehensive coverage strategy. MANDATORY: Present the user with the complete test plan "
|
||||
"and IMMEDIATELY proceed with creating the test files following the identified patterns and framework. "
|
||||
"Focus on implementing concrete, runnable tests with proper assertions."
|
||||
)
|
||||
|
||||
def get_skip_reason(self) -> str:
|
||||
"""Test generation-specific skip reason."""
|
||||
return "Claude completed comprehensive test planning with full confidence"
|
||||
|
||||
def get_skip_expert_analysis_status(self) -> str:
|
||||
"""Test generation-specific expert analysis skip status."""
|
||||
return "skipped_due_to_certain_test_confidence"
|
||||
|
||||
def prepare_work_summary(self) -> str:
|
||||
"""Test generation-specific work summary."""
|
||||
return self._build_test_generation_summary(self.consolidated_findings)
|
||||
|
||||
def get_completion_next_steps_message(self, expert_analysis_used: bool = False) -> str:
|
||||
"""
|
||||
Test generation-specific completion message.
|
||||
"""
|
||||
base_message = (
|
||||
"TEST GENERATION ANALYSIS IS COMPLETE. You MUST now implement ALL identified test scenarios, "
|
||||
"creating comprehensive test files that cover happy paths, edge cases, error conditions, and "
|
||||
"boundary scenarios. Organize tests by functionality, use appropriate assertions, and follow "
|
||||
"the identified framework patterns. Provide concrete, executable test code—make it easy for "
|
||||
"a developer to run the tests and understand what each test validates."
|
||||
)
|
||||
|
||||
# Add expert analysis guidance only when expert analysis was actually used
|
||||
if expert_analysis_used:
|
||||
expert_guidance = self.get_expert_analysis_guidance()
|
||||
if expert_guidance:
|
||||
return f"{base_message}\n\n{expert_guidance}"
|
||||
|
||||
return base_message
|
||||
|
||||
def get_expert_analysis_guidance(self) -> str:
|
||||
"""
|
||||
Provide specific guidance for handling expert analysis in test generation.
|
||||
"""
|
||||
return (
|
||||
"IMPORTANT: Additional test scenarios and edge cases have been provided by the expert analysis above. "
|
||||
"You MUST incorporate these suggestions into your test implementation, ensuring comprehensive coverage. "
|
||||
"Validate that the expert's test ideas are practical and align with the codebase structure. Combine "
|
||||
"your systematic investigation findings with the expert's additional scenarios to create a thorough "
|
||||
"test suite that catches real-world bugs before they reach production."
|
||||
)
|
||||
|
||||
def get_step_guidance_message(self, request) -> str:
|
||||
"""
|
||||
Test generation-specific step guidance with detailed investigation instructions.
|
||||
"""
|
||||
step_guidance = self.get_test_generation_step_guidance(request.step_number, request.confidence, request)
|
||||
return step_guidance["next_steps"]
|
||||
|
||||
def get_test_generation_step_guidance(self, step_number: int, confidence: str, request) -> dict[str, Any]:
|
||||
"""
|
||||
Provide step-specific guidance for test generation workflow.
|
||||
"""
|
||||
# Generate the next steps instruction based on required actions
|
||||
required_actions = self.get_required_actions(step_number, confidence, request.findings, request.total_steps)
|
||||
|
||||
if step_number == 1:
|
||||
next_steps = (
|
||||
f"MANDATORY: DO NOT call the {self.get_name()} tool again immediately. You MUST first analyze "
|
||||
f"the code thoroughly using appropriate tools. CRITICAL AWARENESS: You need to understand "
|
||||
f"the code structure, identify testable behaviors, find edge cases and boundary conditions, "
|
||||
f"and determine the appropriate testing strategy. Use file reading tools, code analysis, and "
|
||||
f"systematic examination to gather comprehensive information about what needs to be tested. "
|
||||
f"Only call {self.get_name()} again AFTER completing your investigation. When you call "
|
||||
f"{self.get_name()} next time, use step_number: {step_number + 1} and report specific "
|
||||
f"code paths examined, test scenarios identified, and testing patterns discovered."
|
||||
)
|
||||
elif confidence in ["exploring", "low"]:
|
||||
next_steps = (
|
||||
f"STOP! Do NOT call {self.get_name()} again yet. Based on your findings, you've identified areas that need "
|
||||
f"deeper analysis for test generation. MANDATORY ACTIONS before calling {self.get_name()} step {step_number + 1}:\n"
|
||||
+ "\n".join(f"{i+1}. {action}" for i, action in enumerate(required_actions))
|
||||
+ f"\n\nOnly call {self.get_name()} again with step_number: {step_number + 1} AFTER "
|
||||
+ "completing these test planning tasks."
|
||||
)
|
||||
elif confidence in ["medium", "high"]:
|
||||
next_steps = (
|
||||
f"WAIT! Your test generation analysis needs final verification. DO NOT call {self.get_name()} immediately. REQUIRED ACTIONS:\n"
|
||||
+ "\n".join(f"{i+1}. {action}" for i, action in enumerate(required_actions))
|
||||
+ f"\n\nREMEMBER: Ensure you have identified all test scenarios including edge cases and error conditions. "
|
||||
f"Document findings with specific test cases to implement, then call {self.get_name()} "
|
||||
f"with step_number: {step_number + 1}."
|
||||
)
|
||||
else:
|
||||
remaining_tokens = available_tokens - 10000 if available_tokens else None
|
||||
if remaining_tokens:
|
||||
logger.debug(
|
||||
f"[TESTGEN] Token allocation: {remaining_tokens:,} tokens available for code files (no test examples)"
|
||||
)
|
||||
|
||||
# Use centralized file processing logic for main code files (after deduplication)
|
||||
logger.debug(f"[TESTGEN] Preparing {len(code_files_to_process)} code files for analysis")
|
||||
code_content, processed_files = self._prepare_file_content_for_prompt(
|
||||
code_files_to_process, continuation_id, "Code to test", max_tokens=remaining_tokens, reserve_tokens=2000
|
||||
)
|
||||
self._actually_processed_files = processed_files
|
||||
|
||||
if code_content:
|
||||
from utils.token_utils import estimate_tokens
|
||||
|
||||
code_tokens = estimate_tokens(code_content)
|
||||
logger.info(f"[TESTGEN] Code files embedded successfully: {code_tokens:,} tokens")
|
||||
else:
|
||||
logger.warning("[TESTGEN] No code content after file processing")
|
||||
|
||||
# Test generation is based on code analysis, no web search needed
|
||||
logger.debug("[TESTGEN] Building complete test generation prompt")
|
||||
|
||||
# Build the complete prompt
|
||||
prompt_parts = []
|
||||
|
||||
# Add system prompt
|
||||
prompt_parts.append(self.get_system_prompt())
|
||||
|
||||
# Add user context
|
||||
prompt_parts.append("=== USER CONTEXT ===")
|
||||
prompt_parts.append(request.prompt)
|
||||
prompt_parts.append("=== END CONTEXT ===")
|
||||
|
||||
# Add test examples if provided
|
||||
if test_examples_content:
|
||||
prompt_parts.append("\n=== TEST EXAMPLES FOR STYLE REFERENCE ===")
|
||||
if test_examples_note:
|
||||
prompt_parts.append(f"// {test_examples_note}")
|
||||
prompt_parts.append(test_examples_content)
|
||||
prompt_parts.append("=== END TEST EXAMPLES ===")
|
||||
|
||||
# Add main code to test
|
||||
prompt_parts.append("\n=== CODE TO TEST ===")
|
||||
prompt_parts.append(code_content)
|
||||
prompt_parts.append("=== END CODE ===")
|
||||
|
||||
# Add generation instructions
|
||||
prompt_parts.append(
|
||||
"\nPlease analyze the code and generate comprehensive tests following the multi-agent workflow specified in the system prompt."
|
||||
)
|
||||
if test_examples_content:
|
||||
prompt_parts.append(
|
||||
"Use the provided test examples as a reference for style, framework, and testing patterns."
|
||||
next_steps = (
|
||||
f"PAUSE ANALYSIS. Before calling {self.get_name()} step {step_number + 1}, you MUST examine more code thoroughly. "
|
||||
+ "Required: "
|
||||
+ ", ".join(required_actions[:2])
|
||||
+ ". "
|
||||
+ f"Your next {self.get_name()} call (step_number: {step_number + 1}) must include "
|
||||
f"NEW test scenarios from actual code analysis, not just theories. NO recursive {self.get_name()} calls "
|
||||
f"without investigation work!"
|
||||
)
|
||||
|
||||
full_prompt = "\n".join(prompt_parts)
|
||||
return {"next_steps": next_steps}
|
||||
|
||||
# Log final prompt statistics
|
||||
from utils.token_utils import estimate_tokens
|
||||
|
||||
total_tokens = estimate_tokens(full_prompt)
|
||||
logger.info(f"[TESTGEN] Complete prompt prepared: {total_tokens:,} tokens, {len(full_prompt):,} characters")
|
||||
|
||||
return full_prompt
|
||||
|
||||
def format_response(self, response: str, request: TestGenerationRequest, model_info: Optional[dict] = None) -> str:
|
||||
def customize_workflow_response(self, response_data: dict, request) -> dict:
|
||||
"""
|
||||
Format the test generation response.
|
||||
|
||||
Args:
|
||||
response: The raw test generation from the model
|
||||
request: The original request for context
|
||||
model_info: Optional dict with model metadata
|
||||
|
||||
Returns:
|
||||
str: Formatted response with next steps
|
||||
Customize response to match test generation workflow format.
|
||||
"""
|
||||
return f"""{response}
|
||||
# Store initial request on first step
|
||||
if request.step_number == 1:
|
||||
self.initial_request = request.step
|
||||
|
||||
---
|
||||
# Convert generic status names to test generation-specific ones
|
||||
tool_name = self.get_name()
|
||||
status_mapping = {
|
||||
f"{tool_name}_in_progress": "test_generation_in_progress",
|
||||
f"pause_for_{tool_name}": "pause_for_test_analysis",
|
||||
f"{tool_name}_required": "test_analysis_required",
|
||||
f"{tool_name}_complete": "test_generation_complete",
|
||||
}
|
||||
|
||||
Claude, you are now in EXECUTION MODE. Take immediate action:
|
||||
if response_data["status"] in status_mapping:
|
||||
response_data["status"] = status_mapping[response_data["status"]]
|
||||
|
||||
## Step 1: THINK & CREATE TESTS
|
||||
ULTRATHINK while creating these in order to verify that every code reference, import, function name, and logic path is
|
||||
100% accurate before saving.
|
||||
# Rename status field to match test generation workflow
|
||||
if f"{tool_name}_status" in response_data:
|
||||
response_data["test_generation_status"] = response_data.pop(f"{tool_name}_status")
|
||||
# Add test generation-specific status fields
|
||||
response_data["test_generation_status"]["test_scenarios_identified"] = len(
|
||||
self.consolidated_findings.relevant_context
|
||||
)
|
||||
response_data["test_generation_status"]["analysis_confidence"] = self.get_request_confidence(request)
|
||||
|
||||
- CREATE all test files in the correct project structure
|
||||
- SAVE each test using proper naming conventions
|
||||
- VALIDATE all imports, references, and dependencies are correct as required by the current framework / project / file
|
||||
# Map complete_testgen to complete_test_generation
|
||||
if f"complete_{tool_name}" in response_data:
|
||||
response_data["complete_test_generation"] = response_data.pop(f"complete_{tool_name}")
|
||||
|
||||
## Step 2: DISPLAY RESULTS TO USER
|
||||
After creating each test file, MUST show the user:
|
||||
```
|
||||
✅ Created: path/to/test_file.py
|
||||
- test_function_name(): Brief description of what it tests
|
||||
- test_another_function(): Brief description
|
||||
- [Total: X test functions]
|
||||
```
|
||||
# Map the completion flag to match test generation workflow
|
||||
if f"{tool_name}_complete" in response_data:
|
||||
response_data["test_generation_complete"] = response_data.pop(f"{tool_name}_complete")
|
||||
|
||||
## Step 3: VALIDATE BY EXECUTION
|
||||
CRITICAL: Run the tests immediately to confirm they work:
|
||||
- Install any missing dependencies first or request user to perform step if this cannot be automated
|
||||
- Execute the test suite
|
||||
- Fix any failures or errors
|
||||
- Confirm 100% pass rate. If there's a failure, re-iterate, go over each test, validate and understand why it's failing
|
||||
return response_data
|
||||
|
||||
## Step 4: INTEGRATION VERIFICATION
|
||||
- Verify tests integrate with existing test infrastructure
|
||||
- Confirm test discovery works
|
||||
- Validate test naming and organization
|
||||
# Required abstract methods from BaseTool
|
||||
def get_request_model(self):
|
||||
"""Return the test generation workflow-specific request model."""
|
||||
return TestGenRequest
|
||||
|
||||
## Step 5: MOVE TO NEXT ACTION
|
||||
Once tests are confirmed working, immediately proceed to the next logical step for the project.
|
||||
|
||||
MANDATORY: Do NOT stop after generating - you MUST create, validate, run, and confirm the tests work and all of the
|
||||
steps listed above are carried out correctly. Take full ownership of the testing implementation and move to your
|
||||
next work. If you were supplied a more_work_required request in the response above, you MUST honor it."""
|
||||
async def prepare_prompt(self, request) -> str:
|
||||
"""Not used - workflow tools use execute_workflow()."""
|
||||
return "" # Workflow tools use execute_workflow() directly
|
||||
|
||||