Improved model response handling to support additional response statuses in the future

Improved testgen; encourages follow-ups with less work in between and less token generation, to avoid exceeding the 25K limit
Improved codereview tool to request a focused code review where a single-pass code review would be too large or complex
Fahad
2025-06-14 18:43:56 +04:00
parent ec5fee4409
commit 442decba70
8 changed files with 383 additions and 31 deletions
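As a rough sketch of what "handling additional response statuses" could look like on the caller side, the snippet below branches on a parsed JSON status object. The function name, the fall-through behaviour, and everything other than the two statuses introduced in this commit are assumptions, not the actual implementation.

```python
def dispatch_status(payload: dict | None) -> str:
    """Minimal sketch: decide what to do with a parsed JSON status object (assumed shape)."""
    if not payload or "status" not in payload:
        return "plain response - use as-is"
    status = payload["status"]
    if status == "focused_review_required":      # added for the codereview tool
        return f"re-run review on: {payload.get('suggestion', 'a smaller subset')}"
    if status == "more_tests_required":          # added for the testgen tool
        return f"request follow-up tests for: {payload.get('pending_tests', '')}"
    return f"unrecognized status '{status}'"     # future statuses slot in here
```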

View File

@@ -64,5 +64,12 @@ After listing issues, add:
• **Top 3 priority fixes** (quick bullets)
• **Positive aspects** worth retaining
IF SCOPE TOO LARGE FOR FOCUSED REVIEW
If the codebase is too large or complex to review effectively in a single response, you MUST request Claude to
provide smaller, more focused subsets for review. Respond ONLY with this JSON format (and nothing else):
{"status": "focused_review_required",
"reason": "<brief explanation of why the scope is too large>",
"suggestion": "<e.g., 'Review authentication module (auth.py, login.py)' or 'Focus on data layer (models/)' or 'Review payment processing functionality'>"}
Remember: If required information is missing, use the clarification JSON above instead of guessing.
"""

View File

@@ -24,7 +24,8 @@ test idioms from the code snapshot provided.
that are directly involved (network, DB, file-system, IPC).
3. **Adversarial Thinker** enumerates realistic failures, boundary conditions, race conditions, and misuse patterns
that historically break similar systems.
4. **Risk Prioritizer** ranks findings by production impact and likelihood; discards speculative or out-of-scope cases.
5. **Test Scaffolder** produces deterministic, isolated tests that follow the *project's* conventions (assert style,
fixture layout, naming, any mocking strategy, language and tooling etc).
@@ -41,6 +42,7 @@ pure functions).
- Surface concurrency hazards with stress or fuzz tests when the language/runtime supports them.
- Focus on realistic failure modes that actually occur in production
- Remain within scope of language, framework, project. Do not over-step. Do not add unnecessary dependencies.
- No bogus, fake tests that seemingly pass for no reason at all
EDGE-CASE TAXONOMY (REAL-WORLD, HIGH-VALUE)
- **Data Shape Issues**: `null` / `undefined`, zero-length, surrogate-pair emojis, malformed UTF-8, mixed EOLs.
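As an illustration of the kind of data-shape test this taxonomy points at, the snippet below uses a stand-in `normalize_text` function; both the function and the expected behaviour are assumptions for the example, not project code.

```python
import pytest

def normalize_text(raw):
    """Stand-in for the real function under test (assumption for this example)."""
    return (raw or "").replace("\r\n", "\n")

# Hypothesis: text normalization must degrade gracefully on the shapes that most often
# arrive from user input - empty strings, missing values, emoji, and mixed line endings.
@pytest.mark.parametrize("raw", [
    "",                        # zero-length input
    None,                      # null / missing value
    "caf\u00e9 \U0001F600",    # non-ASCII plus a surrogate-pair emoji
    "line1\r\nline2\nline3",   # mixed EOLs
])
def test_normalize_text_handles_awkward_shapes(raw):
    result = normalize_text(raw)
    assert isinstance(result, str)      # never raises, always returns text
    assert "\r\n" not in result         # line endings are normalized
```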
@@ -93,8 +95,27 @@ it but do not approach or offer refactoring ideas.
DELIVERABLE
Return only the artefacts (analysis summary, coverage plan, and generated tests) that fit the detected framework
and code / project layout.
No extra commentary, no generic boilerplate.
Group related tests but separate them into files where this is the convention and most suitable for the project at hand.
Prefer adding tests to an existing test file if one was provided and grouping these tests makes sense.
Must document logic, test reason/hypothesis in delivered code.
MUST NOT add any additional information, introduction, or summaries around generated code. Deliver only the essentials
relevant to the test.
IF ADDITIONAL TEST CASES ARE REQUIRED
If you determine that comprehensive test coverage requires generating multiple test files or a large number of
test cases per file, and that this would risk exceeding context limits, you MUST follow this structured approach:
1. **Generate Essential Tests First**: Create only the most critical and high-impact tests (typically 3-5 key test
cases covering the most important paths and failure modes). Clearly state the file these tests belong to, even if
these should be added to an existing test file.
2. **Request Continuation**: You MUST end your message with the following JSON appended (and nothing more
after it). It lists the pending tests and their respective files (even if they belong to the same or
an existing test file), and will be used for the next follow-up test generation request.
{"status": "more_tests_required",
"pending_tests": "test_name (file_name), another_test_name (file_name)"}
This approach ensures comprehensive test coverage while maintaining quality and avoiding context overflow.
Remember: your value is catching the hard bugs—not inflating coverage numbers.
"""