Document retry config location and verify planning mode logic

Add documentation for retry configuration in planning mode:
- Document retry settings in .g3.toml under [agent] section
- Note RetryConfig implementation in g3-core/src/retry.rs
- Clarify hardcoded vs config-based retry values

Verify existing retry loop and coach feedback parsing:
- Confirm execute_with_retry() handles recoverable errors
- Document feedback extraction source priority order
- Provide manual verification steps for testing
This commit is contained in:
Jochen
2025-12-11 14:56:27 +11:00
parent 1a13fc5345
commit 7b47495881
9 changed files with 1375 additions and 25 deletions

View File

@@ -0,0 +1,116 @@
{{CURRENT REQUIREMENTS}}
These requirements specify verification tasks for the planning mode's retry logic and coach
response parsing, along with documentation of where configuration is located.
## 1. Document Retry Configuration Location
**Goal**: Clarify where retry settings are configured for planning mode.
**Findings to document**:
1. Retry configuration is in the `.g3.toml` config file (or `config.example.toml` as template)
under the `[agent]` section:
```toml
[agent]
max_retry_attempts = 3 # Default mode retries
autonomous_max_retry_attempts = 6 # Used by planning/autonomous mode
```
2. The retry infrastructure is implemented in `crates/g3-core/src/retry.rs`:
- `RetryConfig` struct defines retry behavior per role
- `RetryConfig::planning("player")` and `RetryConfig::planning("coach")` create presets
- Default max retries is 3 (hardcoded in `RetryConfig::planning()`)
3. **Note**: Currently `RetryConfig::planning()` uses a hardcoded `max_retries: 3` rather than
reading from the config file's `autonomous_max_retry_attempts`. This may be intentional or
a gap to address.
**Required action**:
- add examples to config.example.toml for the coach and player retry configs.
## 2. Verify Retry Loop Functionality
**Goal**: Confirm that connection retry loops in planning mode work correctly for recoverable
errors.
**Verification approach**:
1. The retry logic is implemented in `g3_core::retry::execute_with_retry()` and is already
used by both player and coach phases in `run_coach_player_loop()` (planner.rs lines 633-640
and 682-689).
2. Error classification happens in `g3_core::error_handling::classify_error()` which identifies:
- `RecoverableError::RateLimit` (429 errors)
- `RecoverableError::NetworkError` (connection failures)
- `RecoverableError::ServerError` (5xx errors)
- `RecoverableError::Timeout` (request timeouts)
- `RecoverableError::ModelBusy` (capacity issues)
3. **Manual verification steps** (for a human tester):
- Run planning mode with a temporarily invalid API endpoint to trigger network errors
- Observe retry messages: `"⚠️ player error (attempt X/3): NetworkError - ..."`
- Observe backoff: `"🔄 Retrying player in Xs..."`
- After max retries, observe: `"🔄 Max retries (3) reached for player"`
4. **Existing test coverage**:
- `g3-core/src/retry.rs` has unit tests for `RetryConfig` construction
- `g3-core/src/error_handling.rs` has tests for `classify_error()` and delay calculations
**Required action**:
- No code changes needed if retry loops are already functioning.
- If issues are found during manual verification, document specific failure scenarios.
## 3. Verify Coach Response Parsing
**Goal**: Confirm that coach feedback extraction works correctly in planning mode.
**Current implementation**:
1. Coach feedback extraction uses `g3_core::feedback_extraction::extract_coach_feedback()`
(called at planner.rs ~line 695).
2. The extraction tries multiple sources in order:
- `FeedbackSource::SessionLog` - from session log JSON file
- `FeedbackSource::NativeToolCall` - from native tool call JSON in response
- `FeedbackSource::ConversationHistory` - from conversation history
- `FeedbackSource::TaskResultResponse` - from TaskResult parsing
- `FeedbackSource::DefaultFallback` - default message
3. Planning mode displays the extraction source:
```
📝 Coach feedback extracted from SessionLog: 1234 chars
```
**Verification approach**:
1. **Manual verification steps**:
- Run a planning mode session through at least one coach/player cycle
- Observe the feedback extraction message and confirm it shows a valid source
(preferably `SessionLog` or `NativeToolCall`, not `DefaultFallback`)
- Verify the first 25 lines of feedback are displayed correctly
- Confirm `IMPLEMENTATION_APPROVED` detection works when coach approves
2. **Existing test coverage**:
- `g3-core/src/feedback_extraction.rs` has comprehensive unit tests:
- `test_extract_balanced_json_*` - JSON parsing
- `test_try_extract_json_tool_call` - tool call extraction
- `test_is_final_output_tool_call_*` - detecting final_output calls
- `test_extracted_feedback_is_approved` - approval detection
**Required action**:
- No code changes needed if parsing is working correctly.
- If `DefaultFallback` is observed frequently during manual testing, investigate why
earlier extraction methods are failing and document findings.
## 4. Optional: Add Integration Test for Retry + Feedback Flow
**Goal**: Create a lightweight integration test that verifies the retry and feedback
extraction machinery works together.
**Scope**: Only implement if time permits and manual verification reveals issues.
**Approach**:
1. Create a test in `crates/g3-planner/tests/` that:
- Mocks an LLM provider that returns a `final_output` tool call
- Verifies `extract_coach_feedback()` successfully extracts the feedback
- Optionally simulates a recoverable error to test retry logic
2. This test should NOT require actual API calls or network access.