Document retry config location and verify planning mode logic

Add documentation for retry configuration in planning mode:
- Document retry settings in .g3.toml under [agent] section
- Note RetryConfig implementation in g3-core/src/retry.rs
- Clarify hardcoded vs config-based retry values

Verify existing retry loop and coach feedback parsing:
- Confirm execute_with_retry() handles recoverable errors
- Document feedback extraction source priority order
- Provide manual verification steps for testing
Jochen
2025-12-11 14:56:27 +11:00
parent 1a13fc5345
commit 7b47495881
9 changed files with 1375 additions and 25 deletions

@@ -0,0 +1,116 @@
{{CURRENT REQUIREMENTS}}
These requirements specify verification tasks for the planning mode's retry logic and coach
response parsing, along with documentation of where configuration is located.
## 1. Document Retry Configuration Location
**Goal**: Clarify where retry settings are configured for planning mode.
**Findings to document**:
1. Retry configuration lives in the `.g3.toml` config file (`config.example.toml` serves as a
template), under the `[agent]` section:
```toml
[agent]
max_retry_attempts = 3 # Default mode retries
autonomous_max_retry_attempts = 6 # Used by planning/autonomous mode
```
2. The retry infrastructure is implemented in `crates/g3-core/src/retry.rs`:
- `RetryConfig` struct defines retry behavior per role
- `RetryConfig::planning("player")` and `RetryConfig::planning("coach")` create presets
- Default max retries is 3 (hardcoded in `RetryConfig::planning()`)
3. **Note**: `RetryConfig::planning()` currently hardcodes `max_retries: 3` instead of reading
`autonomous_max_retry_attempts` from the config file. This may be intentional, or it may be
a gap to address.
**Required action**:
- Add examples for the coach and player retry configs to `config.example.toml`.
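If the hardcoded value turns out to be a gap, one possible shape for a config-driven preset is sketched below. This is only an illustration under stated assumptions: `AgentRetrySettings`, `PlanningRetry`, and `planning_from` are hypothetical names invented here, not the real `RetryConfig` API in `retry.rs`.

```rust
/// Hypothetical mirror of the `[agent]` retry keys in `.g3.toml`.
#[derive(Debug, Clone, PartialEq)]
pub struct AgentRetrySettings {
    pub max_retry_attempts: u32,
    pub autonomous_max_retry_attempts: u32,
}

/// Hypothetical stand-in for a planning retry preset (not the real `RetryConfig`).
#[derive(Debug, Clone, PartialEq)]
pub struct PlanningRetry {
    pub role: String,
    pub max_retries: u32,
}

/// The value the findings above describe as hardcoded.
pub const HARDCODED_PLANNING_MAX_RETRIES: u32 = 3;

impl PlanningRetry {
    /// Behaviour as described in the findings: the config file is ignored.
    pub fn planning(role: &str) -> Self {
        Self {
            role: role.to_string(),
            max_retries: HARDCODED_PLANNING_MAX_RETRIES,
        }
    }

    /// Assumed alternative: prefer `autonomous_max_retry_attempts` when
    /// settings are available, otherwise fall back to the hardcoded default.
    pub fn planning_from(role: &str, settings: Option<&AgentRetrySettings>) -> Self {
        let max_retries = settings
            .map(|s| s.autonomous_max_retry_attempts)
            .unwrap_or(HARDCODED_PLANNING_MAX_RETRIES);
        Self {
            role: role.to_string(),
            max_retries,
        }
    }
}
```

With the example values shown in the `[agent]` section above, `planning_from("coach", Some(&settings))` would yield 6 retries, while `planning("coach")` stays at 3.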
## 2. Verify Retry Loop Functionality
**Goal**: Confirm that connection retry loops in planning mode work correctly for recoverable
errors.
**Verification approach**:
1. The retry logic is implemented in `g3_core::retry::execute_with_retry()` and is already
used by both player and coach phases in `run_coach_player_loop()` (planner.rs lines 633-640
and 682-689).
2. Error classification happens in `g3_core::error_handling::classify_error()` which identifies:
- `RecoverableError::RateLimit` (429 errors)
- `RecoverableError::NetworkError` (connection failures)
- `RecoverableError::ServerError` (5xx errors)
- `RecoverableError::Timeout` (request timeouts)
- `RecoverableError::ModelBusy` (capacity issues)
3. **Manual verification steps** (for a human tester):
- Run planning mode with a temporarily invalid API endpoint to trigger network errors
- Observe retry messages: `"⚠️ player error (attempt X/3): NetworkError - ..."`
- Observe backoff: `"🔄 Retrying player in Xs..."`
- After max retries, observe: `"🔄 Max retries (3) reached for player"`
4. **Existing test coverage**:
- `crates/g3-core/src/retry.rs` has unit tests for `RetryConfig` construction
- `crates/g3-core/src/error_handling.rs` has tests for `classify_error()` and delay calculations
**Required action**:
- No code changes needed if retry loops are already functioning.
- If issues are found during manual verification, document specific failure scenarios.
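For readers who want a mental model of what the manual steps exercise, the following is a minimal synchronous sketch of a classify-then-retry loop with exponential backoff. It is not the actual `execute_with_retry()` code: the `classify` heuristics, the 1s base and 30s cap, and the reading of "max retries" as total attempts are all assumptions made for illustration. Only the variant names and message formats are taken from this document.

```rust
use std::time::Duration;

/// Variant names follow the list above; everything else here is assumed.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Recoverable {
    RateLimit,
    NetworkError,
    ServerError,
    Timeout,
    ModelBusy,
}

/// Hypothetical classifier keyed on an optional HTTP status and the message text.
pub fn classify(status: Option<u16>, message: &str) -> Option<Recoverable> {
    let msg = message.to_lowercase();
    match status {
        Some(429) => Some(Recoverable::RateLimit),
        Some(code) if (500..600).contains(&code) => Some(Recoverable::ServerError),
        _ if msg.contains("timed out") || msg.contains("timeout") => Some(Recoverable::Timeout),
        _ if msg.contains("overloaded") || msg.contains("capacity") => Some(Recoverable::ModelBusy),
        _ if msg.contains("connection") => Some(Recoverable::NetworkError),
        _ => None, // anything else is treated as non-recoverable
    }
}

/// Exponential backoff: base * 2^(attempt - 1), capped at `cap`.
pub fn backoff_delay(attempt: u32, base: Duration, cap: Duration) -> Duration {
    let factor = 2u32.saturating_pow(attempt.saturating_sub(1));
    base.saturating_mul(factor).min(cap)
}

/// Runs `op` up to `max_retries` times; `sleep` is injected so tests need not wait.
pub fn retry_with<T>(
    role: &str,
    max_retries: u32,
    mut op: impl FnMut(u32) -> Result<T, (Option<u16>, String)>,
    mut sleep: impl FnMut(Duration),
) -> Result<T, String> {
    for attempt in 1..=max_retries {
        match op(attempt) {
            Ok(value) => return Ok(value),
            Err((status, msg)) => match classify(status, &msg) {
                None => return Err(msg), // fail fast on non-recoverable errors
                Some(kind) => {
                    eprintln!("⚠️ {role} error (attempt {attempt}/{max_retries}): {kind:?} - {msg}");
                    if attempt < max_retries {
                        let delay =
                            backoff_delay(attempt, Duration::from_secs(1), Duration::from_secs(30));
                        eprintln!("🔄 Retrying {role} in {}s...", delay.as_secs());
                        sleep(delay);
                    }
                }
            },
        }
    }
    Err(format!("Max retries ({max_retries}) reached for {role}"))
}
```

Injecting the `sleep` callback keeps the loop testable without real delays; in production code it could simply wrap `std::thread::sleep` or an async timer.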
## 3. Verify Coach Response Parsing
**Goal**: Confirm that coach feedback extraction works correctly in planning mode.
**Current implementation**:
1. Coach feedback extraction uses `g3_core::feedback_extraction::extract_coach_feedback()`
(called at planner.rs ~line 695).
2. The extraction tries multiple sources in order:
- `FeedbackSource::SessionLog` - from session log JSON file
- `FeedbackSource::NativeToolCall` - from native tool call JSON in response
- `FeedbackSource::ConversationHistory` - from conversation history
- `FeedbackSource::TaskResultResponse` - from TaskResult parsing
- `FeedbackSource::DefaultFallback` - default message
3. Planning mode displays the extraction source:
```
📝 Coach feedback extracted from SessionLog: 1234 chars
```
**Verification approach**:
1. **Manual verification steps**:
- Run a planning mode session through at least one coach/player cycle
- Observe the feedback extraction message and confirm it shows a valid source
(preferably `SessionLog` or `NativeToolCall`, not `DefaultFallback`)
- Verify the first 25 lines of feedback are displayed correctly
- Confirm `IMPLEMENTATION_APPROVED` detection works when coach approves
2. **Existing test coverage**:
- `crates/g3-core/src/feedback_extraction.rs` has comprehensive unit tests:
- `test_extract_balanced_json_*` - JSON parsing
- `test_try_extract_json_tool_call` - tool call extraction
- `test_is_final_output_tool_call_*` - detecting final_output calls
- `test_extracted_feedback_is_approved` - approval detection
**Required action**:
- No code changes needed if parsing is working correctly.
- If `DefaultFallback` is observed frequently during manual testing, investigate why
earlier extraction methods are failing and document findings.
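The source priority order described above can be pictured as a first-match fallback chain. The sketch below is an assumption-laden illustration, not the real `extract_coach_feedback()`: the `Candidates` struct, the "non-empty after trimming" acceptance rule, the default message, and the substring-based approval check are all hypothetical.

```rust
/// Variant names follow the source list above.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Source {
    SessionLog,
    NativeToolCall,
    ConversationHistory,
    TaskResultResponse,
    DefaultFallback,
}

#[derive(Debug, PartialEq)]
pub struct Extracted {
    pub content: String,
    pub source: Source,
}

impl Extracted {
    /// Assumed approval rule: the marker appears anywhere in the feedback.
    pub fn is_approved(&self) -> bool {
        self.content.contains("IMPLEMENTATION_APPROVED")
    }
}

/// Hypothetical bundle of already-located candidate texts, one per source.
pub struct Candidates<'a> {
    pub session_log: Option<&'a str>,
    pub native_tool_call: Option<&'a str>,
    pub conversation_history: Option<&'a str>,
    pub task_result: Option<&'a str>,
}

/// Placeholder text; the real default message is not given in this document.
pub const DEFAULT_FEEDBACK: &str = "No feedback could be extracted.";

/// Returns the first non-empty candidate in priority order, else the fallback.
pub fn extract(candidates: &Candidates) -> Extracted {
    let ordered = [
        (Source::SessionLog, candidates.session_log),
        (Source::NativeToolCall, candidates.native_tool_call),
        (Source::ConversationHistory, candidates.conversation_history),
        (Source::TaskResultResponse, candidates.task_result),
    ];
    for (source, text) in ordered {
        if let Some(t) = text.map(str::trim).filter(|t| !t.is_empty()) {
            return Extracted { content: t.to_string(), source };
        }
    }
    Extracted {
        content: DEFAULT_FEEDBACK.to_string(),
        source: Source::DefaultFallback,
    }
}
```

Under this model, frequent `DefaultFallback` results would mean every earlier candidate was missing or empty, which is the situation the required action asks to investigate.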
## 4. Optional: Add Integration Test for Retry + Feedback Flow
**Goal**: Create a lightweight integration test that verifies the retry and feedback
extraction machinery works together.
**Scope**: Only implement if time permits and manual verification reveals issues.
**Approach**:
1. Create a test in `crates/g3-planner/tests/` that:
- Mocks an LLM provider that returns a `final_output` tool call
- Verifies `extract_coach_feedback()` successfully extracts the feedback
- Optionally simulates a recoverable error to test retry logic
2. This test should NOT require actual API calls or network access.
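A rough sketch of the building blocks such a test might use is shown below. Everything in it is hypothetical: the `Provider` trait, `ScriptedProvider`, `call_with_retry`, and the JSON shape of the `final_output` tool call are invented for illustration, and the real g3 provider interface is likely async and richer. The point is only that a scripted mock can simulate one recoverable failure followed by a response, with no network access.

```rust
use std::collections::VecDeque;

/// Hypothetical provider abstraction used only for this sketch.
pub trait Provider {
    fn complete(&mut self, prompt: &str) -> Result<String, String>;
}

/// Mock that replays scripted results in order, so no API calls are made.
pub struct ScriptedProvider {
    pub script: VecDeque<Result<String, String>>,
    pub calls: usize,
}

impl Provider for ScriptedProvider {
    fn complete(&mut self, _prompt: &str) -> Result<String, String> {
        self.calls += 1;
        self.script
            .pop_front()
            .unwrap_or_else(|| Err("script exhausted".to_string()))
    }
}

/// Simplified retry: treats every error as recoverable and skips backoff.
pub fn call_with_retry(
    provider: &mut dyn Provider,
    prompt: &str,
    max_attempts: u32,
) -> Result<String, String> {
    let mut last_error = String::from("no attempts made");
    for _ in 0..max_attempts {
        match provider.complete(prompt) {
            Ok(text) => return Ok(text),
            Err(e) => last_error = e,
        }
    }
    Err(last_error)
}

/// Returns the first balanced `{...}` span, ignoring braces inside JSON strings.
pub fn first_balanced_json(text: &str) -> Option<&str> {
    let start = text.find('{')?;
    let (mut depth, mut in_string, mut escaped) = (0usize, false, false);
    for (offset, ch) in text[start..].char_indices() {
        if in_string {
            if escaped {
                escaped = false; // the escaped character is consumed as-is
            } else if ch == '\\' {
                escaped = true;
            } else if ch == '"' {
                in_string = false;
            }
            continue;
        }
        match ch {
            '"' => in_string = true,
            '{' => depth += 1,
            '}' => {
                depth -= 1;
                if depth == 0 {
                    // '}' is one byte, so the span ends at offset + 1
                    return Some(&text[start..start + offset + 1]);
                }
            }
            _ => {}
        }
    }
    None // unbalanced or no JSON object present
}
```

A test built on these pieces would script one error followed by a response containing the tool call, then assert on the call count and on the extracted JSON span.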

@@ -0,0 +1,16 @@
# Planning Mode Verification Tasks
## 1. Document Retry Configuration Location
- [x] Add coach and player retry config examples to `config.example.toml`
- [x] Document the relationship between config file settings and `RetryConfig::planning()`
## 2. Verify Retry Loop Functionality
- [x] Review retry logic implementation (already done - looks correct)
- [x] Document verification findings
## 3. Verify Coach Response Parsing
- [x] Review feedback extraction implementation (already done - looks correct)
- [x] Document verification findings
## 4. Optional: Add Integration Test
- [x] Create integration test for retry + feedback extraction flow in `g3-planner/tests/`

@@ -105,3 +105,15 @@
2025-12-11 10:05:02 USER SKIPPED RECOVERY
2025-12-11 10:05:08 - COMPLETED REQUIREMENTS (completed_requirements_2025-12-11_10-05-08.md, completed_todo_2025-12-11_10-05-08.md)
2025-12-11 10:05:39 - GIT COMMIT (Add explicit flush to append_entry and strengthen commit ordering docs)
2025-12-11 14:28:56 - REFINING REQUIREMENTS (new_requirements.md)
2025-12-11 14:32:53 - GIT HEAD (1a13fc5345dec72b7b97dcb6a397ac0b06cba3a2)
2025-12-11 14:32:58 - START IMPLEMENTING (current_requirements.md)
<<
Verify planning mode retry logic and coach response parsing. Document retry config location in .g3.toml under
[agent] section (max_retry_attempts, autonomous_max_retry_attempts). Note RetryConfig in retry.rs uses hardcoded max 3.
Add retry config examples to config.example.toml. Manual verification: test network errors trigger retries with backoff.
Coach feedback extraction uses multiple sources (SessionLog, NativeToolCall, etc) - verify non-fallback extraction.
Optional: add integration test for retry + feedback flow if issues found during manual testing.
>>
2025-12-11 14:55:22 - COMPLETED REQUIREMENTS (completed_requirements_2025-12-11_14-55-22.md, completed_todo_2025-12-11_14-55-22.md)
2025-12-11 14:56:27 - GIT COMMIT (Document retry config location and verify planning mode logic)