Document retry config location and verify planning mode logic

Add documentation for retry configuration in planning mode: - Document retry settings in .g3.toml under [agent] section - Note RetryConfig implementation in g3-core/src/retry.rs - Clarify hardcoded vs config-based retry values Verify existing retry loop and coach feedback parsing: - Confirm execute_with_retry() handles recoverable errors - Document feedback extraction source priority order - Provide manual verification steps for testing
2025-12-11 14:56:27 +11:00
parent 1a13fc5345
commit 7b47495881
9 changed files with 1375 additions and 25 deletions
--- a/g3-plan/completed_requirements_2025-12-11_14-55-22.md
+++ b/g3-plan/completed_requirements_2025-12-11_14-55-22.md
@@ -0,0 +1,116 @@
+{{CURRENT REQUIREMENTS}}
+
+These requirements specify verification tasks for the planning mode's retry logic and coach
+response parsing, along with documentation of where configuration is located.
+
+## 1. Document Retry Configuration Location
+
+**Goal**: Clarify where retry settings are configured for planning mode.
+
+**Findings to document**:
+1. Retry configuration is in the `.g3.toml` config file (or `config.example.toml` as template)
+   under the `[agent]` section:
+   ```toml
+   [agent]
+   max_retry_attempts = 3              # Default mode retries
+   autonomous_max_retry_attempts = 6   # Used by planning/autonomous mode
+   ```
+
+2. The retry infrastructure is implemented in `crates/g3-core/src/retry.rs`:
+   - `RetryConfig` struct defines retry behavior per role
+   - `RetryConfig::planning("player")` and `RetryConfig::planning("coach")` create presets
+   - Default max retries is 3 (hardcoded in `RetryConfig::planning()`)
+
+3. **Note**: Currently `RetryConfig::planning()` uses a hardcoded `max_retries: 3` rather than
+   reading from the config file's `autonomous_max_retry_attempts`. This may be intentional or
+   a gap to address.
+
+**Required action**:
+
+- add examples to config.example.toml for the coach and player retry configs.
+
+## 2. Verify Retry Loop Functionality
+
+**Goal**: Confirm that connection retry loops in planning mode work correctly for recoverable
+errors.
+
+**Verification approach**:
+1. The retry logic is implemented in `g3_core::retry::execute_with_retry()` and is already
+   used by both player and coach phases in `run_coach_player_loop()` (planner.rs lines 633-640
+   and 682-689).
+
+2. Error classification happens in `g3_core::error_handling::classify_error()` which identifies:
+   - `RecoverableError::RateLimit` (429 errors)
+   - `RecoverableError::NetworkError` (connection failures)
+   - `RecoverableError::ServerError` (5xx errors)
+   - `RecoverableError::Timeout` (request timeouts)
+   - `RecoverableError::ModelBusy` (capacity issues)
+
+3. **Manual verification steps** (for a human tester):
+   - Run planning mode with a temporarily invalid API endpoint to trigger network errors
+   - Observe retry messages: `"⚠️ player error (attempt X/3): NetworkError - ..."`
+   - Observe backoff: `"🔄 Retrying player in Xs..."`
+   - After max retries, observe: `"🔄 Max retries (3) reached for player"`
+
+4. **Existing test coverage**:
+   - `g3-core/src/retry.rs` has unit tests for `RetryConfig` construction
+   - `g3-core/src/error_handling.rs` has tests for `classify_error()` and delay calculations
+
+**Required action**:
+- No code changes needed if retry loops are already functioning.
+- If issues are found during manual verification, document specific failure scenarios.
+
+## 3. Verify Coach Response Parsing
+
+**Goal**: Confirm that coach feedback extraction works correctly in planning mode.
+
+**Current implementation**:
+1. Coach feedback extraction uses `g3_core::feedback_extraction::extract_coach_feedback()`
+   (called at planner.rs ~line 695).
+
+2. The extraction tries multiple sources in order:
+   - `FeedbackSource::SessionLog` - from session log JSON file
+   - `FeedbackSource::NativeToolCall` - from native tool call JSON in response
+   - `FeedbackSource::ConversationHistory` - from conversation history
+   - `FeedbackSource::TaskResultResponse` - from TaskResult parsing
+   - `FeedbackSource::DefaultFallback` - default message
+
+3. Planning mode displays the extraction source:
+   ```
+   📝 Coach feedback extracted from SessionLog: 1234 chars
+   ```
+
+**Verification approach**:
+1. **Manual verification steps**:
+   - Run a planning mode session through at least one coach/player cycle
+   - Observe the feedback extraction message and confirm it shows a valid source
+     (preferably `SessionLog` or `NativeToolCall`, not `DefaultFallback`)
+   - Verify the first 25 lines of feedback are displayed correctly
+   - Confirm `IMPLEMENTATION_APPROVED` detection works when coach approves
+
+2. **Existing test coverage**:
+   - `g3-core/src/feedback_extraction.rs` has comprehensive unit tests:
+     - `test_extract_balanced_json_*` - JSON parsing
+     - `test_try_extract_json_tool_call` - tool call extraction
+     - `test_is_final_output_tool_call_*` - detecting final_output calls
+     - `test_extracted_feedback_is_approved` - approval detection
+
+**Required action**:
+- No code changes needed if parsing is working correctly.
+- If `DefaultFallback` is observed frequently during manual testing, investigate why
+  earlier extraction methods are failing and document findings.
+
+## 4. Optional: Add Integration Test for Retry + Feedback Flow
+
+**Goal**: Create a lightweight integration test that verifies the retry and feedback
+extraction machinery works together.
+
+**Scope**: Only implement if time permits and manual verification reveals issues.
+
+**Approach**:
+1. Create a test in `crates/g3-planner/tests/` that:
+   - Mocks an LLM provider that returns a `final_output` tool call
+   - Verifies `extract_coach_feedback()` successfully extracts the feedback
+   - Optionally simulates a recoverable error to test retry logic
+
+2. This test should NOT require actual API calls or network access.
--- a/g3-plan/completed_todo_2025-12-11_14-55-22.md
+++ b/g3-plan/completed_todo_2025-12-11_14-55-22.md
@@ -0,0 +1,16 @@
+# Planning Mode Verification Tasks
+
+## 1. Document Retry Configuration Location
+- [x] Add coach and player retry config examples to config.example.toml
+- [x] Document the relationship between config file settings and RetryConfig::planning()
+
+## 2. Verify Retry Loop Functionality
+- [x] Review retry logic implementation (already done - looks correct)
+- [x] Document verification findings
+
+## 3. Verify Coach Response Parsing
+- [x] Review feedback extraction implementation (already done - looks correct)
+- [x] Document verification findings
+
+## 4. Optional: Add Integration Test
+- [x] Create integration test for retry + feedback extraction flow in g3-planner/tests/
--- a/g3-plan/planner_history.txt
+++ b/g3-plan/planner_history.txt
@@ -105,3 +105,15 @@
 2025-12-11 10:05:02  USER SKIPPED RECOVERY
 2025-12-11 10:05:08 - COMPLETED REQUIREMENTS (completed_requirements_2025-12-11_10-05-08.md,  completed_todo_2025-12-11_10-05-08.md)
 2025-12-11 10:05:39 - GIT COMMIT (Add explicit flush to append_entry and strengthen commit ordering docs)
+2025-12-11 14:28:56 - REFINING REQUIREMENTS (new_requirements.md)
+2025-12-11 14:32:53 - GIT HEAD (1a13fc5345dec72b7b97dcb6a397ac0b06cba3a2)
+2025-12-11 14:32:58 - START IMPLEMENTING (current_requirements.md)
+<<
+  Verify planning mode retry logic and coach response parsing. Document retry config location in .g3.toml under
+  [agent] section (max_retry_attempts, autonomous_max_retry_attempts). Note RetryConfig in retry.rs uses hardcoded max 3.
+  Add retry config examples to config.example.toml. Manual verification: test network errors trigger retries with backoff.
+  Coach feedback extraction uses multiple sources (SessionLog, NativeToolCall, etc) - verify non-fallback extraction.
+  Optional: add integration test for retry + feedback flow if issues found during manual testing.
+>>
+2025-12-11 14:55:22 - COMPLETED REQUIREMENTS (completed_requirements_2025-12-11_14-55-22.md,  completed_todo_2025-12-11_14-55-22.md)
+2025-12-11 14:56:27 - GIT COMMIT (Document retry config location and verify planning mode logic)