g3/g3-plan/completed_requirements_2025-12-11_14-55-22.md

{{CURRENT REQUIREMENTS}}

These requirements specify verification tasks for the planning mode's retry logic and coach
response parsing, along with documentation of where configuration is located.

## 1. Document Retry Configuration Location

**Goal**: Clarify where retry settings are configured for planning mode.

**Findings to document**:
1. Retry configuration is in the `.g3.toml` config file (or `config.example.toml` as template)
   under the `[agent]` section:
   ```toml
   [agent]
   max_retry_attempts = 3              # Default mode retries
   autonomous_max_retry_attempts = 6   # Used by planning/autonomous mode
   ```

2. The retry infrastructure is implemented in `crates/g3-core/src/retry.rs`:
   - `RetryConfig` struct defines retry behavior per role
   - `RetryConfig::planning("player")` and `RetryConfig::planning("coach")` create presets
   - Default max retries is 3 (hardcoded in `RetryConfig::planning()`)

3. **Note**: Currently `RetryConfig::planning()` uses a hardcoded `max_retries: 3` rather than
   reading from the config file's `autonomous_max_retry_attempts`. This may be intentional or
   a gap to address.

**Required action**:

- add examples to config.example.toml for the coach and player retry configs.

## 2. Verify Retry Loop Functionality

**Goal**: Confirm that connection retry loops in planning mode work correctly for recoverable
errors.

**Verification approach**:
1. The retry logic is implemented in `g3_core::retry::execute_with_retry()` and is already
   used by both player and coach phases in `run_coach_player_loop()` (planner.rs lines 633-640
   and 682-689).

2. Error classification happens in `g3_core::error_handling::classify_error()` which identifies:
   - `RecoverableError::RateLimit` (429 errors)
   - `RecoverableError::NetworkError` (connection failures)
   - `RecoverableError::ServerError` (5xx errors)
   - `RecoverableError::Timeout` (request timeouts)
   - `RecoverableError::ModelBusy` (capacity issues)

3. **Manual verification steps** (for a human tester):
   - Run planning mode with a temporarily invalid API endpoint to trigger network errors
   - Observe retry messages: `"⚠️ player error (attempt X/3): NetworkError - ..."`
   - Observe backoff: `"🔄 Retrying player in Xs..."`
   - After max retries, observe: `"🔄 Max retries (3) reached for player"`

4. **Existing test coverage**:
   - `g3-core/src/retry.rs` has unit tests for `RetryConfig` construction
   - `g3-core/src/error_handling.rs` has tests for `classify_error()` and delay calculations

**Required action**:
- No code changes needed if retry loops are already functioning.
- If issues are found during manual verification, document specific failure scenarios.

## 3. Verify Coach Response Parsing

**Goal**: Confirm that coach feedback extraction works correctly in planning mode.

**Current implementation**:
1. Coach feedback extraction uses `g3_core::feedback_extraction::extract_coach_feedback()`
   (called at planner.rs ~line 695).

2. The extraction tries multiple sources in order:
   - `FeedbackSource::SessionLog` - from session log JSON file
   - `FeedbackSource::NativeToolCall` - from native tool call JSON in response
   - `FeedbackSource::ConversationHistory` - from conversation history
   - `FeedbackSource::TaskResultResponse` - from TaskResult parsing
   - `FeedbackSource::DefaultFallback` - default message

3. Planning mode displays the extraction source:
   ```
   📝 Coach feedback extracted from SessionLog: 1234 chars
   ```

**Verification approach**:
1. **Manual verification steps**:
   - Run a planning mode session through at least one coach/player cycle
   - Observe the feedback extraction message and confirm it shows a valid source
     (preferably `SessionLog` or `NativeToolCall`, not `DefaultFallback`)
   - Verify the first 25 lines of feedback are displayed correctly
   - Confirm `IMPLEMENTATION_APPROVED` detection works when coach approves

2. **Existing test coverage**:
   - `g3-core/src/feedback_extraction.rs` has comprehensive unit tests:
     - `test_extract_balanced_json_*` - JSON parsing
     - `test_try_extract_json_tool_call` - tool call extraction
     - `test_is_final_output_tool_call_*` - detecting final_output calls
     - `test_extracted_feedback_is_approved` - approval detection

**Required action**:
- No code changes needed if parsing is working correctly.
- If `DefaultFallback` is observed frequently during manual testing, investigate why
  earlier extraction methods are failing and document findings.

## 4. Optional: Add Integration Test for Retry + Feedback Flow

**Goal**: Create a lightweight integration test that verifies the retry and feedback
extraction machinery works together.

**Scope**: Only implement if time permits and manual verification reveals issues.

**Approach**:
1. Create a test in `crates/g3-planner/tests/` that:
   - Mocks an LLM provider that returns a `final_output` tool call
   - Verifies `extract_coach_feedback()` successfully extracts the feedback
   - Optionally simulates a recoverable error to test retry logic

2. This test should NOT require actual API calls or network access.