g3/g3-plan/completed_requirements_2025-12-11_14-55-22.md
Jochen 7b47495881 Document retry config location and verify planning mode logic
Add documentation for retry configuration in planning mode:
- Document retry settings in .g3.toml under [agent] section
- Note RetryConfig implementation in g3-core/src/retry.rs
- Clarify hardcoded vs config-based retry values

Verify existing retry loop and coach feedback parsing:
- Confirm execute_with_retry() handles recoverable errors
- Document feedback extraction source priority order
- Provide manual verification steps for testing
2025-12-11 14:56:27 +11:00

{{CURRENT REQUIREMENTS}}

These requirements specify verification tasks for planning mode's retry logic and coach response parsing, and document where the retry configuration is located.

1. Document Retry Configuration Location

Goal: Clarify where retry settings are configured for planning mode.

Findings to document:

  1. Retry configuration lives in the .g3.toml config file (config.example.toml serves as a template), under the [agent] section:

    [agent]
    max_retry_attempts = 3              # Default mode retries
    autonomous_max_retry_attempts = 6   # Used by planning/autonomous mode
    
  2. The retry infrastructure is implemented in crates/g3-core/src/retry.rs:

    • RetryConfig struct defines retry behavior per role
    • RetryConfig::planning("player") and RetryConfig::planning("coach") create presets
    • Default max retries is 3 (hardcoded in RetryConfig::planning())
  3. Note: Currently RetryConfig::planning() uses a hardcoded max_retries: 3 rather than reading from the config file's autonomous_max_retry_attempts. This may be intentional or a gap to address.
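
The difference between the hardcoded preset and a config-driven one can be sketched as follows. This is a minimal illustration, not the code in retry.rs: the AgentSettings struct, the fields shown on RetryConfig, and the planning_from_settings() constructor are all assumptions introduced here; only the planning() name, the value 3, and the two [agent] keys come from the findings above.

```rust
/// Hypothetical mirror of the two `[agent]` keys quoted above.
#[derive(Debug, Clone, PartialEq)]
pub struct AgentSettings {
    pub max_retry_attempts: u32,
    pub autonomous_max_retry_attempts: u32,
}

/// Simplified stand-in for `RetryConfig`; the real struct may carry more fields.
#[derive(Debug, Clone, PartialEq)]
pub struct RetryConfig {
    pub role: String,
    pub max_retries: u32,
}

impl RetryConfig {
    /// Behavior as described in the findings: the limit is hardcoded to 3.
    pub fn planning(role: &str) -> Self {
        Self { role: role.to_string(), max_retries: 3 }
    }

    /// Hypothetical alternative: take the limit from the config file instead.
    pub fn planning_from_settings(role: &str, settings: &AgentSettings) -> Self {
        Self {
            role: role.to_string(),
            max_retries: settings.autonomous_max_retry_attempts,
        }
    }
}
```

With the example values from the config snippet, planning("player") would yield 3 retries while the hypothetical planning_from_settings("player", ...) would yield 6, which is the discrepancy finding 3 points out.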

Required action:

  • Add examples to config.example.toml for the coach and player retry configs.

2. Verify Retry Loop Functionality

Goal: Confirm that connection retry loops in planning mode work correctly for recoverable errors.

Verification approach:

  1. The retry logic is implemented in g3_core::retry::execute_with_retry() and is already used by both player and coach phases in run_coach_player_loop() (planner.rs lines 633-640 and 682-689).

  2. Error classification happens in g3_core::error_handling::classify_error() which identifies:

    • RecoverableError::RateLimit (429 errors)
    • RecoverableError::NetworkError (connection failures)
    • RecoverableError::ServerError (5xx errors)
    • RecoverableError::Timeout (request timeouts)
    • RecoverableError::ModelBusy (capacity issues)
  3. Manual verification steps (for a human tester):

    • Run planning mode with a temporarily invalid API endpoint to trigger network errors
    • Observe retry messages: "⚠️ player error (attempt X/3): NetworkError - ..."
    • Observe backoff: "🔄 Retrying player in Xs..."
    • After max retries, observe: "🔄 Max retries (3) reached for player"
  4. Existing test coverage:

    • g3-core/src/retry.rs has unit tests for RetryConfig construction
    • g3-core/src/error_handling.rs has tests for classify_error() and delay calculations
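
For orientation while testing, the classify-then-retry flow described above can be sketched in a few lines. This is a simplified, synchronous stand-in and not the actual g3_core implementation: the substring-based classifier, the function signatures, the injected wait callback, and the doubling backoff schedule are all assumptions. Only the RecoverableError variant names and the log message formats are taken from this document.

```rust
use std::time::Duration;

#[derive(Debug, Clone, Copy, PartialEq)]
pub enum RecoverableError {
    RateLimit,
    NetworkError,
    ServerError,
    Timeout,
    ModelBusy,
}

/// Hypothetical classifier keyed on message substrings (the real rules are not shown here).
pub fn classify_error(message: &str) -> Option<RecoverableError> {
    let m = message.to_lowercase();
    let has_any = |needles: &[&str]| needles.iter().any(|n| m.contains(*n));
    if has_any(&["429", "rate limit"]) {
        Some(RecoverableError::RateLimit)
    } else if has_any(&["timeout", "timed out"]) {
        Some(RecoverableError::Timeout)
    } else if has_any(&["connection", "network"]) {
        Some(RecoverableError::NetworkError)
    } else if has_any(&["500", "502", "503", "504"]) {
        Some(RecoverableError::ServerError)
    } else if has_any(&["overloaded", "capacity"]) {
        Some(RecoverableError::ModelBusy)
    } else {
        None // not recoverable: the caller should fail immediately
    }
}

/// Retries `op` up to `max_retries` attempts; `wait` is injected so tests need not sleep.
pub fn execute_with_retry<T>(
    role: &str,
    max_retries: u32,
    mut op: impl FnMut() -> Result<T, String>,
    mut wait: impl FnMut(Duration),
) -> Result<T, String> {
    let mut attempt: u32 = 1;
    loop {
        let err = match op() {
            Ok(value) => return Ok(value),
            Err(e) => e,
        };
        let kind = match classify_error(&err) {
            Some(kind) => kind,
            None => return Err(err),
        };
        println!("⚠️ {role} error (attempt {attempt}/{max_retries}): {kind:?} - {err}");
        if attempt >= max_retries {
            println!("🔄 Max retries ({max_retries}) reached for {role}");
            return Err(err);
        }
        // Assumed schedule: 1s, 2s, 4s, ... capped at 32s.
        let delay = Duration::from_secs(1u64 << (attempt - 1).min(5));
        println!("🔄 Retrying {role} in {}s...", delay.as_secs());
        wait(delay);
        attempt += 1;
    }
}
```

Passing the wait step in as a closure is a design choice of this sketch: a caller can hand in std::thread::sleep for real use, while a test can record the delays instead of sleeping.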

Required action:

  • No code changes are needed if the retry loops already function correctly.
  • If issues are found during manual verification, document specific failure scenarios.

3. Verify Coach Response Parsing

Goal: Confirm that coach feedback extraction works correctly in planning mode.

Current implementation:

  1. Coach feedback extraction uses g3_core::feedback_extraction::extract_coach_feedback() (called at planner.rs ~line 695).

  2. The extraction tries multiple sources in order:

    • FeedbackSource::SessionLog - from session log JSON file
    • FeedbackSource::NativeToolCall - from native tool call JSON in response
    • FeedbackSource::ConversationHistory - from conversation history
    • FeedbackSource::TaskResultResponse - from TaskResult parsing
    • FeedbackSource::DefaultFallback - default message
  3. Planning mode displays the extraction source:

    📝 Coach feedback extracted from SessionLog: 1234 chars
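
The priority order above amounts to a first-match-wins scan over the candidate sources. The sketch below illustrates that idea only. The Candidates struct, the field names, the fallback text, and the summary() and is_approved() helpers are assumptions made for illustration; the FeedbackSource variants, their order, the display format, and the IMPLEMENTATION_APPROVED marker come from this document.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum FeedbackSource {
    SessionLog,
    NativeToolCall,
    ConversationHistory,
    TaskResultResponse,
    DefaultFallback,
}

#[derive(Debug, Clone, PartialEq)]
pub struct ExtractedFeedback {
    pub content: String,
    pub source: FeedbackSource,
}

impl ExtractedFeedback {
    /// Assumed approval check: the marker appears somewhere in the feedback.
    pub fn is_approved(&self) -> bool {
        self.content.contains("IMPLEMENTATION_APPROVED")
    }

    /// Formats the status line that planning mode displays.
    pub fn summary(&self) -> String {
        format!(
            "📝 Coach feedback extracted from {:?}: {} chars",
            self.source,
            self.content.chars().count()
        )
    }
}

/// Hypothetical bundle of already-located candidate texts, one per source.
#[derive(Debug, Default)]
pub struct Candidates {
    pub session_log: Option<String>,
    pub native_tool_call: Option<String>,
    pub conversation_history: Option<String>,
    pub task_result: Option<String>,
}

/// Returns the first non-empty candidate in priority order, else a default message.
pub fn extract_coach_feedback(c: &Candidates) -> ExtractedFeedback {
    let ordered = [
        (FeedbackSource::SessionLog, c.session_log.as_deref()),
        (FeedbackSource::NativeToolCall, c.native_tool_call.as_deref()),
        (FeedbackSource::ConversationHistory, c.conversation_history.as_deref()),
        (FeedbackSource::TaskResultResponse, c.task_result.as_deref()),
    ];
    for (source, text) in ordered {
        if let Some(t) = text.map(str::trim).filter(|t| !t.is_empty()) {
            return ExtractedFeedback { content: t.to_string(), source };
        }
    }
    ExtractedFeedback {
        content: "No coach feedback could be extracted.".to_string(),
        source: FeedbackSource::DefaultFallback,
    }
}
```

Under this model, a run that reports DefaultFallback means every earlier source came back empty, which is the situation the verification steps below are meant to catch.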
    

Verification approach:

  1. Manual verification steps:

    • Run a planning mode session through at least one coach/player cycle
    • Observe the feedback extraction message and confirm it shows a valid source (preferably SessionLog or NativeToolCall, not DefaultFallback)
    • Verify the first 25 lines of feedback are displayed correctly
    • Confirm IMPLEMENTATION_APPROVED detection works when coach approves
  2. Existing test coverage:

    • g3-core/src/feedback_extraction.rs has comprehensive unit tests:
      • test_extract_balanced_json_* - JSON parsing
      • test_try_extract_json_tool_call - tool call extraction
      • test_is_final_output_tool_call_* - detecting final_output calls
      • test_extracted_feedback_is_approved - approval detection

Required action:

  • No code changes are needed if parsing works correctly.
  • If DefaultFallback is observed frequently during manual testing, investigate why earlier extraction methods are failing and document findings.

4. Optional: Add Integration Test for Retry + Feedback Flow

Goal: Create a lightweight integration test that verifies the retry and feedback extraction machinery works together.

Scope: Only implement if time permits and manual verification reveals issues.

Approach:

  1. Create a test in crates/g3-planner/tests/ that:

    • Mocks an LLM provider that returns a final_output tool call
    • Verifies extract_coach_feedback() successfully extracts the feedback
    • Optionally simulates a recoverable error to test retry logic
  2. This test should NOT require actual API calls or network access.
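
One possible shape for such a test is sketched below, using a mock provider that fails a configurable number of times before returning a final_output tool call. Everything here is a stand-in: the LlmProvider trait, ToolCall struct, MockProvider, is_recoverable(), and run_coach_turn() are hypothetical names, and a real test would call execute_with_retry() and extract_coach_feedback() from g3-core rather than these simplified versions.

```rust
/// Hypothetical, already-parsed tool call returned by a provider.
pub struct ToolCall {
    pub name: String,
    pub summary: String,
}

/// Hypothetical provider interface; no network access is involved.
pub trait LlmProvider {
    fn complete(&mut self, prompt: &str) -> Result<ToolCall, String>;
}

/// Mock that fails `failures_before_success` times, then approves.
pub struct MockProvider {
    pub failures_before_success: u32,
    pub calls: u32,
}

impl LlmProvider for MockProvider {
    fn complete(&mut self, _prompt: &str) -> Result<ToolCall, String> {
        self.calls += 1;
        if self.calls <= self.failures_before_success {
            return Err("503 service unavailable".to_string());
        }
        Ok(ToolCall {
            name: "final_output".to_string(),
            summary: "IMPLEMENTATION_APPROVED".to_string(),
        })
    }
}

/// Simplified stand-in for error classification.
fn is_recoverable(err: &str) -> bool {
    err.starts_with("503") || err.contains("timeout")
}

/// Runs one coach turn with retries and returns the extracted feedback text.
pub fn run_coach_turn(provider: &mut dyn LlmProvider, max_retries: u32) -> Result<String, String> {
    let mut last_err = String::new();
    for _attempt in 1..=max_retries {
        match provider.complete("review the implementation") {
            Ok(call) if call.name == "final_output" => return Ok(call.summary),
            Ok(call) => return Err(format!("unexpected tool call: {}", call.name)),
            Err(e) if is_recoverable(&e) => last_err = e, // retry on the next iteration
            Err(e) => return Err(e),
        }
    }
    Err(last_err)
}
```

A test built on this would assert that one simulated failure still yields the approval text after two provider calls, and that a provider which never recovers stops after exactly max_retries calls.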