g3/g3-plan/completed_requirements_2025-12-11_14-55-22.md at 1b26de6cd2170be8495593efa888a1776311f452

alex/g3

Files

Jochen 7b47495881 Document retry config location and verify planning mode logic

Add documentation for retry configuration in planning mode:
- Document retry settings in .g3.toml under [agent] section
- Note RetryConfig implementation in g3-core/src/retry.rs
- Clarify hardcoded vs config-based retry values

Verify existing retry loop and coach feedback parsing:
- Confirm execute_with_retry() handles recoverable errors
- Document feedback extraction source priority order
- Provide manual verification steps for testing

2025-12-11 14:56:27 +11:00

5.0 KiB

Raw Blame History

These requirements specify verification tasks for the planning mode's retry logic and coach response parsing, along with documentation of where configuration is located.

1. Document Retry Configuration Location

Goal: Clarify where retry settings are configured for planning mode.

Findings to document:

Retry configuration is in the .g3.toml config file (or config.example.toml as template) under the [agent] section:

[agent]
max_retry_attempts = 3              # Default mode retries
autonomous_max_retry_attempts = 6   # Used by planning/autonomous mode

The retry infrastructure is implemented in crates/g3-core/src/retry.rs:
- RetryConfig struct defines retry behavior per role
- RetryConfig::planning("player") and RetryConfig::planning("coach") create presets
- Default max retries is 3 (hardcoded in RetryConfig::planning())
Note: Currently RetryConfig::planning() uses a hardcoded max_retries: 3 rather than reading from the config file's autonomous_max_retry_attempts. This may be intentional or a gap to address.

Required action:

add examples to config.example.toml for the coach and player retry configs.

2. Verify Retry Loop Functionality

Goal: Confirm that connection retry loops in planning mode work correctly for recoverable errors.

Verification approach:

The retry logic is implemented in g3_core::retry::execute_with_retry() and is already used by both player and coach phases in run_coach_player_loop() (planner.rs lines 633-640 and 682-689).
Error classification happens in g3_core::error_handling::classify_error() which identifies:
- RecoverableError::RateLimit (429 errors)
- RecoverableError::NetworkError (connection failures)
- RecoverableError::ServerError (5xx errors)
- RecoverableError::Timeout (request timeouts)
- RecoverableError::ModelBusy (capacity issues)
Manual verification steps (for a human tester):
- Run planning mode with a temporarily invalid API endpoint to trigger network errors
- Observe retry messages: "⚠️ player error (attempt X/3): NetworkError - ..."
- Observe backoff: "🔄 Retrying player in Xs..."
- After max retries, observe: "🔄 Max retries (3) reached for player"
Existing test coverage:
- g3-core/src/retry.rs has unit tests for RetryConfig construction
- g3-core/src/error_handling.rs has tests for classify_error() and delay calculations

Required action:

No code changes needed if retry loops are already functioning.
If issues are found during manual verification, document specific failure scenarios.

3. Verify Coach Response Parsing

Goal: Confirm that coach feedback extraction works correctly in planning mode.

Current implementation:

Coach feedback extraction uses g3_core::feedback_extraction::extract_coach_feedback() (called at planner.rs ~line 695).
The extraction tries multiple sources in order:
- FeedbackSource::SessionLog - from session log JSON file
- FeedbackSource::NativeToolCall - from native tool call JSON in response
- FeedbackSource::ConversationHistory - from conversation history
- FeedbackSource::TaskResultResponse - from TaskResult parsing
- FeedbackSource::DefaultFallback - default message

Planning mode displays the extraction source:

📝 Coach feedback extracted from SessionLog: 1234 chars

Verification approach:

Manual verification steps:
- Run a planning mode session through at least one coach/player cycle
- Observe the feedback extraction message and confirm it shows a valid source (preferably SessionLog or NativeToolCall, not DefaultFallback)
- Verify the first 25 lines of feedback are displayed correctly
- Confirm IMPLEMENTATION_APPROVED detection works when coach approves
Existing test coverage:
- g3-core/src/feedback_extraction.rs has comprehensive unit tests:
  - test_extract_balanced_json_* - JSON parsing
  - test_try_extract_json_tool_call - tool call extraction
  - test_is_final_output_tool_call_* - detecting final_output calls
  - test_extracted_feedback_is_approved - approval detection

Required action:

No code changes needed if parsing is working correctly.
If DefaultFallback is observed frequently during manual testing, investigate why earlier extraction methods are failing and document findings.

4. Optional: Add Integration Test for Retry + Feedback Flow

Goal: Create a lightweight integration test that verifies the retry and feedback extraction machinery works together.

Scope: Only implement if time permits and manual verification reveals issues.

Approach:

Create a test in crates/g3-planner/tests/ that:
- Mocks an LLM provider that returns a final_output tool call
- Verifies extract_coach_feedback() successfully extracts the feedback
- Optionally simulates a recoverable error to test retry logic
This test should NOT require actual API calls or network access.

5.0 KiB Raw Blame History

1. Document Retry Configuration Location

2. Verify Retry Loop Functionality

3. Verify Coach Response Parsing

4. Optional: Add Integration Test for Retry + Feedback Flow

5.0 KiB

Raw Blame History