28 Commits

Author SHA1 Message Date
Dhanji R. Prasanna
d3f0112f46 fix: store tool calls structurally for proper API roundtripping
The agent would stop mid-task because native tool calls were stored as
inline JSON text in Message.content. When sent back to the Anthropic API
via convert_messages(), they went as plain text instead of structured
tool_use/tool_result blocks. The model would occasionally get confused
and emit text describing what it wanted to do instead of invoking the
tool mechanism.

Changes:
- Add MessageToolCall struct and tool_calls/tool_result_id fields to Message
- Add id field to core ToolCall struct to preserve provider tool call IDs
- Update Anthropic convert_messages() to emit tool_use and tool_result blocks
- Add ToolResult variant to AnthropicContent enum
- Store tool calls structurally in tool message construction (not inline JSON)
- Fix add_message() to preserve empty-content messages with tool_calls
- Fix check_duplicate_in_previous_message() to check structured tool_calls
- Generate valid IDs for JSON fallback tool calls (Anthropic pattern requirement)
- Update planner create_tool_message() to use structured tool calls
2026-02-11 08:48:07 +11:00
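As a rough illustration of the structural storage described in d3f0112f46, the sketch below keeps tool calls in their own field and converts them into typed content blocks rather than inline JSON text. The names `MessageToolCall`, `tool_calls`, `tool_result_id`, and the `ToolResult` variant come from the commit message; every other field, the `convert_message` helper, and the use of a raw string for the JSON arguments are assumptions made for this example, not the actual g3 code.

```rust
// Hypothetical, simplified shapes. Only the type and field names mentioned in
// the commit message are taken from the source; the rest is illustrative.
#[derive(Debug, Clone, PartialEq)]
pub struct MessageToolCall {
    pub id: String,    // provider tool call ID, preserved for the round trip
    pub name: String,  // tool name
    pub input: String, // JSON arguments, kept as a raw string in this sketch
}

#[derive(Debug, Clone, Default)]
pub struct Message {
    pub role: String,
    pub content: String,
    pub tool_calls: Vec<MessageToolCall>,
    pub tool_result_id: Option<String>,
}

#[derive(Debug, PartialEq)]
pub enum AnthropicContent {
    Text { text: String },
    ToolUse { id: String, name: String, input: String },
    ToolResult { tool_use_id: String, content: String },
}

/// Convert one stored message into structured content blocks
/// (assumed helper; the commit names `convert_messages()`).
pub fn convert_message(msg: &Message) -> Vec<AnthropicContent> {
    // A message that answers a tool call becomes a single tool_result block.
    if let Some(id) = &msg.tool_result_id {
        return vec![AnthropicContent::ToolResult {
            tool_use_id: id.clone(),
            content: msg.content.clone(),
        }];
    }
    let mut blocks = Vec::new();
    // Empty text is skipped, but the tool calls below are still emitted.
    if !msg.content.is_empty() {
        blocks.push(AnthropicContent::Text { text: msg.content.clone() });
    }
    for call in &msg.tool_calls {
        blocks.push(AnthropicContent::ToolUse {
            id: call.id.clone(),
            name: call.name.clone(),
            input: call.input.clone(),
        });
    }
    blocks
}
```

With this shape, an assistant message that has empty text but one tool call still yields a `ToolUse` block, which mirrors the `add_message()` fix listed above.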
Dhanji R. Prasanna
a93ce932a3 refactor: Clean up Cargo dependencies - remove unused, update outdated
- Remove unused const_format from g3-planner (never imported)
- Remove unused thiserror from workspace and 5 crates (declared but never used)
- Update termimad 0.31 -> 0.34 in studio (consistency with g3-cli)
- Update indicatif 0.17 -> 0.18 in g3-cli
- Update ratatui 0.29 -> 0.30 in g3-cli
- Update walkdir 2.4 -> 2.5 in g3-core
- Update image 0.24 -> 0.25 in g3-computer-control (macOS + Linux)
- Update config 0.14 -> 0.15 in workspace

Blocked: reqwest 0.11 -> 0.12/0.13 requires breaking API changes to
bytes_stream() used in 4 providers - needs separate migration effort.

All tests pass. No behavior changes.

Agent: fowler
2026-02-06 14:22:59 +11:00
Dhanji R. Prasanna
7bfb9efa19 Remove automatic README loading from context window
README.md is no longer auto-loaded into the LLM context at startup.
This saves ~4,600 tokens per session while AGENTS.md and memory.md
still provide all critical information for code tasks.

Changes:
- Delete read_project_readme() function
- Remove readme_content parameter from combine_project_content()
- Rename extract_readme_heading() -> extract_project_heading()
- Rename Agent constructors: *_with_readme_* -> *_with_project_context_*
- Update context preservation to only check for Agent Configuration
- Remove has_readme field from LoadedContent
- Update all tests to use new markers and function names

The LLM can still read README.md on-demand via read_file when needed.
2026-01-29 11:07:41 +11:00
Dhanji R. Prasanna
182f5f98fe Centralize g3 status message formatting
Extract a new g3_status module in g3-cli that provides consistent formatting
for all 'g3:' prefixed system status messages.

Key changes:
- Add G3Status struct with methods for progress, done, failed, error, etc.
- Add Status enum with Done, Failed, Error, Resolved, Insufficient, NoChanges
- Add ThinResult struct in g3-core for semantic thinning data
- Update UiWriter trait with print_thin_result() method
- Refactor context thinning to return ThinResult instead of formatted strings
- Update all callers to use the new centralized formatting
- Session resume/decline messages now use G3Status
- Compaction status messages now use G3Status

This maintains clean separation of concerns: g3-core emits semantic data,
g3-cli handles all terminal formatting and colors.
2026-01-20 09:50:55 +05:30
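The separation described in 182f5f98fe (core emits semantic data, the CLI formats it) might look roughly like the sketch below. The `Status` variants, `ThinResult`, and `G3Status` are named in the commit message; the fields of `ThinResult`, the label strings, the `format` method, and the exact output layout are all assumptions for illustration.

```rust
// Status variants are taken from the commit message; labels are assumed.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Status {
    Done,
    Failed,
    Error,
    Resolved,
    Insufficient,
    NoChanges,
}

impl Status {
    pub fn label(self) -> &'static str {
        match self {
            Status::Done => "done",
            Status::Failed => "failed",
            Status::Error => "error",
            Status::Resolved => "resolved",
            Status::Insufficient => "insufficient",
            Status::NoChanges => "no changes",
        }
    }
}

/// Semantic data only, as g3-core would emit it (fields are hypothetical).
pub struct ThinResult {
    pub items_thinned: usize,
    pub chars_saved: usize,
}

impl ThinResult {
    pub fn status(&self) -> Status {
        if self.items_thinned == 0 {
            Status::NoChanges
        } else {
            Status::Done
        }
    }
}

/// All presentation lives here, as it would in g3-cli.
pub struct G3Status;

impl G3Status {
    /// Hypothetical formatter for a "g3:" prefixed status line.
    pub fn format(action: &str, status: Status) -> String {
        format!("g3: {} ... [{}]", action, status.label())
    }

    pub fn format_thin_result(result: &ThinResult) -> String {
        let action = format!(
            "thinning context ({} items, {} chars saved)",
            result.items_thinned, result.chars_saved
        );
        Self::format(&action, result.status())
    }
}
```

Because `ThinResult` carries no formatted text, a different front end could render the same data without touching the core crate.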
Dhanji R. Prasanna
0e33465342 Add print_g3_progress/print_g3_status methods for consistent status messages
2026-01-16 20:28:24 +05:30
Dhanji R. Prasanna
0ae1a13cdb feat: real-time tool call streaming indicator with blinking UI
- Add ToolParsingHint enum (Detected/Active/Complete) for UI feedback
- New UiWriter methods: print_tool_streaming_hint(), print_tool_streaming_active()
- Refactor ConsoleUiWriter state to use atomics in ParsingHintState
- Add tool_call_streaming field to CompletionChunk for provider hints
- Anthropic provider sends streaming hints when tool name detected
- New streaming helpers: make_tool_streaming_hint(), make_tool_streaming_active()

Parser improvements:
- Add is_json_invalidated() to detect false positive tool patterns
- Fix tool result poisoning when file contents contain partial JSON
- Unescaped newlines in strings or prose after JSON invalidate detection

User sees ' ● tool_name |' immediately when tool call starts streaming,
with blinking indicator while args are received.
2026-01-15 13:49:29 +05:30
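The two invalidation rules from 0ae1a13cdb (an unescaped newline inside a string, or prose after the JSON closes) can be sketched as a small scanner. Only the function name `is_json_invalidated` comes from the commit message; the signature and the scanning strategy are assumptions, and the real parser may apply further checks.

```rust
/// Returns true when a candidate tool-call JSON fragment can no longer be a
/// real tool call. Sketch only: the signature and logic are assumed.
pub fn is_json_invalidated(candidate: &str) -> bool {
    let mut depth = 0usize;
    let mut in_string = false;
    let mut escaped = false;
    let mut closed = false;

    for c in candidate.chars() {
        if closed {
            // Anything but whitespace after the closing brace is prose.
            if !c.is_whitespace() {
                return true;
            }
            continue;
        }
        if in_string {
            if escaped {
                escaped = false;
            } else if c == '\\' {
                escaped = true;
            } else if c == '"' {
                in_string = false;
            } else if c == '\n' {
                // A raw newline is not allowed inside a JSON string.
                return true;
            }
            continue;
        }
        match c {
            '"' => in_string = true,
            '{' => depth += 1,
            '}' => {
                depth = depth.saturating_sub(1);
                if depth == 0 {
                    closed = true;
                }
            }
            _ => {}
        }
    }
    // Incomplete but still plausible JSON (mid-stream) is not invalidated.
    false
}
```

A fragment that is merely incomplete returns `false`, so a tool call that is still streaming keeps its indicator until one of the two rules fires.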
Dhanji R. Prasanna
f30f145c85 Fix UTF-8 panics and inconsistent retry logic
- Fix 7 UTF-8 byte slicing panics that crash on multi-byte characters:
  - acd.rs: extract_topic_from_text() [..50] slice
  - streaming.rs: log_stream_error() [..500] slice
  - tools/acd.rs: rehydrate message truncation [..2000] slice
  - history.rs: git commit message truncation [..69] slice
  - planner.rs: commit summary/description truncation [..69] slices
  - llm.rs: requirements summary line truncation [..117] slice

- All now use chars().count() and chars().take(N).collect() for
  UTF-8 safe truncation

- Fix inconsistent retry logic in task_execution.rs:
  - Previously only retried on Timeout errors
  - Now retries on ALL recoverable errors (rate limits, network,
    server errors, model busy, token limits, context length)
  - Added error-specific base delays (rate limit: 5s, server: 2s, etc.)
  - Added exponential backoff with ±20% jitter
  - Consistent with autonomous mode retry behavior
2026-01-13 05:49:45 +05:30
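Both fixes in f30f145c85 lend themselves to a short sketch. Slicing with `&s[..50]` indexes bytes and panics when byte 50 falls inside a multi-byte character, whereas counting and taking `chars()` never splits a code point. The helper names `truncate_chars` and `backoff_delay`, and the choice to pass the jitter sample in as a parameter (to avoid depending on a random-number crate here), are assumptions for illustration.

```rust
use std::time::Duration;

/// Truncate to at most `max_chars` characters without splitting a code point.
/// (Hypothetical helper; the commit only names the chars()-based technique.)
pub fn truncate_chars(s: &str, max_chars: usize) -> String {
    if s.chars().count() <= max_chars {
        s.to_string()
    } else {
        s.chars().take(max_chars).collect()
    }
}

/// Exponential backoff with up to ±20% jitter.
/// `jitter` is a caller-supplied sample in [-1.0, 1.0]; in real code it would
/// come from a random number generator.
pub fn backoff_delay(base: Duration, attempt: u32, jitter: f64) -> Duration {
    // Double the delay on each attempt, capping the exponent to avoid overflow.
    let factor = 2f64.powi(attempt.min(16) as i32);
    let jitter = jitter.clamp(-1.0, 1.0);
    base.mul_f64(factor * (1.0 + 0.2 * jitter))
}
```

An error-specific base delay, such as the 5s for rate limits and 2s for server errors mentioned above, would be passed as `base`.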
Dhanji R. Prasanna
c2aa80647a Remove legacy logs/ directory, consolidate all data under .g3/
This change removes the legacy logs/ directory and consolidates all
session data, error logs, and discovery files under the .g3/ directory.

New directory structure:
- .g3/sessions/<session_id>/session.json - session logs
- .g3/errors/ - error logs (was logs/errors/)
- .g3/background_processes/ - background process logs
- .g3/discovery/ - planner discovery files (was workspace/logs/)

Changes:
- paths.rs: Remove get_logs_dir()/logs_dir(), add get_errors_dir(),
  get_background_processes_dir(), get_discovery_dir()
- session.rs: Anonymous sessions now use .g3/sessions/anonymous_<ts>/
- error_handling.rs: Errors now saved to .g3/errors/
- project.rs: Remove logs_dir() and ensure_logs_dir() methods
- feedback_extraction.rs: Remove logs_dir field and fallback logic
- planner: Use .g3/ for workspace data and .g3/discovery/ for reports
- flock.rs: Look for session metrics in .g3/sessions/
- coach_feedback.rs: Remove fallback to logs/ path
- Update all tests to use new paths
- Update README.md and .gitignore
2026-01-12 18:20:08 +05:30
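The directory layout introduced in c2aa80647a can be expressed as a handful of path helpers. The function names `get_errors_dir`, `get_background_processes_dir`, and `get_discovery_dir` appear in the commit message; their signatures (taking the workspace root as an argument) and the `g3_dir` and `session_log_path` helpers are assumptions for this sketch.

```rust
use std::path::{Path, PathBuf};

/// Root of all g3 data inside a workspace (hypothetical helper).
pub fn g3_dir(workspace: &Path) -> PathBuf {
    workspace.join(".g3")
}

/// Error logs, formerly under logs/errors/.
pub fn get_errors_dir(workspace: &Path) -> PathBuf {
    g3_dir(workspace).join("errors")
}

/// Background process logs.
pub fn get_background_processes_dir(workspace: &Path) -> PathBuf {
    g3_dir(workspace).join("background_processes")
}

/// Planner discovery files, formerly under workspace/logs/.
pub fn get_discovery_dir(workspace: &Path) -> PathBuf {
    g3_dir(workspace).join("discovery")
}

/// Session log: .g3/sessions/<session_id>/session.json (hypothetical helper).
pub fn session_log_path(workspace: &Path, session_id: &str) -> PathBuf {
    g3_dir(workspace)
        .join("sessions")
        .join(session_id)
        .join("session.json")
}
```

Routing every path through a single `.g3` root is what allows the `logs/` fallbacks listed above to be deleted.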
Dhanji R. Prasanna
0aa1287ca6 Remove final_output tool and improve scout report handback
final_output removal:
- Remove final_output from tool definitions and dispatch
- Update system prompts to request summaries as regular text
- Remove final_output_called field from StreamingState
- Update auto_continue tests to remove final_output_called parameter
- Remove final_output test from tool_execution_test.rs
- Update planner and flock prompts to not reference final_output
- Keep backwards-compat code in feedback_extraction.rs and task_result.rs

Scout report handback:
- Change from file-based to delimiter-based report extraction
- Scout outputs report between ---SCOUT_REPORT_START/END--- markers
- Research tool extracts content between markers, strips ANSI codes
- Add comprehensive tests for extraction and ANSI stripping

657 tests pass.
2026-01-10 13:43:04 +11:00
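The delimiter-based handback from 0aa1287ca6 can be sketched with the standard library alone. The exact marker strings below are an assumed expansion of the `---SCOUT_REPORT_START/END---` shorthand in the commit message, and both function names are hypothetical. The ANSI stripper only handles CSI sequences (ESC `[` ... final byte), which covers typical color codes but not every escape form.

```rust
// Assumed expansion of the marker shorthand used in the commit message.
const REPORT_START: &str = "---SCOUT_REPORT_START---";
const REPORT_END: &str = "---SCOUT_REPORT_END---";

/// Remove ANSI CSI escape sequences such as "\x1b[1m" or "\x1b[0m".
pub fn strip_ansi(input: &str) -> String {
    let mut out = String::with_capacity(input.len());
    let mut chars = input.chars().peekable();
    while let Some(c) = chars.next() {
        if c == '\u{1b}' && chars.peek() == Some(&'[') {
            chars.next(); // consume '['
            // Skip parameter bytes up to and including the final byte.
            for d in chars.by_ref() {
                if ('@'..='~').contains(&d) {
                    break;
                }
            }
        } else {
            out.push(c);
        }
    }
    out
}

/// Extract the report between the markers, or None if either is missing.
pub fn extract_report(output: &str) -> Option<String> {
    // Strip first so escape codes around the markers cannot hide them.
    let clean = strip_ansi(output);
    let start = clean.find(REPORT_START)? + REPORT_START.len();
    let end = start + clean[start..].find(REPORT_END)?;
    Some(clean[start..end].trim().to_string())
}
```

Returning `None` when a marker is missing lets the caller distinguish "no report produced" from an empty report.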
Dhanji R. Prasanna
777191b3cb Remove final_output tool - let summaries stream naturally
- Remove final_output from tool definitions, dispatch, and misc tools
- Update system prompts to request summaries as regular markdown text
- Remove print_final_output from UiWriter trait and all implementations
- Remove final_output handling from agent core logic
- Rename final_output_summary → summary in session continuation
- Delete final_output test files
- Update tool count tests (12→11, 27→26)

This allows LLM summaries to stream through the markdown formatter
for a more natural, responsive user experience instead of buffering
everything into a tool call.
2026-01-09 14:57:24 +11:00
Dhanji R. Prasanna
3ece02ff31 fix: resolve compiler warnings across crates
- Remove unused assignment to final_output_called (returns immediately after)
- Mark cache_config field as #[allow(dead_code)] (reserved for future use)
- Mark print_status_line method as #[allow(dead_code)] (reserved for future use)
2025-12-25 18:47:22 +11:00
Dhanji R. Prasanna
cd64ebbf87 Add tokens consumed and context percentage to per-tool timing footer
The per-tool timing line now shows:
- Tokens delta (tokens added to context by this tool call)
- Context window usage percentage

Example: └─ 1ms  523tk | 49% ctx

Changes:
- Updated UiWriter trait print_tool_timing signature
- Track tokens before/after adding tool messages to calculate delta
- Updated ConsoleUiWriter, MachineUiWriter, PlannerUiWriter, and test mocks
2025-12-24 15:44:19 +11:00
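A minimal sketch of the footer calculation from cd64ebbf87, assuming the token counts before and after the tool messages and the context window size are available as plain integers. The function name and parameters are hypothetical (the commit changes the `print_tool_timing` signature but does not show it), and the icon from the original footer is omitted.

```rust
/// Build the per-tool timing footer (hypothetical helper).
pub fn format_tool_timing(
    elapsed_ms: u128,
    tokens_before: usize,
    tokens_after: usize,
    context_limit: usize,
) -> String {
    // Tokens added to the context by this tool call.
    let delta = tokens_after.saturating_sub(tokens_before);
    // Whole-number percentage of the context window in use afterwards.
    let pct = if context_limit == 0 {
        0
    } else {
        tokens_after * 100 / context_limit
    };
    format!("└─ {}ms  {}tk | {}% ctx", elapsed_ms, delta, pct)
}
```

Measuring the delta around the point where tool messages are added means the figure reflects what the call cost in context, not the size of the raw tool output.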
Jochen
7b47495881 Document retry config location and verify planning mode logic
Add documentation for retry configuration in planning mode:
- Document retry settings in .g3.toml under [agent] section
- Note RetryConfig implementation in g3-core/src/retry.rs
- Clarify hardcoded vs config-based retry values

Verify existing retry loop and coach feedback parsing:
- Confirm execute_with_retry() handles recoverable errors
- Document feedback extraction source priority order
- Provide manual verification steps for testing
2025-12-11 14:56:27 +11:00
Jochen
1a13fc5345 Add explicit flush to append_entry and strengthen commit ordering docs
Add file.flush() call in append_entry() to ensure planner history
entries are written to disk before git commits execute. While the
file handle drop should flush, explicit flush simplifies reasoning
about the ordering invariant.

Extend code comments in stage_and_commit() to document that the
write_git_commit-before-git::commit ordering has regressed multiple
times and must be preserved in any refactoring.

Requirements: completed_requirements_2025-12-11_10-05-08.md
2025-12-11 10:05:39 +11:00
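The explicit flush described in 1a13fc5345 might look like the sketch below. Only the name `append_entry` and the `file.flush()` call come from the commit message; the signature and the one-entry-per-line format are assumptions. Note that a raw `std::fs::File` is unbuffered, so `flush()` matters mostly if a buffered writer is involved, and `sync_all()` would be the call to reach for if on-disk durability were the goal.

```rust
use std::fs::OpenOptions;
use std::io::{self, Write};
use std::path::Path;

/// Append one history entry and flush before returning, so that a later
/// git commit sees the entry. Signature is assumed for this sketch.
pub fn append_entry(history: &Path, entry: &str) -> io::Result<()> {
    let mut file = OpenOptions::new()
        .create(true)
        .append(true)
        .open(history)?;
    writeln!(file, "{}", entry)?;
    // Explicit flush: do not rely on the handle being dropped first.
    file.flush()?;
    Ok(())
}
```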
Jochen
b3ac7746b9 Preserve planner history ordering and add regression guardrails
Ensure planner writes GIT COMMIT entry before invoking git commit.
Keep history entry even when git commit fails, matching summary text.
Document invariant in code comment above write_git_commit call.
Add lightweight test to assert history write precedes git::commit using
test doubles instead of a real git repository.
Investigate git history to find regression and its prior fix, and
record a short root-cause summary outside the codebase.
Reference completed_requirements_2025-12-10_16-55-05.md for details.
Reference completed_todo_2025-12-10_16-55-05.md for task tracking.
2025-12-10 16:55:24 +11:00
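The ordering invariant from b3ac7746b9 can be checked with test doubles along the following lines. The names `stage_and_commit` and `write_git_commit` appear in this commit and the one after it; the two traits, the recording doubles, and all signatures are assumptions made for the sketch, not the planner's actual interfaces.

```rust
use std::cell::RefCell;
use std::rc::Rc;

// Hypothetical seams so the ordering can be observed without a real repo.
pub trait HistoryLog {
    fn write_git_commit(&mut self, summary: &str);
}

pub trait Git {
    fn commit(&mut self, summary: &str) -> Result<(), String>;
}

/// Invariant: the history entry is written BEFORE git commit runs, and it is
/// kept even when the commit fails.
pub fn stage_and_commit<H: HistoryLog, G: Git>(
    history: &mut H,
    git: &mut G,
    summary: &str,
) -> Result<(), String> {
    history.write_git_commit(summary);
    git.commit(summary)
}

/// Shared event log that both doubles record into.
pub type Events = Rc<RefCell<Vec<String>>>;

pub struct RecordingHistory(pub Events);

impl HistoryLog for RecordingHistory {
    fn write_git_commit(&mut self, summary: &str) {
        self.0.borrow_mut().push(format!("history:{}", summary));
    }
}

/// A git double whose commit always fails, to exercise the failure path.
pub struct FailingGit(pub Events);

impl Git for FailingGit {
    fn commit(&mut self, summary: &str) -> Result<(), String> {
        self.0.borrow_mut().push(format!("git:{}", summary));
        Err("commit failed".to_string())
    }
}
```

A test then asserts that the recorded events read history first, git second, and that the history entry is present even though the commit returned an error.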
Jochen
5f3a2a4203 remove debug statements
2025-12-10 16:26:59 +11:00
Jochen
87bceba54f Fix planner UI whitespace and workspace logs directory
Resolve two critical issues in planner mode that persisted through
multiple fix attempts:

1. Remove excessive whitespace between tool call displays by replacing
   direct println!() calls with ui_writer methods and eliminating
   redundant newlines in agent response streaming.

2. Ensure all log files (errors, sessions, tool calls, context dumps)
   are written to <workspace>/logs instead of codepath by properly
   initializing G3_WORKSPACE_PATH from --workspace argument.
2025-12-10 16:18:49 +11:00
Jochen
a03a432963 another attempt :/
2025-12-10 11:29:10 +11:00
Jochen
75aa2d983e Refine planner mode UI and error handling
Improve planner mode user experience with better error reporting,
cleaner tool output, and consistent log file placement.

- Propagate and display classified LLM errors to users with
  appropriate icons and context
- Display tool calls on single lines with truncated arguments
- Show LLM text responses without overwriting via UiWriter
- Ensure all logs write to workspace/logs directory consistently
- Set G3_WORKSPACE_PATH early in planning mode initialization
2025-12-09 22:44:00 +11:00
Jochen
a9dbe5f7d3 some manual fixes after rebase
2025-12-09 17:11:19 +11:00
Jochen
633da0d8a6 Refine planner mode UI, logging, and history tracking
- Display coach feedback content (up to 25 lines) instead of just length
- Write GIT COMMIT entry to history before actual commit for better a...
- Implement single-line status updates during LLM processing with too...
- Display non-tool LLM text responses in planner UI
- Redirect all logs to <workspace>/logs directory instead of codepath
- Preserve TODO file in planner mode for history (prevent deletion)

Completed files:
- completed_requirements_2025-12-09_16-16-51.md
- completed_todo_2025-12-09_16-16-51.md
2025-12-09 17:03:53 +11:00
Jochen
ff8b3e7c7b Implement planning mode
2025-12-09 17:03:53 +11:00
Jochen
4aa84e2144 disable thinking if there is no token budget
2025-12-09 16:45:28 +11:00
Jochen
52f78653b4 add context window monitor
Writes the current context window to logs/current_context_window (uses a symlink to a session ID).

This PR was unfortunately generated by a different LLM and did a ton of superficial reformatting. It's actually a fairly small and benign change, but I don't want to roll back everything. Hope that's ok.
2025-11-27 21:00:02 +11:00
Jochen
1e1702001c Add logging for discovery
2025-11-26 10:41:35 +11:00
Jochen
c419833ddf updated the prompt
2025-11-26 10:26:52 +11:00
Jochen
c19127f809 make sure user requirements are included
2025-11-26 10:26:52 +11:00
Jochen
ad198a8501 add code exploration fast start
This tries to short-circuit multiple round trips to the LLM for reading code.
It's a precursor to context engineering tailored to specific tasks.
In initial experiments, it's only marginally faster than regular mode and burns more tokens.
2025-11-25 22:51:32 +11:00