- Add ToolParsingHint enum (Detected/Active/Complete) for UI feedback
- New UiWriter methods: print_tool_streaming_hint(), print_tool_streaming_active()
- Refactor ConsoleUiWriter state to use atomics in ParsingHintState
- Add tool_call_streaming field to CompletionChunk for provider hints
- Anthropic provider sends streaming hints when tool name detected
- New streaming helpers: make_tool_streaming_hint(), make_tool_streaming_active()
Parser improvements:
- Add is_json_invalidated() to detect false positive tool patterns
- Fix tool result poisoning when file contents contain partial JSON
- Unescaped newlines in strings or prose after JSON invalidates detection
User sees ' ● tool_name |' immediately when tool call starts streaming,
with blinking indicator while args are received.
This change removes the legacy logs/ directory and consolidates all
session data, error logs, and discovery files under the .g3/ directory.
New directory structure:
- .g3/sessions/<session_id>/session.json - session logs
- .g3/errors/ - error logs (was logs/errors/)
- .g3/background_processes/ - background process logs
- .g3/discovery/ - planner discovery files (was workspace/logs/)
Changes:
- paths.rs: Remove get_logs_dir()/logs_dir(), add get_errors_dir(),
get_background_processes_dir(), get_discovery_dir()
- session.rs: Anonymous sessions now use .g3/sessions/anonymous_<ts>/
- error_handling.rs: Errors now saved to .g3/errors/
- project.rs: Remove logs_dir() and ensure_logs_dir() methods
- feedback_extraction.rs: Remove logs_dir field and fallback logic
- planner: Use .g3/ for workspace data and .g3/discovery/ for reports
- flock.rs: Look for session metrics in .g3/sessions/
- coach_feedback.rs: Remove fallback to logs/ path
- Update all tests to use new paths
- Update README.md and .gitignore
final_output removal:
- Remove final_output from tool definitions and dispatch
- Update system prompts to request summaries as regular text
- Remove final_output_called field from StreamingState
- Update auto_continue tests to remove final_output_called parameter
- Remove final_output test from tool_execution_test.rs
- Update planner and flock prompts to not reference final_output
- Keep backwards-compat code in feedback_extraction.rs and task_result.rs
Scout report handback:
- Change from file-based to delimiter-based report extraction
- Scout outputs report between ---SCOUT_REPORT_START/END--- markers
- Research tool extracts content between markers, strips ANSI codes
- Add comprehensive tests for extraction and ANSI stripping
657 tests pass.
- Remove final_output from tool definitions, dispatch, and misc tools
- Update system prompts to request summaries as regular markdown text
- Remove print_final_output from UiWriter trait and all implementations
- Remove final_output handling from agent core logic
- Rename final_output_summary → summary in session continuation
- Delete final_output test files
- Update tool count tests (12→11, 27→26)
This allows LLM summaries to stream through the markdown formatter
for a more natural, responsive user experience instead of buffering
everything into a tool call.
- Remove unused assignment to final_output_called (returns immediately after)
- Mark cache_config field as #[allow(dead_code)] (reserved for future use)
- Mark print_status_line method as #[allow(dead_code)] (reserved for future use)
Add file.flush() call in append_entry() to ensure planner history
entries are written to disk before git commits execute. While the
file handle drop should flush, explicit flush simplifies reasoning
about the ordering invariant.
Extend code comments in stage_and_commit() to document that the
write_git_commit-before-git::commit ordering has regressed multiple
times and must be preserved in any refactoring.
Requirements: completed_requirements_2025-12-11_10-05-08.md
Ensure planner writes GIT COMMIT entry before invoking git commit.
Keep history entry even when git commit fails, matching summary text.
Document invariant in code comment above write_git_commit call.
Add lightweight test to assert history write precedes git::commit using
test doubles instead of a real git repository.
Investigate git history to find regression and its prior fix, and
record a short root-cause summary outside the codebase.
Reference completed_requirements_2025-12-10_16-55-05.md for details.
Reference completed_todo_2025-12-10_16-55-05.md for task tracking.
Resolve two critical issues in planner mode that persisted through
multiple fix attempts:
1. Remove excessive whitespace between tool call displays by replacing
direct println!() calls with ui_writer methods and eliminating
redundant newlines in agent response streaming.
2. Ensure all log files (errors, sessions, tool calls, context dumps)
are written to <workspace>/logs instead of codepath by properly
initializing G3_WORKSPACE_PATH from --workspace argument.
Improve planner mode user experience with better error reporting,
cleaner tool output, and consistent log file placement.
- Propagate and display classified LLM errors to users with
appropriate icons and context
- Display tool calls on single lines with truncated arguments
- Show LLM text responses without overwriting via UiWriter
- Ensure all logs write to workspace/logs directory consistently
- Set G3_WORKSPACE_PATH early in planning mode initialization
- Display coach feedback content (up to 25 lines) instead of just length
- Write GIT COMMIT entry to history before actual commit for better a...
- Implement single-line status updates during LLM processing with too...
- Display non-tool LLM text responses in planner UI
- Redirect all logs to <workspace>/logs directory instead of codepath
- Preserve TODO file in planner mode for history (prevent deletion)
Completed files:
- completed_requirements_2025-12-09_16-16-51.md
- completed_todo_2025-12-09_16-16-51.md
Writes the current context window to logs/current_context_window (uses a symlink to a session ID).
This PR was unfortunately generated by a different LLM and did a ton of superficial reformating, it's actually a fairly small and benign change, but I don't want to roll back everything. Hope that's ok.
This tries to short-circuit multiple round-trips to llm for reading code.
It's a precursor to trying to context engineer tailored to specific tasks.
In initial experiments, it's only marginally faster than regular mode, and burns more tokens.