Adds blank line separation between text and tool calls for better readability:
- Text → Tool: blank line before tool call
- Tool → Text: blank line before text
- Tool → Tool: no gap (stays tight)
Implemented via two state tracking flags in ConsoleUiWriter:
- last_output_was_text
- last_output_was_tool
Updated print_tool_output_header(), print_tool_compact(), and
print_agent_response() to check and set these flags appropriately.
- Move run_flock_mode() to autonomous.rs (parallel execution mode belongs with autonomous code)
- Move initialize_logging() to utils.rs (utility function with simple bool parameter)
- lib.rs reduced from 274 to 216 lines
No behavior changes. All 28 unit tests pass.
Agent: fowler
When compact tools (read_file, write_file, str_replace, etc.) failed,
they would fall through to the non-compact output path, causing:
- Missing or incorrect headers
- Stray footers with wrong formatting
- State leakage (is_shell_compact) between tool calls
Now failed compact tools display in the same single-line format as
successful ones, just with a truncated error message instead of the
success summary:
● read_file | path/to/file.txt | ❌ Failed to read file... | 123 ◉ 0ms
This keeps the UI consistent and avoids the "stray footer" bug.
The LLM often hallucinates incorrect paths like /Users/jnbrymn/GitHub/g3
when the actual working directory is different. This causes wasted tokens
and failed commands as the LLM tries cd commands that fail.
Fix: Add the current working directory to the combined project content
that gets injected into the context at startup. This appears as:
📂 Working Directory: /actual/path/to/workspace
This is prepended before AGENTS.md, README.md, and project memory,
so the LLM knows the correct path from the start.
This change removes the legacy logs/ directory and consolidates all
session data, error logs, and discovery files under the .g3/ directory.
New directory structure:
- .g3/sessions/<session_id>/session.json - session logs
- .g3/errors/ - error logs (was logs/errors/)
- .g3/background_processes/ - background process logs
- .g3/discovery/ - planner discovery files (was workspace/logs/)
Changes:
- paths.rs: Remove get_logs_dir()/logs_dir(), add get_errors_dir(),
get_background_processes_dir(), get_discovery_dir()
- session.rs: Anonymous sessions now use .g3/sessions/anonymous_<ts>/
- error_handling.rs: Errors now saved to .g3/errors/
- project.rs: Remove logs_dir() and ensure_logs_dir() methods
- feedback_extraction.rs: Remove logs_dir field and fallback logic
- planner: Use .g3/ for workspace data and .g3/discovery/ for reports
- flock.rs: Look for session metrics in .g3/sessions/
- coach_feedback.rs: Remove fallback to logs/ path
- Update all tests to use new paths
- Update README.md and .gitignore
Implement compact display format for read_file, write_file, str_replace, and shell:
- read_file/write_file/str_replace: Single line with dimmed summary and timing
Format: ● tool_name | path [range] | summary | tokens ◉ time
- shell: Two-line format with command header and dimmed output
Format: ● shell | command
└─ output (N lines) | tokens ◉ time
Changes:
- Add print_tool_compact() method to UiWriter trait
- Add is_shell_compact state tracking in ConsoleUiWriter
- Add format_write_file_summary() and format_str_replace_summary() helpers
- Fix duplicate response output by checking if response is empty before printing
- Add finish_streaming_markdown() call before return to flush markdown buffer
Agent: hopper
Added 4 new test files with blackbox/characterization-style integration tests:
- compaction_behavior_test.rs (14 tests): Token cap calculation, thinking mode
disable logic, summary message building, CompactionResult behavior
- retry_behavior_test.rs (17 tests): RetryConfig presets and customization,
RetryResult state handling, retry_operation behavior with simulated errors
- tool_execution_roundtrip_test.rs (16 tests): End-to-end tool execution through
Agent interface for read_file, write_file, shell, str_replace, and TODO tools
- error_classification_test.rs (25 tests): Recoverable vs non-recoverable error
classification, retry delay calculation, edge cases and priority handling
All tests follow integration-first philosophy:
- Test through stable public interfaces
- Assert observable behavior, not implementation details
- Use characterization style to document current behavior
- Enable refactoring by not encoding internal structure
Add ANSI clear-to-end-of-line escape sequence (\x1b[K]) after the
reset code in the context thinning animation. This prevents leftover
background color artifacts when the carriage return overwrites the
line during the flash animation.
Project memory is now stored at analysis/memory.md instead of .g3/memory.md.
This change enables:
- Shared memory across git worktrees (studio agent sessions)
- Version-controlled memory that persists across clones
- Memory changes tracked in git history and reviewable in PRs
Changes:
- crates/g3-core/src/tools/memory.rs: Update get_memory_path() to use analysis/
- crates/g3-cli/src/project_files.rs: Update read_project_memory() path
- crates/g3-core/src/prompts.rs: Update documentation references (2 occurrences)
- analysis/memory.md: Add memory file (copied from .g3/memory.md)
Changes:
- Fix JSON path for session logs: now reads from context_window.conversation_history
(with fallback to messages for backwards compatibility)
- Remove 500-character truncation to show full summary
- Add termimad dependency for terminal markdown rendering
- Display summary with proper markdown formatting (headers, bold, code, lists)
The extract_session_summary() function was looking for messages at the wrong
JSON path. Session logs store conversation history at context_window.conversation_history,
not at the top-level messages key.
Consolidate duplicated logic into canonical shared functions:
- Extract load_config_with_cli_overrides() to utils.rs
- Was duplicated in lib.rs and accumulative.rs with subtle differences
- lib.rs version had Chrome diagnostics + provider validation
- accumulative.rs version was missing both
- Now all callers use the complete canonical implementation
- Extract combine_project_content() to project_files.rs
- Was duplicated inline in lib.rs and agent_mode.rs
- Simplified implementation using iterator flatten
- Added unit tests for all cases
This eliminates drift risk where the duplicated implementations
could diverge over time (accumulative.rs was already missing
Chrome diagnostics and provider validation).
Agent: fowler
Studio enables running multiple g3 agents concurrently without conflicts
by using git worktrees for isolation.
Features:
- studio run --agent <name> [args...]: Create worktree, spawn g3, tail output
- studio list: Show all active sessions
- studio status <id>: Show session details and summary
- studio accept <id>: Merge session branch to main and cleanup
- studio discard <id>: Delete session without merging
Each session gets:
- Isolated worktree at .worktrees/sessions/<agent>/<session-id>
- Dedicated branch: sessions/<agent>/<session-id>
- Short UUID (8 chars) for easy reference
- Automatic --workspace and --agent flags passed to g3
Added first_user_message field to Fragment struct that captures the
full first user message (task) from the dehydrated conversation.
This is now displayed at the top of the stub with a 📋 Task: prefix.
Removed the Topics section from the stub since the full task provides
better context for forensics and debugging.
Agent: g3
The UTF-8 string slicing pattern is better suited as a remembered
pattern in project memory rather than a static AGENTS.md section.
This keeps AGENTS.md focused on codebase-specific invariants while
the pattern remains accessible for reference.
Agent: g3
ACD (Aggressive Context Dehydration) fixes:
- Fixed dehydrate_context() to extract turn summary from context window
instead of using the passed-in final_response (which contained only
the timing footer, not the actual LLM response)
- Removed final_response parameter from dehydrate_context() since it
now self-extracts the last assistant message as the summary
- This ensures the actual turn summary is preserved after dehydration,
not just the timing footer
New /dump command:
- Added /dump command to dump entire context window to tmp/ for debugging
- Shows message index, role, kind, content length, and full content
- Available in both console and machine modes
UTF-8 safety:
- Fixed truncate_to_word_boundary() to use character indices instead of
byte indices, preventing panics on multi-byte UTF-8 characters
- Added UTF-8 string slicing guidance to AGENTS.md
Agent: g3
When read_file is called with an end position beyond the file length,
instead of returning an error that forces a retry, now clamps to the
actual file length and returns the content with an informative message.
This eliminates wasteful retry cycles where the LLM had to make a
second request with the corrected end position.
Change read_file output format so the "🔍 N lines read" appears as
the last line after the file content, not before it. This keeps the
output cleaner with just one metadata line at the end.
1. str_replace: Show insertion/deletion counts with colors
"✅ +N insertions | -M deletions" (green/red)
2. write_file: Compact format with human-readable sizes
"✅ wrote N lines | Xk chars"
3. read_file: Cleaner format
"🔍 N lines read" instead of "📄 File content (N lines)"
4. webdriver_quit: Show correct driver name (safaridriver vs chromedriver)
5. read_file: When start position exceeds file length, read last 100 chars
with explanation instead of failing
6. shell: Remove redundant "Command failed:" prefix from error messages
When a code block ended without a trailing newline after the closing
\`\`\`, two bugs occurred in flush_incomplete():
1. The closing \`\`\` was included as part of the code block content
(displayed with syntax highlighting)
2. The same \`\`\` was then emitted again as literal text because
current_line was not cleared after being pushed to block_buffer
The fix:
- Check if current_line is the closing fence before adding to block_buffer
- Always clear current_line after processing in the CodeBlock case
Added two tests:
- test_code_fence_after_blank_line: code fence with trailing newline
- test_code_fence_no_trailing_newline: code fence without trailing newline
The agent mode header now shows:
- Agent name in uppercase with box art
- Working directory (truncated if too long)
- Status indicators for README, AGENTS.md, and Memory loading
- Task preview if provided
Also exports truncate_for_display and adds truncate_path_for_display
helper functions in project_files module.
The JSON tool call filter was outputting newlines immediately as they
were encountered. When the LLM output contained multiple newlines before
a tool call, each newline was output before the tool call JSON was
detected and suppressed, leaving orphaned blank lines in the output.
Changes:
- Add pending_newlines field to FilterState to buffer newlines at line start
- First newline after content is output immediately, subsequent ones buffered
- When tool call confirmed, pending_newlines cleared (suppressing extra blanks)
- When not a tool call, pending_newlines output with the buffer
- Add flush_json_tool_filter() to flush pending content at end of streaming
- Update tests to reflect new behavior
- Add tests for newline suppression behavior
Extract three cohesive modules from the monolithic lib.rs (3188 -> 2785 lines):
- metrics.rs (147 lines): Turn metrics tracking and histogram generation
- TurnMetrics struct
- format_elapsed_time() for human-readable durations
- generate_turn_histogram() for performance visualization
- Added unit tests for core functions
- project_files.rs (181 lines): Project file reading utilities
- read_agents_config() for AGENTS.md loading
- read_project_readme() for README detection
- read_project_memory() for .g3/memory.md
- extract_readme_heading() for display
- Added unit tests
- coach_feedback.rs (129 lines): Coach feedback extraction from session logs
- extract_from_logs() main entry point
- Helper functions for log parsing and text extraction
All modules have clear single responsibilities, improved documentation,
and maintain identical behavior to the original inline functions.
Agent: carmack
Collapsed nested if statements that check related conditions into
single conditions using &&. This improves readability by making
the logical relationship between conditions explicit.
Files changed:
- feedback_extraction.rs: 3 instances of tool_use/final_output checks
- tools/todo.rs: 1 instance of todo completion check
Agent: fowler
The function had two branches that both returned line.to_string():
- when !should_truncate
- when line.chars().count() <= max_width
Merged into a single condition. Also updated format! to use
inline variable syntax per clippy suggestion.
Agent: fowler
Agent mode was only loading README.md but not AGENTS.md or project
memory (.g3/memory.md). This meant agents were missing important
context that normal mode had access to.
Now agent mode uses the same read_agents_config(), read_project_readme(),
and read_project_memory() functions as normal mode, combining all three
into the agent context.
Simplify auto-memory by always enabling it in agent mode instead of
requiring the --auto-memory flag. This makes sense because:
- Agent mode is non-interactive, so blocking is acceptable
- Agents benefit from automatically saving discoveries to memory
- Reduces flag complexity for users
The --auto-memory flag still works for other modes if desired.
The --auto-memory flag was not being passed to run_agent_mode() and
send_auto_memory_reminder() was not being called after agent task
execution.
Changes:
- Pass auto_memory parameter to run_agent_mode()
- Add auto_memory parameter to run_agent_mode() function signature
- Call agent.set_auto_memory(true) when flag is enabled
- Call send_auto_memory_reminder() after execute_task() in agent mode
Adds a new --auto-memory CLI flag that automatically sends a reminder
to the LLM after each turn where tools were called, prompting it to
call the remember tool if it discovered any key code locations.
Changes:
- Add auto_memory field and set_auto_memory() method to Agent
- Add tool_calls_this_turn tracking in execute_tool_in_dir()
- Add send_auto_memory_reminder() that sends reminder after tool use
- Add --auto-memory CLI flag and wire it up in console/machine modes
- Call send_auto_memory_reminder() in single-shot and interactive modes
- Add visible status messages for auto-memory actions
Fixes bug where tool calls were not being tracked when execute_tool_in_dir
was called directly with working_dir=None.
The format_header() function was not calling format_inline_content()
to process inline formatting like **bold**, *italic*, and `code`
within headers. This caused raw markdown markers to appear in output.
Added 4 tests to verify the fix:
- test_bold_inside_header
- test_italic_inside_header
- test_code_inside_header
- test_mixed_formatting_inside_header
The code fence (```) was not being properly detected during streaming,
causing it to be rendered as inline code instead of a code block.
Root cause: When buffering a code fence after seeing ```, the code
was returning early for ALL characters including newlines. This meant
handle_newline() was never called and block_state was never set to
BlockState::CodeBlock.
Fixes:
- Don't return early for newlines when buffering code fence, allow them
to fall through to handle_newline()
- Support indented code fences (up to 3 spaces per CommonMark spec) by
using trim_start() when checking for ``` at line start