Changes:
- Fix JSON path for session logs: now reads from context_window.conversation_history
(with fallback to messages for backwards compatibility)
- Remove 500-character truncation to show full summary
- Add termimad dependency for terminal markdown rendering
- Display summary with proper markdown formatting (headers, bold, code, lists)
The extract_session_summary() function was looking for messages at the wrong
JSON path. Session logs store conversation history at context_window.conversation_history,
not at the top-level messages key.
Studio enables running multiple g3 agents concurrently without conflicts
by using git worktrees for isolation.
Features:
- studio run --agent <name> [args...]: Create worktree, spawn g3, tail output
- studio list: Show all active sessions
- studio status <id>: Show session details and summary
- studio accept <id>: Merge session branch to main and cleanup
- studio discard <id>: Delete session without merging
Each session gets:
- Isolated worktree at .worktrees/sessions/<agent>/<session-id>
- Dedicated branch: sessions/<agent>/<session-id>
- Short UUID (8 chars) for easy reference
- Automatic --workspace and --agent flags passed to g3
Added first_user_message field to Fragment struct that captures the
full first user message (task) from the dehydrated conversation.
This is now displayed at the top of the stub with a 📋 Task: prefix.
Removed the Topics section from the stub since the full task provides
better context for forensics and debugging.
Agent: g3
The UTF-8 string slicing pattern is better suited as a remembered
pattern in project memory rather than a static AGENTS.md section.
This keeps AGENTS.md focused on codebase-specific invariants while
the pattern remains accessible for reference.
Agent: g3
ACD (Aggressive Context Dehydration) fixes:
- Fixed dehydrate_context() to extract turn summary from context window
instead of using the passed-in final_response (which contained only
the timing footer, not the actual LLM response)
- Removed final_response parameter from dehydrate_context() since it
now self-extracts the last assistant message as the summary
- This ensures the actual turn summary is preserved after dehydration,
not just the timing footer
New /dump command:
- Added /dump command to dump entire context window to tmp/ for debugging
- Shows message index, role, kind, content length, and full content
- Available in both console and machine modes
UTF-8 safety:
- Fixed truncate_to_word_boundary() to use character indices instead of
byte indices, preventing panics on multi-byte UTF-8 characters
- Added UTF-8 string slicing guidance to AGENTS.md
Agent: g3
When read_file is called with an end position beyond the file length,
instead of returning an error that forces a retry, now clamps to the
actual file length and returns the content with an informative message.
This eliminates wasteful retry cycles where the LLM had to make a
second request with the corrected end position.
Change read_file output format so the "🔍 N lines read" appears as
the last line after the file content, not before it. This keeps the
output cleaner with just one metadata line at the end.
1. str_replace: Show insertion/deletion counts with colors
"✅ +N insertions | -M deletions" (green/red)
2. write_file: Compact format with human-readable sizes
"✅ wrote N lines | Xk chars"
3. read_file: Cleaner format
"🔍 N lines read" instead of "📄 File content (N lines)"
4. webdriver_quit: Show correct driver name (safaridriver vs chromedriver)
5. read_file: When start position exceeds file length, read last 100 chars
with explanation instead of failing
6. shell: Remove redundant "Command failed:" prefix from error messages
When a code block ended without a trailing newline after the closing
\`\`\`, two bugs occurred in flush_incomplete():
1. The closing \`\`\` was included as part of the code block content
(displayed with syntax highlighting)
2. The same \`\`\` was then emitted again as literal text because
current_line was not cleared after being pushed to block_buffer
The fix:
- Check if current_line is the closing fence before adding to block_buffer
- Always clear current_line after processing in the CodeBlock case
Added two tests:
- test_code_fence_after_blank_line: code fence with trailing newline
- test_code_fence_no_trailing_newline: code fence without trailing newline
The agent mode header now shows:
- Agent name in uppercase with box art
- Working directory (truncated if too long)
- Status indicators for README, AGENTS.md, and Memory loading
- Task preview if provided
Also exports truncate_for_display and adds truncate_path_for_display
helper functions in project_files module.
The JSON tool call filter was outputting newlines immediately as they
were encountered. When the LLM output contained multiple newlines before
a tool call, each newline was output before the tool call JSON was
detected and suppressed, leaving orphaned blank lines in the output.
Changes:
- Add pending_newlines field to FilterState to buffer newlines at line start
- First newline after content is output immediately, subsequent ones buffered
- When tool call confirmed, pending_newlines cleared (suppressing extra blanks)
- When not a tool call, pending_newlines output with the buffer
- Add flush_json_tool_filter() to flush pending content at end of streaming
- Update tests to reflect new behavior
- Add tests for newline suppression behavior
Extract three cohesive modules from the monolithic lib.rs (3188 -> 2785 lines):
- metrics.rs (147 lines): Turn metrics tracking and histogram generation
- TurnMetrics struct
- format_elapsed_time() for human-readable durations
- generate_turn_histogram() for performance visualization
- Added unit tests for core functions
- project_files.rs (181 lines): Project file reading utilities
- read_agents_config() for AGENTS.md loading
- read_project_readme() for README detection
- read_project_memory() for .g3/memory.md
- extract_readme_heading() for display
- Added unit tests
- coach_feedback.rs (129 lines): Coach feedback extraction from session logs
- extract_from_logs() main entry point
- Helper functions for log parsing and text extraction
All modules have clear single responsibilities, improved documentation,
and maintain identical behavior to the original inline functions.
Agent: carmack
Collapsed nested if statements that check related conditions into
single conditions using &&. This improves readability by making
the logical relationship between conditions explicit.
Files changed:
- feedback_extraction.rs: 3 instances of tool_use/final_output checks
- tools/todo.rs: 1 instance of todo completion check
Agent: fowler
The function had two branches that both returned line.to_string():
- when !should_truncate
- when line.chars().count() <= max_width
Merged into a single condition. Also updated format! to use
inline variable syntax per clippy suggestion.
Agent: fowler
Agent mode was only loading README.md but not AGENTS.md or project
memory (.g3/memory.md). This meant agents were missing important
context that normal mode had access to.
Now agent mode uses the same read_agents_config(), read_project_readme(),
and read_project_memory() functions as normal mode, combining all three
into the agent context.
Simplify auto-memory by always enabling it in agent mode instead of
requiring the --auto-memory flag. This makes sense because:
- Agent mode is non-interactive, so blocking is acceptable
- Agents benefit from automatically saving discoveries to memory
- Reduces flag complexity for users
The --auto-memory flag still works for other modes if desired.
The --auto-memory flag was not being passed to run_agent_mode() and
send_auto_memory_reminder() was not being called after agent task
execution.
Changes:
- Pass auto_memory parameter to run_agent_mode()
- Add auto_memory parameter to run_agent_mode() function signature
- Call agent.set_auto_memory(true) when flag is enabled
- Call send_auto_memory_reminder() after execute_task() in agent mode
Adds a new --auto-memory CLI flag that automatically sends a reminder
to the LLM after each turn where tools were called, prompting it to
call the remember tool if it discovered any key code locations.
Changes:
- Add auto_memory field and set_auto_memory() method to Agent
- Add tool_calls_this_turn tracking in execute_tool_in_dir()
- Add send_auto_memory_reminder() that sends reminder after tool use
- Add --auto-memory CLI flag and wire it up in console/machine modes
- Call send_auto_memory_reminder() in single-shot and interactive modes
- Add visible status messages for auto-memory actions
Fixes bug where tool calls were not being tracked when execute_tool_in_dir
was called directly with working_dir=None.
The format_header() function was not calling format_inline_content()
to process inline formatting like **bold**, *italic*, and `code`
within headers. This caused raw markdown markers to appear in output.
Added 4 tests to verify the fix:
- test_bold_inside_header
- test_italic_inside_header
- test_code_inside_header
- test_mixed_formatting_inside_header
The code fence (```) was not being properly detected during streaming,
causing it to be rendered as inline code instead of a code block.
Root cause: When buffering a code fence after seeing ```, the code
was returning early for ALL characters including newlines. This meant
handle_newline() was never called and block_state was never set to
BlockState::CodeBlock.
Fixes:
- Don't return early for newlines when buffering code fence, allow them
to fall through to handle_newline()
- Support indented code fences (up to 3 spaces per CommonMark spec) by
using trim_start() when checking for ``` at line start
- Remove IMPORTANT FOR CODING section (~1,500 chars of coding guidelines)
- Remove <use_parallel_tool_calls> block (~500 chars)
- Remove unused const_format dependency from g3-core
- Simplify get_system_prompt_for_native() to just return base prompt
- Response Guidelines now cleanly ends the static prompt
Prompt reduced from ~8,500 to ~6,500 characters.
- Add description field to SessionContinuation struct
- Extract first user message (truncated to ~60 chars at word boundary)
- Display as quoted text instead of session ID hash
- Fall back to session ID if no description available
Example: [2 hours ago] 'when I call /resume it only shows me 2 sessions...'
- Change run_autonomous to return Agent instead of () so session
continuation is properly saved in accumulative mode
- Update format_session_time to show relative times ("2 hours ago",
"yesterday") for recent sessions and dates for older ones
- Handle Ctrl+C cancellation gracefully with informative message
Fix session detection:
- Add save_session_continuation() calls at all session exit points
- Sessions now properly create .g3/session symlink for resume detection
- Fixes issue where g3 wasn't offering to resume previous sessions
Add /resume command:
- New list_sessions_for_directory() to scan available sessions
- New switch_to_session() method to safely switch between sessions
- Shows numbered list with timestamps, context %, and TODO status
- Saves current session before switching (can be resumed later)
- Restores full context if <80% used, otherwise uses summary
- Machine mode supports /resume and /resume <number>
Documentation:
- Add /clear and /resume to CONTROL_COMMANDS.md
- Update /help output with new commands
When the scout agent fails (e.g., context window exhaustion), now:
- Captures both stdout and stderr from the scout process
- Detects context window exhaustion errors with specific patterns
- Provides detailed, actionable error messages to the user
- Shows suggestions for how to work around the issue
- Includes technical details (exit code, error output) for debugging
Handles two failure modes:
1. Scout agent exits with non-zero status
2. Scout agent exits successfully but doesn't produce valid report markers
Both cases now surface clear error messages instead of cryptic failures.
Runs automatically when --chrome-headless flag is used, checking:
- ChromeDriver installation and PATH
- Chrome/Chromium installation
- Chrome and ChromeDriver version compatibility
- config.toml chrome_binary setting
- Chrome for Testing installation
- ChromeDriver executable permissions (macOS quarantine)
Displays a detailed report with:
- Summary of detected versions and paths
- Pass/warning/error status for each check
- Specific fix suggestions for any issues found
Users can then ask g3 to help fix any detected issues.
The timing footer (e.g., ⏱️ 19.4s | 💭 4.7s) was being saved to the
conversation history as a separate assistant message. This happened
because stream_completion_with_tools returns the timing footer in
TaskResult.response for display, but the caller was also saving it
to context.
Fix: Strip the timing footer (identified by \n\n⏱️) before saving
to context window. The timing footer remains display-only.
Also includes:
- Research tool blank line fix: only add visual separator for research
tool output, not all tools
- Research tool webdriver propagation: pass parent's webdriver browser
choice (Safari vs Chrome headless) to scout subprocess
final_output removal:
- Remove final_output from tool definitions and dispatch
- Update system prompts to request summaries as regular text
- Remove final_output_called field from StreamingState
- Update auto_continue tests to remove final_output_called parameter
- Remove final_output test from tool_execution_test.rs
- Update planner and flock prompts to not reference final_output
- Keep backwards-compat code in feedback_extraction.rs and task_result.rs
Scout report handback:
- Change from file-based to delimiter-based report extraction
- Scout outputs report between ---SCOUT_REPORT_START/END--- markers
- Research tool extracts content between markers, strips ANSI codes
- Add comprehensive tests for extraction and ANSI stripping
657 tests pass.
The research tool now streams the underlying scout agent's output
to the CLI in real-time for visual indication of progress. This
output is displayed but not added to the conversation context.