Automatically accept the session after g3 completes successfully,
but only if there are commits on the branch.
Changes:
- Add --accept flag to Run command (stripped, not passed to g3)
- Add has_commits_on_branch() helper using git rev-list --count
- Auto-accept triggers merge to main and cleanup when both hold:
  1. g3 exits successfully (exit code 0)
  2. Branch has commits ahead of main
- Show warning if --accept set but no commits exist
Usage: studio run --agent carmack --accept
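
A minimal sketch of the commit check and the accept decision, for illustration only. The function signatures, the `base..branch` range form, and the choice to treat git failures as "no commits" are assumptions, not the actual Studio code.

```rust
use std::path::Path;
use std::process::Command;

/// Interpret the stdout of `git rev-list --count <base>..<branch>`.
/// Returns true only when the count parses and is greater than zero.
fn parse_commit_count(stdout: &str) -> bool {
    stdout.trim().parse::<u64>().map(|n| n > 0).unwrap_or(false)
}

/// Hypothetical signature: true if `branch` has commits that `base` lacks.
fn has_commits_on_branch(repo: &Path, base: &str, branch: &str) -> bool {
    let range = format!("{base}..{branch}");
    let output = Command::new("git")
        .arg("-C")
        .arg(repo)
        .args(["rev-list", "--count", range.as_str()])
        .output();
    match output {
        Ok(out) if out.status.success() => {
            parse_commit_count(&String::from_utf8_lossy(&out.stdout))
        }
        // Assumption: any git failure counts as "no commits", so that
        // auto-accept stays conservative.
        _ => false,
    }
}

/// Auto-accept only when the flag is set, g3 exited with code 0,
/// and the branch is ahead of main.
fn should_auto_accept(accept_flag: bool, exit_code: i32, has_commits: bool) -> bool {
    accept_flag && exit_code == 0 && has_commits
}
```

Keeping the count parsing separate from the process call makes the decision logic testable without a git repository.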
When running in agent mode (e.g., --agent carmack) in a workspace with
detected languages, inject agent+language-specific prompts from
prompts/langs/<agent>.<lang>.md at the end of the system prompt.
Changes:
- Add AGENT_LANGUAGE_PROMPTS static array for compile-time embedding
- Add get_agent_language_prompt() to look up specific agent+lang combos
- Add get_agent_language_prompts_for_workspace_with_langs() that returns
both content and matched languages for display
- Update agent_mode.rs to inject prompts and show which languages loaded
- Display format: '✓ carmack: racket language guidance'
- Add tests for new functionality
Uses the same detect_languages() mechanism as regular language prompts
to avoid code-path aliasing.
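
The lookup described above could look roughly like this. The table layout, the shortened helper name `prompts_for_langs`, the `display_line` helper, and the inline prompt text are all assumptions; the real table presumably embeds the files under prompts/langs/ at compile time.

```rust
/// (agent, language, prompt text). Inline text stands in for the embedded
/// markdown files so that the sketch compiles on its own.
static AGENT_LANGUAGE_PROMPTS: &[(&str, &str, &str)] = &[
    ("carmack", "racket", "Prefer match and cond over nested ifs."),
];

/// Look up the prompt for one specific agent+language combination.
fn get_agent_language_prompt(agent: &str, lang: &str) -> Option<&'static str> {
    AGENT_LANGUAGE_PROMPTS
        .iter()
        .find(|(a, l, _)| *a == agent && *l == lang)
        .map(|(_, _, content)| *content)
}

/// Hypothetical helper: returns the combined prompt text plus the
/// languages that matched, so the caller can display them.
fn prompts_for_langs(agent: &str, detected: &[&str]) -> Option<(String, Vec<String>)> {
    let mut parts = Vec::new();
    let mut matched = Vec::new();
    for lang in detected {
        if let Some(content) = get_agent_language_prompt(agent, lang) {
            parts.push(content);
            matched.push(lang.to_string());
        }
    }
    if parts.is_empty() {
        None
    } else {
        Some((parts.join("\n\n"), matched))
    }
}

/// Display line in the format from the commit; joining several languages
/// with ", " is an assumption.
fn display_line(agent: &str, langs: &[String]) -> String {
    format!("✓ {agent}: {} language guidance", langs.join(", "))
}
```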
Racket-specific guidance for the carmack agent including:
- Idiomatic Racket patterns (match, for/*, cond)
- Module organization with explicit provide lists
- Contracts and type boundaries
- Data modeling with structs
- Error handling best practices
- IO, paths, and portability
- Performance considerations
- Macro guidelines
- Testing with rackunit
Use try_init() instead of init() for tracing subscriber setup to
gracefully handle cases where a global subscriber is already set.
This fixes a panic in the scout agent subprocess when spawned by the
research tool, where a dependency may have already initialized tracing.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Move the initialize_logging() call so it runs immediately after CLI parsing,
before any mode checks. This ensures the --verbose flag works correctly
in planning mode.
Previously, planning mode returned early before initialize_logging()
was called, so verbose output was silently dropped.
- Add language_prompts module that auto-detects programming languages in workspace
- Scan for language files with depth limit (2) to inject relevant toolchain prompts
- Add prompts/langs/ directory for language-specific markdown files
- Include Racket/raco toolchain guidance as first language prompt
- Update combine_project_content() to accept language_content parameter
- Integrate language detection into main CLI flow and agent mode
- Update project memory with new feature documentation
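
A rough sketch of depth-limited language detection using only the standard library. The extension table and the interpretation of the depth limit (workspace root plus two levels of subdirectories) are assumptions; the actual language_prompts module may differ.

```rust
use std::collections::BTreeSet;
use std::fs;
use std::path::Path;

/// Hypothetical extension table; the real mapping is not shown in the commit.
fn language_for_extension(ext: &str) -> Option<&'static str> {
    match ext {
        "rkt" => Some("racket"),
        "rs" => Some("rust"),
        _ => None,
    }
}

/// Record languages for files in `dir`, descending while depth remains.
fn scan(dir: &Path, depth_left: usize, found: &mut BTreeSet<&'static str>) {
    // Unreadable directories are skipped rather than treated as errors.
    let Ok(entries) = fs::read_dir(dir) else { return };
    for entry in entries.flatten() {
        let path = entry.path();
        if path.is_dir() {
            if depth_left > 0 {
                scan(&path, depth_left - 1, found);
            }
        } else if let Some(lang) = path
            .extension()
            .and_then(|e| e.to_str())
            .and_then(language_for_extension)
        {
            found.insert(lang);
        }
    }
}

/// Detect languages in the workspace, looking at most two directory
/// levels below the root. The result is sorted and de-duplicated.
fn detect_languages(workspace: &Path) -> Vec<&'static str> {
    let mut found = BTreeSet::new();
    scan(workspace, 2, &mut found);
    found.into_iter().collect()
}
```

The depth limit keeps the scan cheap on large repositories, at the cost of missing languages that only appear in deeply nested directories.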
LLMs were prefixing shell commands with `cd <workspace> &&` unnecessarily,
wasting tokens and cluttering CLI display. Added clear guidance in the
shell tool description that commands already execute in the working directory.
• Agent prompts are now embedded within the g3 binary
• README.md - Added new "Agent Mode" section documenting:
  • All 7 built-in agents with their focus areas
  • Usage examples (--list-agents, --agent <name>)
  • How to create custom workspace agents
Behavior
1. Workspace agents take priority - If agents/<name>.md exists in the workspace, it's used
2. Embedded fallback - If no workspace agent exists, the embedded version is used
3. Portability - g3 binary now works on any repo without needing the agents/ directory
4. Discoverability - g3 --list-agents shows all available agents and their source
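
The priority order above can be sketched as follows. The `AgentSource` enum, the `resolve_agent` name, and the two-entry embedded table are illustrative assumptions; the real binary embeds all built-in agent prompts.

```rust
use std::fs;
use std::path::Path;

#[derive(Debug, PartialEq)]
enum AgentSource {
    Workspace,
    Embedded,
}

/// Stand-in for the prompts compiled into the binary (text is illustrative).
static EMBEDDED_AGENTS: &[(&str, &str)] = &[
    ("carmack", "embedded carmack prompt"),
    ("fowler", "embedded fowler prompt"),
];

/// Resolve an agent prompt: agents/<name>.md in the workspace wins,
/// otherwise fall back to the embedded copy, otherwise None.
fn resolve_agent(workspace: &Path, name: &str) -> Option<(String, AgentSource)> {
    let override_path = workspace.join("agents").join(format!("{name}.md"));
    if let Ok(text) = fs::read_to_string(&override_path) {
        return Some((text, AgentSource::Workspace));
    }
    EMBEDDED_AGENTS
        .iter()
        .find(|(n, _)| *n == name)
        .map(|(_, text)| (text.to_string(), AgentSource::Embedded))
}
```

Returning the source alongside the text is one way a listing command could report where each agent comes from.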
When users explicitly pass --new-session, they want a fresh session.
Previously g3 would still prompt to resume an existing session.
Now the resume check is skipped entirely when the flag is set.
- Rename take_screenshot -> screenshot, code_coverage -> coverage (shorter names)
- Align | character across all compact tools (pad to 11 chars for str_replace)
- Make code_search a compact tool with summary display
- Show language and search name in code_search output (e.g., rust:"find structs")
- Add format_code_search_summary() to extract match/file counts from JSON response
Remove dead code - constructor variants that had no callers:
- new_with_readme()
- new_autonomous_with_readme()
- new_with_quiet()
These were thin wrappers around new_with_mode_and_readme() that were
never used externally. All 5 remaining constructors have verified callers.
Results:
- lib.rs reduced from 2817 to 2797 lines (-20 lines)
- Eliminated code-path aliasing: 8 constructors → 5 constructors
- All g3-core tests pass
- Full workspace compiles cleanly
Agent: fowler
- Updated memory reminder prompt with per-symbol char ranges
- Added two few-shot examples: Session Continuation (feature) + UTF-8 Safe Slicing (pattern)
- Updated system prompt Memory Format section to match
- Format: file -> nested symbols with [start..end] ranges and descriptions
- Enables direct read_file navigation to specific functions
- webdriver flag now defaults to true (tools always available)
- chrome_headless flag now defaults to true (Chrome is default browser)
- Use --safari flag to override and use Safari instead
- Updated README documentation to reflect new defaults
Extract the get_stats() function (158 lines) from lib.rs to a new stats.rs module.
Changes:
- Create stats.rs with AgentStatsSnapshot struct for capturing agent state
- Replace inline formatting logic with delegation to snapshot.format()
- Add unit tests for stats formatting (empty and populated states)
- Reduce lib.rs from 2961 to 2818 lines (-143 lines)
The new module improves:
- Testability: Stats formatting can now be unit tested in isolation
- Separation of concerns: Formatting logic is decoupled from Agent struct
- Readability: lib.rs is more focused on core agent behavior
All 271 workspace tests pass.
Agent: fowler
- Remove unused total_lines variable in file_ops.rs
- Fix UTF-8 boundary panic in utils.rs when generating diff error preview
The code was slicing at byte index 200 which could land inside a
multi-byte character (e.g., box-drawing chars like ─). Now uses
character-based slicing with chars().take() instead.
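
A small illustration of the character-based approach. The `preview` name and signature are hypothetical; only the use of chars().take() comes from the fix itself.

```rust
/// Return at most `max_chars` characters of `text`.
/// Iterating over chars can never split a multi-byte character,
/// unlike slicing with a byte index such as &text[..200].
fn preview(text: &str, max_chars: usize) -> String {
    text.chars().take(max_chars).collect()
}
```

For a string made of box-drawing characters (3 bytes each in UTF-8), byte index 200 is not a character boundary, so `&text[..200]` would panic, while `preview(text, 200)` returns the first 200 characters.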
Studio can now run g3 without specifying an agent:
# Agent mode (existing)
studio run --agent carmack "fix the bug"
# One-shot mode (new)
studio run "fix the bug"
When no agent is specified, sessions are created under the 'single'
directory: .worktrees/sessions/single/<session-id>/
This makes Studio a complete replacement for Flock mode.
Standardize project name to lowercase 'g3' throughout documentation,
comments, and configuration files. Environment variables (G3_*) are
unchanged as they follow the uppercase convention.
Improve readability of stream_completion_with_tools (~1000-line function):
- Add deduplicate_tool_calls() helper with closure for previous-message check
- Add should_auto_continue() with AutoContinueReason enum for clearer control flow
- Replace inline deduplication loop with helper call (-19 lines)
- Replace complex auto-continue conditional with match on reason enum (-13 lines)
- Add section comments for major phases (State Init, Pre-loop, Main Loop, Auto-Continue, Post-Loop)
- Add comprehensive tests for new helpers
Net reduction: 82 deletions, behavior unchanged (172+ tests pass)
Agent: carmack
When the streaming parser encountered fragments of JSON that looked like
partial tool calls (e.g., {"tool":) embedded in inline text (like code
examples or prose), it would incorrectly enter JSON parsing mode and
poison the parser state, causing control to be returned to the user
mid-task.
This fix:
- Adds sanitize_inline_tool_patterns() to detect tool-call patterns that
are NOT on their own line and replace the opening brace with a Unicode
homoglyph (fullwidth left curly bracket U+FF5B)
- Integrates sanitization into process_chunk() before text is buffered
- Updates system prompts to instruct LLMs to use homoglyphs when showing
example tool call JSON in prose
- Adds comprehensive tests for the sanitization logic
Real tool calls from LLMs always appear on their own line, so those are
left untouched. Only inline patterns (with non-whitespace before them)
are sanitized.
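
A simplified sketch of the sanitization idea. It assumes the exact pattern `{"tool":` and that the input starts at a line boundary; the real function runs on streaming chunks and may handle more variants.

```rust
const TOOL_PATTERN: &str = "{\"tool\":";
const FULLWIDTH_LEFT_BRACE: char = '\u{FF5B}';

/// Replace the opening brace of tool-call patterns that have
/// non-whitespace text before them on the same line.
fn sanitize_inline_tool_patterns(text: &str) -> String {
    let mut out = String::with_capacity(text.len());
    for line in text.split_inclusive('\n') {
        let mut rest = line;
        let mut seen_non_whitespace = false;
        while let Some(pos) = rest.find(TOOL_PATTERN) {
            let before = &rest[..pos];
            seen_non_whitespace |= before.chars().any(|c| !c.is_whitespace());
            out.push_str(before);
            if seen_non_whitespace {
                // Inline occurrence: swap in the homoglyph.
                out.push(FULLWIDTH_LEFT_BRACE);
            } else {
                // Pattern starts the line: leave the real tool call intact.
                out.push('{');
            }
            // The brace counts as content for any later match on this line.
            seen_non_whitespace = true;
            rest = &rest[pos + 1..];
        }
        out.push_str(rest);
    }
    out
}
```

Because only the brace changes, the sanitized text still reads like JSON to a human while no longer matching the parser's tool-call trigger.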
When the LLM reads the same file multiple times in sequence (scrolling
through a large file), instead of showing each as a separate line:
● read_file | path [0..2000] | 50 lines | 100 ◉ 5ms
● read_file | path [2000..4000] | 50 lines | 100 ◉ 5ms
● read_file | path [4000..6000] | 50 lines | 100 ◉ 5ms
Now shows a cleaner continuation format:
● read_file | path [0..2000] | 50 lines | 100 ◉ 5ms
└─ reading further [2000..4000] | 50 lines | 100 ◉ 5ms
└─ reading further [4000..6000] | 50 lines | 100 ◉ 5ms
This makes it visually clear that the agent is scrolling through
a single file rather than reading multiple different files.
Implementation:
- Added last_read_file_path field to ConsoleUiWriter
- Detect when consecutive read_file calls target the same file
- Print continuation format for subsequent reads
- Reset tracking when:
  - A different tool is executed (shell, write_file, etc.)
  - A different file is read
  - Text is output between tool calls
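
The tracking logic could be sketched like this. Only the last_read_file_path field name comes from the change; the `ReadTracker` type, its methods, and the exact output strings are illustrative stand-ins for ConsoleUiWriter.

```rust
/// Minimal stand-in for the tracking state held by the UI writer.
#[derive(Default)]
struct ReadTracker {
    last_read_file_path: Option<String>,
}

impl ReadTracker {
    /// Format one read_file call, using the continuation form when the
    /// previous call read the same file.
    fn on_read_file(&mut self, path: &str, range: &str, summary: &str) -> String {
        let is_continuation = self.last_read_file_path.as_deref() == Some(path);
        // Reading a different file replaces the tracked path, which
        // implicitly resets the continuation.
        self.last_read_file_path = Some(path.to_string());
        if is_continuation {
            format!("└─ reading further {range} | {summary}")
        } else {
            format!("● read_file | {path} {range} | {summary}")
        }
    }

    /// Call when a different tool runs or text is printed between tool calls.
    fn reset(&mut self) {
        self.last_read_file_path = None;
    }
}
```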
Agent: hopper
Adds 56 new integration tests covering the observable end-of-turn
behaviors in the streaming module:
- Timing footer formatting (5 tests): verifies user-facing timing display
with various durations, token counts, and context percentages
- Tool call duplicate detection (6 tests): ensures identical sequential
tool calls are detected while different tools/args are not
- Empty response detection (9 tests): validates detection of empty,
whitespace-only, and timing-only responses that trigger auto-continue
- Connection error classification (5 tests): verifies EOF, connection,
chunk, and body errors are correctly identified for graceful recovery
- Tool output summary formatting (17 tests): covers read_file, write_file,
str_replace, remember, screenshot, coverage, and rehydrate summaries
- Duration formatting (4 tests): milliseconds, seconds, minutes, zero
- Text truncation (4 tests): short/long strings, multiline, flag behavior
- LLM token cleaning (3 tests): removal of stop tokens like <|im_end|>
- Edge cases (4 tests): empty inputs, unicode handling, large numbers
All tests are blackbox/characterization style - they test observable
outputs through stable public interfaces without encoding internal
implementation details. Tests remain stable under refactoring that
preserves behavior.