714 Commits

Author SHA1 Message Date
Dhanji R. Prasanna
d600b600b8 Always keep chromedriver running for faster subsequent startups
Removed the persistent_chrome config flag - chromedriver is now always
kept running after webdriver_quit. This eliminates startup latency for
subsequent WebDriver sessions.

Safaridriver is still killed on quit since it doesn't benefit from
persistence in the same way.

Updated quit message to correctly indicate chromedriver remains running.
2026-01-17 09:48:10 +05:30
Dhanji R. Prasanna
8ed360024f Add persistent ChromeDriver support for faster WebDriver startup
When webdriver_start is called, now checks if chromedriver is already
running on the configured port and reuses it instead of spawning a new
process. This significantly reduces startup time for subsequent sessions.

New config option:
  [webdriver]
  persistent_chrome = true  # Keep chromedriver running between sessions

When enabled, webdriver_quit closes the browser session but leaves
chromedriver running for reuse by the next session.
2026-01-17 09:26:25 +05:30
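Note on 8ed360024f: the reuse check described above can be sketched as a plain TCP probe of the configured port. This is a minimal illustration, not the code from the commit; `is_listening`, `choose_driver_action`, and `DriverAction` are hypothetical names, and the 200 ms timeout is an arbitrary assumption.

```rust
use std::net::{SocketAddr, TcpStream};
use std::time::Duration;

/// Returns true if something accepts TCP connections on 127.0.0.1:`port`.
/// A successful connect only shows that the port is occupied; it does not
/// prove that the listener is chromedriver.
fn is_listening(port: u16, timeout: Duration) -> bool {
    let addr = SocketAddr::from(([127, 0, 0, 1], port));
    TcpStream::connect_timeout(&addr, timeout).is_ok()
}

/// What the caller should do next (hypothetical enum, for illustration only).
#[derive(Debug, PartialEq)]
enum DriverAction {
    Reuse,
    Spawn,
}

/// Reuse an existing listener on `port` if there is one, otherwise spawn.
fn choose_driver_action(port: u16) -> DriverAction {
    if is_listening(port, Duration::from_millis(200)) {
        DriverAction::Reuse
    } else {
        DriverAction::Spawn
    }
}
```

A real implementation would probably also confirm that the listener speaks the WebDriver protocol before reusing it.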
Dhanji R. Prasanna
eb6268641f Fix --safari flag being blocked by Chrome diagnostics
When --safari was passed, Chrome diagnostics were still running because
--chrome-headless defaults to true. This caused the CLI to hang while
running diagnostics for a browser that wouldn't be used.

Now skip Chrome diagnostics when --safari is explicitly set.
2026-01-17 09:20:21 +05:30
Dhanji R. Prasanna
e3967a9948 refactor: remove animation from context thinning display
Simplify print_context_thinning to just print the message directly.
The message already contains proper ANSI formatting from context_window.rs.

Removes the flash animation and 'Context optimized successfully' footer.
2026-01-17 05:00:12 +05:30
Dhanji R. Prasanna
b8193bf9f9 style: use orange color for [no changes] status in thinning message
2026-01-17 04:53:42 +05:30
Dhanji R. Prasanna
74b1b9bea3 refactor: simplify context thinning status message
Change format from verbose emoji-based message to cleaner status line:
  Before:  🥒 Context thinned at 70%: 7 tool results, ~33839 chars saved 
  After:  g3: thinning context ... 70% -> 40% ... [done]

The new format shows before/after percentages and uses bold green for
'g3:' and '[done]' to match other status messages.

Also removes unused emoji() and label() methods from ThinScope.
2026-01-17 04:47:16 +05:30
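Note on 74b1b9bea3: a minimal sketch of the new status line, assuming standard ANSI SGR codes for bold green. The function name `thinning_status` and the exact escape sequences are assumptions; the commit says the real formatting lives in context_window.rs.

```rust
// ANSI SGR sequences: 1 = bold, 32 = green, 0 = reset.
const BOLD_GREEN: &str = "\x1b[1;32m";
const RESET: &str = "\x1b[0m";

/// Builds "g3: thinning context ... 70% -> 40% ... [done]" with
/// "g3:" and "[done]" wrapped in bold green.
fn thinning_status(before_pct: u32, after_pct: u32) -> String {
    format!(
        "{BOLD_GREEN}g3:{RESET} thinning context ... {before_pct}% -> {after_pct}% ... {BOLD_GREEN}[done]{RESET}"
    )
}
```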
Dhanji R. Prasanna
c7984fd4c2 fix: account for base64 encoding overhead in image size limit
The Anthropic API has a 5MB limit on base64-encoded images, not raw file
size. Base64 encoding increases size by ~33% (4/3 ratio), so a 4MB raw
image becomes ~5.3MB encoded, exceeding the limit.

Changed MAX_IMAGE_SIZE from 5MB to ~3.75MB (5MB * 3/4) to trigger
resizing before the base64-encoded result exceeds the API limit.

Also updated target resize size to 3.6MB to leave margin.
2026-01-16 21:29:05 +05:30
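Note on c7984fd4c2: the arithmetic behind the new threshold, as a small sketch. It assumes that "5MB" means 5 * 1024 * 1024 bytes and that the encoder emits standard padded base64; the function names are illustrative and are not taken from the repository.

```rust
/// Limit on a base64-encoded image as stated in the commit message,
/// read here as 5 MiB (an assumption).
const API_LIMIT_BYTES: u64 = 5 * 1024 * 1024;

/// Length of padded base64 output for `raw_len` input bytes:
/// every 3 input bytes (rounded up) become 4 output characters.
fn base64_encoded_len(raw_len: u64) -> u64 {
    4 * ((raw_len + 2) / 3)
}

/// Largest raw size whose padded base64 encoding still fits in `limit` bytes.
fn max_raw_len(limit: u64) -> u64 {
    (limit / 4) * 3
}
```

Under these assumptions `max_raw_len(API_LIMIT_BYTES)` is 3,932,160 bytes (3.75 MiB), and a 4 MiB raw image encodes to 5,592,408 bytes, which matches the "~5.3MB" figure in the message.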
Dhanji R. Prasanna
1003386f7f Auto-resize large images (>=5MB) in read_image tool
Images >= 5MB are now automatically resized to < 4.9MB using ImageMagick
before being sent to the LLM. This prevents API errors from oversized images.

- Uses iterative quality/scale reduction to find optimal size
- Converts to JPEG for better compression
- Shows original and resized size in terminal output (e.g., '6.2 MB → 4.1 MB (resized)')
- Falls back to original if ImageMagick fails or isn't available
2026-01-16 21:09:38 +05:30
Dhanji R. Prasanna
fc702168ab Add streaming completion integration test with mock LLM provider
Adds tests to verify that:
- All streaming chunks are processed before control returns to caller
- Both tool calls in a multi-tool-call stream are executed
- The finished signal properly terminates stream processing

Also adds Agent::new_for_test() to allow injecting mock providers.
2026-01-16 20:52:32 +05:30
Dhanji R. Prasanna
0e33465342 Add print_g3_progress/print_g3_status methods for consistent status messages
2026-01-16 20:28:24 +05:30
Dhanji R. Prasanna
95f89d3f8e Simplify compaction status messages
2026-01-16 20:26:35 +05:30
Dhanji R. Prasanna
415226ca84 Add newline before context progress display
2026-01-16 20:24:29 +05:30
Dhanji R. Prasanna
cebec23075 Fix duplicate response printing in interactive mode
The response was being printed twice: once during streaming and again
after task completion. Removed the redundant print_smart() call since
streaming already displays the response in real-time.
2026-01-16 14:48:50 +05:30
Dhanji R. Prasanna
4c6878a63d Set process title to agent name in agent mode
When running g3 --agent butler, the process title is now "g3 [butler]"
which shows up in ps, Activity Monitor, top, etc.

Uses the proctitle crate for cross-platform support.
2026-01-16 14:37:58 +05:30
Dhanji R. Prasanna
1f6a5671b2 Use agent name as prompt in --agent --chat mode (e.g., "butler>")
Changed run_interactive() parameter from bool to Option<&str> agent_name.
When agent_name is Some, use it as the prompt instead of "g3>".
2026-01-16 13:58:45 +05:30
Dhanji R. Prasanna
2e6bef4b24 Auto-memory: call once on exit for --agent --chat, per-turn for single-shot
When running g3 --agent <name> --chat:
- Skip per-turn memory checkpoint calls (too onerous)
- Call memory checkpoint once when exiting (Ctrl-D)

When running g3 --agent <name> (single-shot):
- Preserve existing behavior: call memory checkpoint after each turn

This keeps the auto-memory feature useful without being intrusive
in interactive agent sessions.
2026-01-16 13:35:40 +05:30
Dhanji R. Prasanna
6068249827 Simplify --agent --chat startup: minimal output, no session resume
When running g3 --agent <name> --chat, the output is now minimal:
- Workspace path (-> ~/path)
- Status line (README/AGENTS.md/Memory)
- Context progress bar
- Prompt (g3>)

Skipped in this mode:
- Session resume prompts
- "agent mode | name (source)" header
- "g3 programming agent" welcome
- Provider info display
- Language guidance messages

Added from_agent_mode parameter to run_interactive() to control
whether verbose welcome and session resume are shown.
2026-01-16 13:31:10 +05:30
Dhanji R. Prasanna
7c59d1993c Fix auto-memory JSON leak: tool call printed raw to UI
The JSON filter only suppresses tool calls at line boundaries. When
"Memory checkpoint: " was printed without a trailing newline, the LLM
response `{"tool": "remember", ...}` appeared on the same line and
leaked through to the UI.

Fix:
- Add trailing newline to "Memory checkpoint:" message
- Reset JSON filter state before streaming the response

Added test: test_tool_call_not_at_line_start_passes_through
Documents the filter behavior and references the fix location.
2026-01-16 13:10:18 +05:30
Dhanji R. Prasanna
94544c8f6a Add interactive mode support for agents with --chat flag
- Remove chat from conflicts_with_all for --agent flag
- Add chat parameter to run_agent_mode()
- Run interactive loop instead of single task when --chat is passed

Usage: g3 --agent <name> --chat
2026-01-16 12:01:56 +05:30
Dhanji R. Prasanna
6bd9c51e8e feat: shell output pagination and optimized read_file with seek
- Shell outputs > 8KB are truncated to first 500 chars
- Full output saved to .g3/sessions/<session_id>/tools/shell_stdout_<id>.txt
- LLM can use read_file with start/end to paginate through large outputs
- read_file now uses seek() for O(1) random access instead of reading entire file
- UTF-8 safe: reads extra bytes at boundaries to find valid char positions
- Falls back to lossy conversion for binary files (no panics)

Files changed:
- paths.rs: get_tools_output_dir(), generate_short_id()
- shell.rs: truncate_large_output() integration
- file_ops.rs: seek-based read_file_range() helper
- New test: read_file_utf8_test.rs
2026-01-16 09:16:16 +05:30
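Note on 6bd9c51e8e: a simplified sketch of a seek-based range read that stays UTF-8 safe. Unlike the commit, which says it reads extra bytes at the boundaries, this version trims partial characters at both ends instead. `read_range` is a hypothetical name and does not reproduce the `read_file_range()` helper in file_ops.rs.

```rust
use std::io::{self, Read, Seek, SeekFrom};

/// Reads the bytes in `[start, end)` from `src` without reading anything
/// before `start`, then returns them as text.
fn read_range<R: Read + Seek>(src: &mut R, start: u64, end: u64) -> io::Result<String> {
    let len = end.saturating_sub(start);
    src.seek(SeekFrom::Start(start))?;
    let mut buf = Vec::new();
    src.by_ref().take(len).read_to_end(&mut buf)?;

    // Skip up to 3 leading continuation bytes (0b10xxxxxx) that belong to a
    // character which started before `start`.
    let head = buf.iter().take(3).take_while(|&&b| b & 0xC0 == 0x80).count();
    let bytes = &buf[head..];

    match std::str::from_utf8(bytes) {
        Ok(s) => Ok(s.to_string()),
        // error_len() == None means the input ended in the middle of a
        // character: keep the valid prefix and drop the partial character.
        Err(e) if e.error_len().is_none() => {
            Ok(String::from_utf8_lossy(&bytes[..e.valid_up_to()]).into_owned())
        }
        // Anything else is treated as binary data: convert lossily, never panic.
        Err(_) => Ok(String::from_utf8_lossy(bytes).into_owned()),
    }
}
```

The function works on any `Read + Seek` source, so it can be exercised with `std::io::Cursor` as well as `std::fs::File`.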
Dhanji R. Prasanna
ce5183b296 style: compress studio auto-accept output
- Replace verbose auto-accept messages with single line
- Format: 'studio: session <id> ... [merged]'
- Refactor cmd_accept to use accept_session() with configurable prefix
- Remove 'completed successfully' and 'Auto-accepting' messages
2026-01-16 07:30:27 +05:30
Dhanji R. Prasanna
e2385faba1 style: compress studio session startup output
- Replace verbose multi-line output with single line
- Format: 'studio: new session <id>'
- 'studio:' in bold green, session id in inline-code orange (RGB 216,177,114)
- Remove separator lines and 'Starting g3 agent' message
2026-01-16 07:24:22 +05:30
Dhanji R. Prasanna
ef5aa75e6b style: simplify studio accept/discard output messages
- Change verbose emoji messages to minimal format
- Print '> session <id> ...' first, then status after operation completes
- 'merged' shown in bold green
- 'discarded' shown in bold yellow
2026-01-16 07:17:36 +05:30
Dhanji R. Prasanna
01cb4f6691 fix: use consistent max_tokens defaults across providers
- Fix aliasing issue where resolve_max_tokens() used fallback_default_max_tokens
  (8192) instead of provider-specific defaults
- Update fallback_default_max_tokens from 8192 to 32000
- Set provider-specific max_tokens defaults:
  - Anthropic: 32000
  - OpenAI: 32000 (was 16000)
  - Databricks: 32000 (was 50000, now matches Anthropic as passthru)
  - Embedded: 2048
- Context window lengths unchanged:
  - OpenAI: 400,000
  - Anthropic: 200,000
  - Databricks (Claude): 200,000

This fixes the 'LLM response was cut off due to max_tokens limit' error
in agent mode that occurred because 8192 was being used instead of 32000.
2026-01-16 07:05:57 +05:30
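Note on 01cb4f6691: one way to express the intended resolution order (explicit setting, then provider default, then global fallback). The numeric values come from the commit message; the `Provider` enum and both function signatures are assumptions and may not match the real `resolve_max_tokens()`.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum Provider {
    Anthropic,
    OpenAi,
    Databricks,
    Embedded,
}

/// Used only when neither an explicit setting nor a provider is known.
const FALLBACK_DEFAULT_MAX_TOKENS: u32 = 32_000;

/// Provider-specific defaults as listed in the commit message.
fn provider_default_max_tokens(provider: Provider) -> u32 {
    match provider {
        Provider::Anthropic | Provider::OpenAi | Provider::Databricks => 32_000,
        Provider::Embedded => 2_048,
    }
}

/// An explicit setting wins, then the provider default, then the fallback.
fn resolve_max_tokens(configured: Option<u32>, provider: Option<Provider>) -> u32 {
    configured
        .or_else(|| provider.map(provider_default_max_tokens))
        .unwrap_or(FALLBACK_DEFAULT_MAX_TOKENS)
}
```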
Dhanji R. Prasanna
65e0217c68 Add unit tests for studio session management
New tests:
- test_new_session_has_short_id
- test_new_interactive_session
- test_branch_name_format
- test_session_save_and_load
- test_session_mark_complete
- test_session_mark_paused
- test_list_empty_sessions
- test_backwards_compatibility_no_session_type

Added tempfile as dev dependency for temp directory tests.
2026-01-16 06:52:23 +05:30
Dhanji R. Prasanna
78f9207d27 Add interactive mode to studio
New commands:
- studio cli (alias: c) - Start a new interactive g3 session in an isolated worktree
- studio resume <id> (alias: r) - Resume a paused interactive session
- Bare 'studio' now defaults to 'studio cli'

Session changes:
- Added SessionStatus::Paused for sessions that can be resumed
- Added SessionType enum (OneShot, Interactive) for future use
- Interactive sessions use inherited stdio for direct TTY access
- Sessions are marked as Paused when user exits g3

Workflow:
1. studio        # creates worktree, runs g3 interactively
2. (work in g3, exit when done)
3. studio resume <id>  # continue working
4. studio accept <id>  # merge to main when finished
2026-01-16 06:48:24 +05:30
Dhanji R. Prasanna
637884f84b Fix duplicate todo_read display in agent mode
The print_todo_compact() function was missing the call to clear the
streaming hint line before printing the final tool output. This caused
the tool name to appear twice when the hint line wasn't cleared:

  ● todo_read     ● todo_read   | empty

Added the missing handle_hint(ToolParsingHint::Complete) call to match
the behavior of print_tool_compact().
2026-01-16 06:38:11 +05:30
Dhanji R. Prasanna
25d35529e7 Fix --accept flag being passed through to g3 in studio run
When --accept was passed after positional args (e.g., 'studio run --agent
carmack task --accept'), clap's trailing_var_arg captured it as part of
g3_args instead of parsing it as the studio flag. This caused g3 to error
with 'unexpected argument --accept'.

- Extract filter_accept_flag() helper to detect and remove --accept from
  trailing args
- Set auto_accept=true if --accept found in either position
- Add 5 unit tests for the filtering logic
2026-01-15 21:05:13 +05:30
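Note on 25d35529e7: a minimal sketch of stripping `--accept` from the trailing arguments. The helper name comes from the commit message, but the signature and return type shown here are assumptions.

```rust
/// Removes every `--accept` from `args` and reports whether one was present.
/// The remaining arguments keep their original order.
fn filter_accept_flag(args: Vec<String>) -> (Vec<String>, bool) {
    let mut found = false;
    let kept = args
        .into_iter()
        .filter(|arg| {
            if arg == "--accept" {
                found = true;
                false
            } else {
                true
            }
        })
        .collect();
    (kept, found)
}
```

A caller could then set auto-accept when either the parsed studio flag or the returned `found` is true, which matches the "either position" behavior described above.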
Dhanji R. Prasanna
a84fead03b refactor: improve readability of streaming parser and JSON filter
Agent: carmack

Changes:
- streaming_parser.rs: Unified find_first/last_tool_call_start into single
  find_tool_call_start with SearchDirection enum, reducing duplication.
  Simplified is_json_invalidated from 45 to 20 lines with clearer logic.
  Fixed redundant !escape_next check in find_complete_json_object_end.

- filter_json.rs: Simplified check_tool_pattern from 40 to 24 lines.
  Replaced repetitive prefix checks with loop over ["t", "to", "too", "tool"].
  Reduced trailing return statements with direct expression returns.

- ui_writer_impl.rs: Added ansi module for duration color constants.
  Simplified duration_color function by removing redundant comments.

- language_prompts.rs: Fixed test assertions to match actual prompt content
  ("obvious, readable Racket" instead of "RACKET-SPECIFIC GUIDANCE").

All 174+ tests pass. No behavior changes.
2026-01-15 13:49:29 +05:30
Dhanji R. Prasanna
0ae1a13cdb feat: real-time tool call streaming indicator with blinking UI
- Add ToolParsingHint enum (Detected/Active/Complete) for UI feedback
- New UiWriter methods: print_tool_streaming_hint(), print_tool_streaming_active()
- Refactor ConsoleUiWriter state to use atomics in ParsingHintState
- Add tool_call_streaming field to CompletionChunk for provider hints
- Anthropic provider sends streaming hints when tool name detected
- New streaming helpers: make_tool_streaming_hint(), make_tool_streaming_active()

Parser improvements:
- Add is_json_invalidated() to detect false positive tool patterns
- Fix tool result poisoning when file contents contain partial JSON
- Unescaped newlines in strings or prose after JSON invalidates detection

User sees ' ● tool_name |' immediately when tool call starts streaming,
with blinking indicator while args are received.
2026-01-15 13:49:29 +05:30
Dhanji R. Prasanna
d68f059acf fix: detect invalidated JSON tool calls to prevent parser poisoning
When partial JSON tool call patterns appear in LLM output (e.g., from
quoting file content), the parser would incorrectly report them as
"incomplete tool calls", triggering auto-continue loops.

Fix: Added is_json_invalidated() to detect when partial JSON has been
invalidated by subsequent content that cannot be valid JSON:
- Unescaped newline inside a string (invalid JSON)
- Newline followed by prose text outside a string

The check is only applied to incomplete JSON - complete tool calls
with trailing text are still correctly detected.

Added 6 new tests covering:
- Tool results with partial JSON patterns
- LLM quoting file content inline vs on own line
- Comment prefixes (// # -- etc) with partial patterns
- Real incomplete tool calls (should still be detected)
2026-01-15 13:49:29 +05:30
Dhanji R. Prasanna
999ac6fe66 fix: prevent parser poisoning from inline tool-call JSON patterns
The streaming parser was incorrectly detecting tool call patterns that
appeared inline in prose (e.g., when explaining the format), causing
g3 to return control mid-task.

Fix: Modified find_first_tool_call_start() and find_last_tool_call_start()
to only recognize patterns that appear on their own line (at start of
buffer or after newline with only whitespace before the pattern).

Changes:
- Added is_on_own_line() helper to check line-boundary conditions
- Updated detection methods to skip inline patterns
- Removed sanitize_inline_tool_patterns() and LBRACE_HOMOGLYPH (no longer needed)
- Rewrote tests for new behavior
- Added streaming_repro tests that use process_chunk() to verify the exact bug scenario

28 tests covering: streaming repro, line boundaries, Unicode, code contexts, edge cases
2026-01-15 13:49:29 +05:30
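Note on 999ac6fe66: the line-boundary rule can be sketched as follows. The helper names mirror the commit message, but the bodies are illustrative, and the literal prefix `{"tool"` is an assumed pattern; the real parser may accept other spellings.

```rust
/// Assumed tool-call prefix, for illustration only.
const TOOL_CALL_PREFIX: &str = "{\"tool\"";

/// True if only whitespace sits between the start of the line and `pos`.
/// Returns false if `pos` is out of range or not on a char boundary.
fn is_on_own_line(buffer: &str, pos: usize) -> bool {
    let Some(before) = buffer.get(..pos) else {
        return false;
    };
    let line_start = before.rfind('\n').map_or(0, |i| i + 1);
    before[line_start..].chars().all(char::is_whitespace)
}

/// Byte offset of the first tool-call prefix that starts its own line.
/// Inline occurrences inside prose are skipped.
fn find_first_tool_call_start(buffer: &str) -> Option<usize> {
    buffer
        .match_indices(TOOL_CALL_PREFIX)
        .map(|(i, _)| i)
        .find(|&i| is_on_own_line(buffer, i))
}
```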
Dhanji R. Prasanna
65807eea99 Add carmack.rust.md agent-specific language prompt
Rust-specific readability guidance for the carmack agent including:
- let...else example for shallow control flow
- Async: don't block the runtime (tokio::fs, spawn_blocking, Send)
- Visibility: prefer pub(crate), private fields with accessors
- Generics: impl Trait over explicit params, avoid complex where clauses
- Improved iterator guidance: if you need a comment, use a loop
- UTF-8 string slicing warnings
- Ownership/lifetime pragmatism
- Anti-patterns: no macros/typestate/proc-macros unless already in repo

Also adds Rust detection to LANGUAGE_PROMPTS (empty base prompt,
agent-specific prompts handle the guidance).
2026-01-15 13:49:29 +05:30
Jochen
6d1aa62ba7 Merge pull request #63 from cjustice/fix/tracing-subscriber-panic
Fix tracing subscriber panic in scout agent
2026-01-15 12:54:31 +11:00
Jochen
0bca05a1ba Merge pull request #62 from cjustice/fix/planning-verbose-flag
Fix: Initialize logging before planning mode check
2026-01-15 12:51:11 +11:00
Dhanji R. Prasanna
04e3c69b0a Add --accept flag to studio run command
Automatically accept the session after g3 completes successfully,
but only if there are commits on the branch.

Changes:
- Add --accept flag to Run command (stripped, not passed to g3)
- Add has_commits_on_branch() helper using git rev-list --count
- Auto-accept triggers merge to main and cleanup when:
  1. g3 exits successfully (exit code 0)
  2. Branch has commits ahead of main
- Show warning if --accept set but no commits exist

Usage: studio run --agent carmack --accept
2026-01-15 06:43:35 +05:30
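Note on 04e3c69b0a: a sketch of a commit check built on `git rev-list --count`. The helper name comes from the commit message; the parameters, the `base..branch` range form, and the choice to treat a failed git call as "no commits" are assumptions.

```rust
use std::io;
use std::path::Path;
use std::process::Command;

/// Parses the single integer that `git rev-list --count` prints.
fn parse_rev_list_count(stdout: &str) -> Option<u64> {
    stdout.trim().parse().ok()
}

/// True if `branch` has at least one commit that is not reachable from `base`.
fn has_commits_on_branch(repo: &Path, base: &str, branch: &str) -> io::Result<bool> {
    let output = Command::new("git")
        .arg("-C")
        .arg(repo)
        .args(["rev-list", "--count"])
        .arg(format!("{base}..{branch}"))
        .output()?;
    if !output.status.success() {
        // Assumption: an unknown ref or a non-repo path counts as "no commits".
        return Ok(false);
    }
    let count = parse_rev_list_count(&String::from_utf8_lossy(&output.stdout)).unwrap_or(0);
    Ok(count > 0)
}
```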
Dhanji R. Prasanna
5d8dbc43f8 Add agent-specific language prompt injection
When running in agent mode (e.g., --agent carmack) in a workspace with
detected languages, inject agent+language-specific prompts from
prompts/langs/<agent>.<lang>.md at the end of the system prompt.

Changes:
- Add AGENT_LANGUAGE_PROMPTS static array for compile-time embedding
- Add get_agent_language_prompt() to look up specific agent+lang combos
- Add get_agent_language_prompts_for_workspace_with_langs() that returns
  both content and matched languages for display
- Update agent_mode.rs to inject prompts and show which languages loaded
- Display format: '✓ carmack: racket language guidance'
- Add tests for new functionality

Uses the same detect_languages() mechanism as regular language prompts
to avoid code-path aliasing.
2026-01-15 06:43:29 +05:30
Connor Justice
fa29a64e51 Simplify logging initialization comment
Removed unnecessary comment about logging initialization.
2026-01-14 17:53:04 -05:00
Connor Justice
505225c0bd fix: prevent panic when tracing subscriber already initialized
Use try_init() instead of init() for tracing subscriber setup to
gracefully handle cases where a global subscriber is already set.

This fixes a panic in the scout agent subprocess when spawned by the
research tool, where a dependency may have already initialized tracing.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-14 15:33:22 -05:00
Connor Justice
6532442d32 fix: initialize logging before planning mode check
Move initialize_logging() call to run immediately after CLI parsing,
before any mode checks. This ensures the --verbose flag works correctly
in planning mode, which previously bypassed logging initialization.

Previously, planning mode would return early before initialize_logging()
was called, causing verbose output to be silently ignored.
2026-01-14 14:33:44 -05:00
Dhanji R. Prasanna
afec65fd50 Add language-specific prompt injection for toolchain guidance
- Add language_prompts module that auto-detects programming languages in workspace
- Scan for language files with depth limit (2) to inject relevant toolchain prompts
- Add prompts/langs/ directory for language-specific markdown files
- Include Racket/raco toolchain guidance as first language prompt
- Update combine_project_content() to accept language_content parameter
- Integrate language detection into main CLI flow and agent mode
- Update project memory with new feature documentation
2026-01-14 21:00:52 +05:30
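Note on afec65fd50: a depth-limited scan could look roughly like this. The extension table, the function names other than `detect_languages` (mentioned in a later commit), and the reading of "depth limit (2)" as "descend at most two directory levels below the root" are all assumptions.

```rust
use std::collections::BTreeSet;
use std::fs;
use std::path::Path;

/// Hypothetical extension table; the real module may use different rules.
fn language_for_extension(ext: &str) -> Option<&'static str> {
    match ext {
        "rkt" => Some("racket"),
        "rs" => Some("rust"),
        _ => None,
    }
}

/// Collects languages found in `root`, descending at most `max_depth`
/// directory levels below it.
fn detect_languages(root: &Path, max_depth: usize) -> BTreeSet<&'static str> {
    let mut found = BTreeSet::new();
    scan(root, max_depth, &mut found);
    found
}

fn scan(dir: &Path, depth_left: usize, found: &mut BTreeSet<&'static str>) {
    // Unreadable directories are skipped silently.
    let Ok(entries) = fs::read_dir(dir) else {
        return;
    };
    for entry in entries.flatten() {
        let path = entry.path();
        if path.is_dir() {
            if depth_left > 0 {
                scan(&path, depth_left - 1, found);
            }
        } else if let Some(lang) = path
            .extension()
            .and_then(|ext| ext.to_str())
            .and_then(language_for_extension)
        {
            found.insert(lang);
        }
    }
}
```

The depth limit keeps the scan cheap on large repositories and also bounds recursion if the tree contains symlink cycles.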
Dhanji R. Prasanna
f4562cd4c9 config: default agent settings and provider override
2026-01-14 20:14:33 +05:30
Dhanji R. Prasanna
38828c7757 Clean up tool output formatting
- Shell: " Command executed successfully" → "️ ran successfully"
- Write file: Remove ✏️ emoji, use plain "wrote N lines | M chars"
2026-01-14 19:42:54 +05:30
Dhanji R. Prasanna
9ef064a041 Add guidance to shell tool description to avoid unnecessary cd prefixes
LLMs were prefixing shell commands with `cd <workspace> &&` unnecessarily,
wasting tokens and cluttering CLI display. Added clear guidance in the
shell tool description that commands already execute in the working directory.
2026-01-14 19:00:53 +05:30
Dhanji R. Prasanna
03143ec7f8 Agent Mode Enhancements
• Agent prompts are now embedded within the g3 binary
• README.md - Added new "Agent Mode" section documenting:
  • All 7 built-in agents with their focus areas
  • Usage examples (--list-agents, --agent <name>)
  • How to create custom workspace agents

Behavior
1. Workspace agents take priority - If agents/<name>.md exists in the workspace, it's used
2. Embedded fallback - If no workspace agent exists, the embedded version is used
3. Portability - g3 binary now works on any repo without needing the agents/ directory
4. Discoverability - g3 --list-agents shows all available agents and their source
2026-01-14 16:27:03 +05:30
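Note on 03143ec7f8: the lookup order (workspace file first, embedded prompt as fallback) can be sketched as below. `EMBEDDED_AGENTS` is a stand-in for prompts compiled into the binary, and `load_agent` and `AgentSource` are hypothetical names. The sketch does not validate `name`, so a real implementation would need to reject path separators.

```rust
use std::fs;
use std::path::Path;

#[derive(Debug, PartialEq)]
enum AgentSource {
    Workspace,
    Embedded,
}

/// Stand-in for prompts embedded at compile time (for example via include_str!).
const EMBEDDED_AGENTS: &[(&str, &str)] = &[("carmack", "embedded carmack prompt")];

/// Returns the prompt text and where it came from, preferring
/// `<workspace>/agents/<name>.md` over the embedded copy.
fn load_agent(workspace: &Path, name: &str) -> Option<(String, AgentSource)> {
    let path = workspace.join("agents").join(format!("{name}.md"));
    if let Ok(text) = fs::read_to_string(&path) {
        return Some((text, AgentSource::Workspace));
    }
    EMBEDDED_AGENTS
        .iter()
        .find(|(agent, _)| *agent == name)
        .map(|(_, text)| (text.to_string(), AgentSource::Embedded))
}
```

Returning the source alongside the text is what lets a listing command show where each agent comes from.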
Dhanji R. Prasanna
5104bd53b6 refactor(g3-core): improve stream_completion_with_tools readability
Extract and simplify the streaming completion function:

- Extract ensure_context_capacity() helper for pre-loop context management
  (thinning + compaction logic now in dedicated async method)
- Simplify compact_summary generation block: flatten nested if/match,
  remove redundant comments, reorder branches for clarity
- Remove dead code: unused _last_error variable and modified_tool_call
- Streamline duplicate detection block: reduce verbose logging
- Clean up text content display block: remove redundant comments,
  tighten variable declarations
- Remove redundant is_todo_tool redefinition inside block expression

Net reduction: 79 lines (-187/+108)
Behavior unchanged, all unit tests pass.

Agent: carmack
2026-01-14 15:11:53 +05:30
Dhanji R. Prasanna
996dc357b4 Skip session resume prompt when --new-session flag is passed
When users explicitly pass --new-session, they want a fresh session.
Previously g3 would still prompt to resume an existing session.
Now the resume check is skipped entirely when the flag is set.
2026-01-14 08:54:35 +05:30
Dhanji R. Prasanna
dea0e6b1ca Compact tool output improvements
- Rename take_screenshot -> screenshot, code_coverage -> coverage (shorter names)
- Align | character across all compact tools (pad to 11 chars for str_replace)
- Make code_search a compact tool with summary display
- Show language and search name in code_search output (e.g., rust:"find structs")
- Add format_code_search_summary() to extract match/file counts from JSON response
2026-01-14 08:12:50 +05:30
Dhanji R. Prasanna
bd25d7dace Merge sessions/fowler/786b20b5
2026-01-14 04:28:06 +05:30
Dhanji R. Prasanna
7d17b436f9 refactor(g3-core): remove 3 unused Agent constructor variants
Remove dead code - constructor variants that had no callers:
- new_with_readme()
- new_autonomous_with_readme()
- new_with_quiet()

These were thin wrappers around new_with_mode_and_readme() that were
never used externally. All 5 remaining constructors have verified callers.

Results:
- lib.rs reduced from 2817 to 2797 lines (-20 lines)
- Eliminated code-path aliasing: 8 constructors → 5 constructors
- All g3-core tests pass
- Full workspace compiles cleanly

Agent: fowler
2026-01-14 04:26:42 +05:30