alex/g3 - g3 - Millerson GIT hosting

Author	SHA1	Message	Date
Dhanji R. Prasanna	5ab1598e03	feat: async research tool - runs in background, returns immediately The research tool now spawns the scout agent in a background tokio task and returns immediately with a research_id placeholder. This allows the agent to continue working while research runs (30-120 seconds). Key changes: - New PendingResearchManager for tracking async research tasks - research tool returns immediately with placeholder containing research_id - research_status tool to check progress of pending research - Auto-injection of completed research at natural break points: - Start of each tool iteration (before LLM call) - Before prompting user in interactive mode - /research CLI command to list all research tasks - Updated system prompt to explain async behavior The agent can: - Continue with other work while research runs - Check status with research_status tool - Yield turn to user if results are critical before continuing	2026-01-30 13:00:02 +11:00
Dhanji R. Prasanna	2e84f1ece0	test: fix ACD test race condition and add read_image characterization test - Fix test_rehydrate_success race condition by using UUID for unique session IDs - Add #[serial] attribute to prevent parallel execution conflicts - Improve cleanup to remove entire session directory tree - Add characterization test for resize_image_to_dimensions fallback behavior (documents fix from commit `af8b849` for media type preservation) Agent: hopper	2026-01-26 16:19:53 +11:00
Dhanji R. Prasanna	af8b849311	fix(read_image): use correct media type when resize fails to reduce size When resize_image_to_dimensions() returns a larger file than the original, we fall back to using the original bytes. Previously, was_resized was set to true if the original dimensions exceeded MAX_IMAGE_DIMENSION, which caused final_media_type to be set to 'image/jpeg' even though we were using the original PNG bytes. This caused Anthropic API errors like: 'Image does not match the provided media type image/jpeg' Fix: Set was_resized=false when falling back to original bytes, so the original media type (detected from magic bytes) is preserved.	2026-01-22 07:58:05 +05:30
Dhanji R. Prasanna	a34a3b08e9	Rename Project Memory to Workspace Memory Rename all references from "Project Memory" to "Workspace Memory" to avoid future conflation if a "project" concept is introduced later. Changes: - Rename read_project_memory() -> read_workspace_memory() - Update all prompts, tool descriptions, and comments - Update header parsing in memory.rs to use "# Workspace Memory" - Update display detection for "=== Workspace Memory ===" - Update documentation and analysis/memory.md 11 files changed, ~36 occurrences updated.	2026-01-21 14:08:42 +05:30
Dhanji R. Prasanna	10bce7f66f	Remove ANSI formatting codes from g3-core Move terminal formatting responsibility to g3-cli layer: - format_str_replace_summary(): Remove ANSI codes, add colorize_str_replace_summary() helper in CLI to apply green/red colors for insertions/deletions - format_timing_footer(): Remove dimming ANSI codes (now plain text) - str_replace tool result: Remove ANSI codes from success message Remaining acceptable ANSI usage in g3-core: - iTerm2 inline image protocol (terminal-specific escape sequence) - Image metadata dimming (direct print, would need larger refactor) - Terminal beep for stale TODO warning (audio, not visual) - ANSI stripping utility in research.rs (not output) This continues the separation of concerns: g3-core handles logic, g3-cli handles all terminal formatting.	2026-01-20 10:00:37 +05:30
Dhanji R. Prasanna	02655110d6	fix: auto-resize images exceeding 1568px dimension to prevent 413 Payload Too Large The Anthropic API was rejecting requests with multiple high-resolution images (~2000x3000 pixels each) even though individual file sizes were under limits. Root cause: Code only checked per-image file size (3.75MB), not dimensions. Claude recommends images ≤1568px on longest edge and has 32MB total request limit. Changes: - Add MAX_IMAGE_DIMENSION (1568px) and MAX_TOTAL_IMAGE_PAYLOAD (20MB) constants - Trigger resize when dimensions > 1568px (not just file size > 3.75MB) - Add new resize_image_to_dimensions() for dimension-constrained resizing - Track cumulative payload size across multiple images - Warn if total payload exceeds recommended limit Test results with Walking Dead comic images: - WD_0001_0001.jpg: 800KB 1987x3057 → 321KB 1019x1568 - WD_0001_1064.png: 150KB 1988x3057 → 143KB 1020x1568 - WD_0002_0001.jpg: 1023KB 1988x3056 → 292KB 1020x1568 - Total payload: ~2.5MB → ~1MB base64	2026-01-18 10:05:45 +05:30
Dhanji R. Prasanna	3a03ed0585	Fix imgcat aspect ratio by adding preserveAspectRatio=1 Images were being displayed as narrow vertical strips because iTerm2 wasn't preserving aspect ratio when only height was specified.	2026-01-17 18:50:00 +05:30
Dhanji R. Prasanna	d600b600b8	Always keep chromedriver running for faster subsequent startups Removed the persistent_chrome config flag - chromedriver is now always kept running after webdriver_quit. This eliminates startup latency for subsequent WebDriver sessions. Safaridriver is still killed on quit since it doesn't benefit from persistence in the same way. Updated quit message to correctly indicate chromedriver remains running.	2026-01-17 09:48:10 +05:30
Dhanji R. Prasanna	8ed360024f	Add persistent ChromeDriver support for faster WebDriver startup When webdriver_start is called, now checks if chromedriver is already running on the configured port and reuses it instead of spawning a new process. This significantly reduces startup time for subsequent sessions. New config option: [webdriver] persistent_chrome = true # Keep chromedriver running between sessions When enabled, webdriver_quit closes the browser session but leaves chromedriver running for reuse by the next session.	2026-01-17 09:26:25 +05:30
Dhanji R. Prasanna	c7984fd4c2	fix: account for base64 encoding overhead in image size limit The Anthropic API has a 5MB limit on base64-encoded images, not raw file size. Base64 encoding increases size by ~33% (4/3 ratio), so a 4MB raw image becomes ~5.3MB encoded, exceeding the limit. Changed MAX_IMAGE_SIZE from 5MB to ~3.75MB (5MB * 3/4) to trigger resizing before the base64-encoded result exceeds the API limit. Also updated target resize size to 3.6MB to leave margin.	2026-01-16 21:29:05 +05:30
Dhanji R. Prasanna	1003386f7f	Auto-resize large images (>=5MB) in read_image tool Images >= 5MB are now automatically resized to < 4.9MB using ImageMagick before being sent to the LLM. This prevents API errors from oversized images. - Uses iterative quality/scale reduction to find optimal size - Converts to JPEG for better compression - Shows original and resized size in terminal output (e.g., '6.2 MB → 4.1 MB (resized)') - Falls back to original if ImageMagick fails or isn't available	2026-01-16 21:09:38 +05:30
Dhanji R. Prasanna	6bd9c51e8e	feat: shell output pagination and optimized read_file with seek - Shell outputs > 8KB are truncated to first 500 chars - Full output saved to .g3/sessions/<session_id>/tools/shell_stdout_<id>.txt - LLM can use read_file with start/end to paginate through large outputs - read_file now uses seek() for O(1) random access instead of reading entire file - UTF-8 safe: reads extra bytes at boundaries to find valid char positions - Falls back to lossy conversion for binary files (no panics) Files changed: - paths.rs: get_tools_output_dir(), generate_short_id() - shell.rs: truncate_large_output() integration - file_ops.rs: seek-based read_file_range() helper - New test: read_file_utf8_test.rs	2026-01-16 09:16:16 +05:30
Dhanji R. Prasanna	38828c7757	Clean up tool output formatting - Shell: "✅ Command executed successfully" → "⚡️ ran successfully" - Write file: Remove ✏️ emoji, use plain "wrote N lines | M chars"	2026-01-14 19:42:54 +05:30
Dhanji R. Prasanna	118935d2da	Remove unused variable total_lines in file_ops.rs	2026-01-13 14:25:17 +05:30
Dhanji R. Prasanna	8dcb7a3dba	feat: add compact styled output for TODO tools TODO tools (todo_read, todo_write) now display with a cleaner, more compact format: - Styled header: " ● todo_read" or " ● todo_write" - Tree-style prefixes for content lines (│ and └) - Checkbox conversion: "- [ ]" → □, "- [x]" → ■ - Dimmed content for visual distinction - No timing footer (cleaner output) Changes: - Add print_todo_compact() method to UiWriter trait - Implement print_todo_compact() in ConsoleUiWriter - Update todo.rs to call print_todo_compact() instead of line-by-line output - Skip tool header, output header, and timing for TODO tools in agent streaming	2026-01-13 10:58:55 +05:30
Dhanji R. Prasanna	f30f145c85	Fix UTF-8 panics and inconsistent retry logic - Fix 7 UTF-8 byte slicing panics that crash on multi-byte characters: - acd.rs: extract_topic_from_text() [..50] slice - streaming.rs: log_stream_error() [..500] slice - tools/acd.rs: rehydrate message truncation [..2000] slice - history.rs: git commit message truncation [..69] slice - planner.rs: commit summary/description truncation [..69] slices - llm.rs: requirements summary line truncation [..117] slice - All now use chars().count() and chars().take(N).collect() for UTF-8 safe truncation - Fix inconsistent retry logic in task_execution.rs: - Previously only retried on Timeout errors - Now retries on ALL recoverable errors (rate limits, network, server errors, model busy, token limits, context length) - Added error-specific base delays (rate limit: 5s, server: 2s, etc.) - Added exponential backoff with ±20% jitter - Consistent with autonomous mode retry behavior	2026-01-13 05:49:45 +05:30
Dhanji R. Prasanna	d508ddd508	Move project memory from .g3/ to analysis/ for version control Project memory is now stored at analysis/memory.md instead of .g3/memory.md. This change enables: - Shared memory across git worktrees (studio agent sessions) - Version-controlled memory that persists across clones - Memory changes tracked in git history and reviewable in PRs Changes: - crates/g3-core/src/tools/memory.rs: Update get_memory_path() to use analysis/ - crates/g3-cli/src/project_files.rs: Update read_project_memory() path - crates/g3-core/src/prompts.rs: Update documentation references (2 occurrences) - analysis/memory.md: Add memory file (copied from .g3/memory.md)	2026-01-12 10:20:33 +05:30
Dhanji R. Prasanna	f415dbb84b	Fix ACD turn summary loss and add /dump command ACD (Aggressive Context Dehydration) fixes: - Fixed dehydrate_context() to extract turn summary from context window instead of using the passed-in final_response (which contained only the timing footer, not the actual LLM response) - Removed final_response parameter from dehydrate_context() since it now self-extracts the last assistant message as the summary - This ensures the actual turn summary is preserved after dehydration, not just the timing footer New /dump command: - Added /dump command to dump entire context window to tmp/ for debugging - Shows message index, role, kind, content length, and full content - Available in both console and machine modes UTF-8 safety: - Fixed truncate_to_word_boundary() to use character indices instead of byte indices, preventing panics on multi-byte UTF-8 characters - Added UTF-8 string slicing guidance to AGENTS.md Agent: g3	2026-01-12 05:13:02 +05:30
Dhanji R. Prasanna	ac17b95b24	fix(read_file): clamp end position instead of erroring when it exceeds file length When read_file is called with an end position beyond the file length, instead of returning an error that forces a retry, now clamps to the actual file length and returns the content with an informative message. This eliminates wasteful retry cycles where the LLM had to make a second request with the corrected end position.	2026-01-12 05:11:09 +05:30
Dhanji R. Prasanna	da63e79a13	Move read_file metadata to end of output Change read_file output format so the "🔍 N lines read" appears as the last line after the file content, not before it. This keeps the output cleaner with just one metadata line at the end.	2026-01-11 19:56:23 +05:30
Dhanji R. Prasanna	ed1c31dd70	Improve tool output formatting 1. str_replace: Show insertion/deletion counts with colors "✅ +N insertions | -M deletions" (green/red) 2. write_file: Compact format with human-readable sizes "✅ wrote N lines | Xk chars" 3. read_file: Cleaner format "🔍 N lines read" instead of "📄 File content (N lines)" 4. webdriver_quit: Show correct driver name (safaridriver vs chromedriver) 5. read_file: When start position exceeds file length, read last 100 chars with explanation instead of failing 6. shell: Remove redundant "Command failed:" prefix from error messages	2026-01-11 19:52:00 +05:30
Dhanji R. Prasanna	874be7b459	refactor(core): collapse nested if statements per clippy Collapsed nested if statements that check related conditions into single conditions using &&. This improves readability by making the logical relationship between conditions explicit. Files changed: - feedback_extraction.rs: 3 instances of tool_use/final_output checks - tools/todo.rs: 1 instance of todo completion check Agent: fowler	2026-01-11 16:21:33 +05:30
Dhanji R. Prasanna	e731bc8217	Make remember tool instructions more imperative in system prompts - Change 'call remember' to 'you MUST call remember' in native prompt - Change 'IF you discovered' to 'ALWAYS...when you discovered' - Add explicit list of trigger tools (code_search, rg, grep, find, read_file) - Add reminder to Response Guidelines section - Add remember tool and Project Memory section to non-native prompt - Remove redundant console output from remember tool - Fix test compilation errors (missing summary parameter, temporary borrow)	2026-01-11 06:49:45 +08:00
Dhanji R. Prasanna	1090e30d6c	Simplify system prompt: remove coding style and parallel tool call sections - Remove IMPORTANT FOR CODING section (~1,500 chars of coding guidelines) - Remove <use_parallel_tool_calls> block (~500 chars) - Remove unused const_format dependency from g3-core - Simplify get_system_prompt_for_native() to just return base prompt - Response Guidelines now cleanly ends the static prompt Prompt reduced from ~8,500 to ~6,500 characters.	2026-01-11 06:35:18 +08:00
Dhanji R. Prasanna	86709834e2	Improve research tool error reporting for scout agent failures When the scout agent fails (e.g., context window exhaustion), now: - Captures both stdout and stderr from the scout process - Detects context window exhaustion errors with specific patterns - Provides detailed, actionable error messages to the user - Shows suggestions for how to work around the issue - Includes technical details (exit code, error output) for debugging Handles two failure modes: 1. Scout agent exits with non-zero status 2. Scout agent exits successfully but doesn't produce valid report markers Both cases now surface clear error messages instead of cryptic failures.	2026-01-10 20:50:43 +11:00
Dhanji R. Prasanna	60aeb67c56	Add stealth mode for Chrome headless to evade bot detection Implements comprehensive anti-detection measures: - Override navigator.webdriver to return undefined - Inject fake chrome.runtime, chrome.loadTimes, chrome.csi objects - Add realistic plugins and mimeTypes arrays - Patch permissions API to hide automation - Set realistic navigator properties (languages, hardwareConcurrency, deviceMemory) - Remove ChromeDriver-specific window properties (cdc_*) - Patch Function.prototype.toString to hide modifications - Add Chrome flags: --disable-blink-features=AutomationControlled - Set realistic user-agent without HeadlessChrome identifier - Exclude 'enable-automation' switch Tested against bot detection sites: - bot.sannysoft.com: All major tests pass - Search engines: Works with DuckDuckGo, Yahoo, Brave, Startpage - Still detected by: Google reCAPTCHA, Cloudflare Turnstile, Bing	2026-01-10 20:34:14 +11:00
Dhanji R. Prasanna	6be0a03c4c	Fix timing footer being saved to context window The timing footer (e.g., ⏱️ 19.4s | 💭 4.7s) was being saved to the conversation history as a separate assistant message. This happened because stream_completion_with_tools returns the timing footer in TaskResult.response for display, but the caller was also saving it to context. Fix: Strip the timing footer (identified by \n\n⏱️) before saving to context window. The timing footer remains display-only. Also includes: - Research tool blank line fix: only add visual separator for research tool output, not all tools - Research tool webdriver propagation: pass parent's webdriver browser choice (Safari vs Chrome headless) to scout subprocess	2026-01-10 15:55:59 +11:00
Dhanji R. Prasanna	68c9135913	Fix research tool UI: remove duplicate header, add footer spacing, remove spinner, widen command display - Remove duplicate tool header (lib.rs already prints it) - Add newline before timing footer for visual separation - Remove spinner animation (incompatible with update_tool_output_line) - Change shell command format to " > `cmd` ..." with 60 char width	2026-01-10 15:20:40 +11:00
Dhanji R. Prasanna	0aa1287ca6	Remove final_output tool and improve scout report handback final_output removal: - Remove final_output from tool definitions and dispatch - Update system prompts to request summaries as regular text - Remove final_output_called field from StreamingState - Update auto_continue tests to remove final_output_called parameter - Remove final_output test from tool_execution_test.rs - Update planner and flock prompts to not reference final_output - Keep backwards-compat code in feedback_extraction.rs and task_result.rs Scout report handback: - Change from file-based to delimiter-based report extraction - Scout outputs report between ---SCOUT_REPORT_START/END--- markers - Research tool extracts content between markers, strips ANSI codes - Add comprehensive tests for extraction and ANSI stripping 657 tests pass.	2026-01-10 13:43:04 +11:00
Dhanji R. Prasanna	cab2fb187a	Stream scout agent output to CLI during research The research tool now streams the underlying scout agent's output to the CLI in real-time for visual indication of progress. This output is displayed but not added to the conversation context.	2026-01-09 20:39:53 +11:00
Dhanji R. Prasanna	33e5705fc3	Add research tool for web-based research via scout agent New tool that spawns a scout agent to perform web research and return a structured research brief. The scout agent uses webdriver to browse the web and returns a decision-ready report. Changes: - Added 'research' tool definition (12 core tools total) - Added research tool dispatch in tool_dispatch.rs - Created tools/research.rs implementation: - Spawns 'g3 --agent scout <query>' as subprocess - Captures stdout and extracts last line (report file path) - Reads and returns the report file contents - Added exclude_research flag to ToolConfig - Scout agent (agent_name == 'scout') does NOT have access to research tool to prevent infinite recursion - Updated system prompts to describe when to use research tool - Added scout.md agent prompt with research brief output contract The research tool is preferred for complex research tasks (APIs, SDKs, libraries, approaches, bugs). WebDriver can still be used directly for simple lookups or fine-grained control.	2026-01-09 15:59:19 +11:00
Dhanji R. Prasanna	777191b3cb	Remove final_output tool - let summaries stream naturally - Remove final_output from tool definitions, dispatch, and misc tools - Update system prompts to request summaries as regular markdown text - Remove print_final_output from UiWriter trait and all implementations - Remove final_output handling from agent core logic - Rename final_output_summary → summary in session continuation - Delete final_output test files - Update tool count tests (12→11, 27→26) This allows LLM summaries to stream through the markdown formatter for a more natural, responsive user experience instead of buffering everything into a tool call.	2026-01-09 14:57:24 +11:00
Dhanji R. Prasanna	67be0f20c7	fix: remove allow_multiple_tool_calls config and simplify tool execution flow This fixes a bug where the agent would stop responding abruptly without calling final_output. The root cause was the allow_multiple_tool_calls config option (default: false) which caused the agent to break out of the streaming loop mid-stream after executing the first tool, losing any subsequent content. Changes: - Remove allow_multiple_tool_calls config option entirely - Always process all tool calls without breaking mid-stream - Simplify system prompt generation (no longer needs boolean param) - Let the stream complete fully before continuing to next iteration - Change find_last_tool_call_start to find_first_tool_call_start - Remove parser.reset() call on duplicate detection Benefits: - Simpler logic with less conditional branching - No lost content after tool calls - Consistent behavior for all users - Reduced config complexity	2026-01-09 13:28:07 +11:00
Dhanji R. Prasanna	267ef00848	refactor: extract session helper in webdriver.rs to reduce boilerplate Agent: carmack Add get_session() helper function that: - Checks if webdriver is enabled - Acquires the session read lock - Returns the cloned session or an error message Refactored 12 webdriver tool functions to use this helper: - execute_webdriver_navigate - execute_webdriver_get_url - execute_webdriver_get_title - execute_webdriver_find_element - execute_webdriver_find_elements - execute_webdriver_click - execute_webdriver_send_keys - execute_webdriver_execute_script - execute_webdriver_get_page_source - execute_webdriver_screenshot - execute_webdriver_back - execute_webdriver_forward - execute_webdriver_refresh Each function previously had ~10 lines of identical boilerplate. Now reduced to 4 lines using the helper. Net reduction: 68 lines (678 -> 610) All tests pass. Behavior unchanged.	2026-01-08 13:05:44 +11:00
Dhanji R. Prasanna	bb63050779	refactor: improve readability of streaming and file ops code Agent: carmack databricks.rs: - Extract ToolCallAccumulator struct to replace opaque (String, String, String) tuple - Add decode_utf8_streaming() helper for cleaner UTF-8 handling - Add is_incomplete_json_error() helper for JSON parse error detection - Add make_final_chunk() helper to reduce duplication - Add finalize_tool_calls() to convert accumulators to final format - Refactor parse_streaming_response from ~270 lines to ~100 lines - Reduce nesting depth from 8+ levels to 4 levels - Use early returns and let-else for cleaner control flow file_ops.rs: - Replace repetitive if-let chains with declarative PATH_CONTENT_KEYS table - Use match expression instead of nested if-else - Reduce extract_path_and_content from 44 lines to 20 lines All tests pass. Behavior unchanged.	2026-01-07 12:39:05 +11:00
Dhanji R. Prasanna	386176899e	Remove vision tools (except take_screenshot) and macax tools Vision tools removed: - extract_text (OCR from image files) - extract_text_with_boxes (OCR with bounding boxes) - vision_find_text (find text in app windows) - vision_click_text (find and click on text) - vision_click_near_text (click near text labels) macax tools removed: - macax_list_apps - macax_get_frontmost_app - macax_activate_app - macax_press_key - macax_type_text The LLM can now read images directly via read_image tool. take_screenshot is retained for capturing application windows. Files deleted: - crates/g3-core/src/tools/vision.rs - crates/g3-core/src/tools/macax.rs - docs/macax-tools.md Updated tool counts: 12 core + 15 webdriver = 27 total	2026-01-03 17:38:25 +11:00
Dhanji R. Prasanna	29e263ac49	Fix Unicode space handling in macOS screenshot filenames macOS uses U+202F (Narrow No-Break Space) in screenshot filenames between the time and am/pm. When users type or paste these paths, they use regular spaces, causing file-not-found errors. Changes: - Add resolve_path_with_unicode_fallback() to try U+202F variants - Add resolve_paths_in_shell_command() for shell command paths - Apply fix to read_file, read_image, and shell tools - Fix read_image prompt docs: file_path -> file_paths (array) - Add 6 unit tests for Unicode space normalization	2026-01-03 17:17:08 +11:00
Dhanji R. Prasanna	595ad6ad21	agent mode resumption	2026-01-03 14:50:08 +11:00
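
Note on commit c7984fd4c2: the message derives the raw-size threshold from the base64 expansion ratio (4 output bytes per 3 input bytes). The arithmetic can be sketched as below. This is a minimal illustration, not the repository's code: the names `API_LIMIT_BYTES`, `MAX_RAW_BYTES`, `base64_len`, and `needs_resize` are hypothetical, and "5MB" is assumed here to mean 5 * 1024 * 1024 bytes.

```rust
/// Assumed API limit on the base64-encoded image (5 MiB; the unit is an assumption).
const API_LIMIT_BYTES: u64 = 5 * 1024 * 1024;

/// Largest raw size whose base64 encoding still fits the limit (limit * 3/4).
const MAX_RAW_BYTES: u64 = API_LIMIT_BYTES / 4 * 3;

/// Length of the padded base64 encoding of `raw_len` bytes:
/// every group of up to 3 input bytes becomes 4 output bytes.
fn base64_len(raw_len: u64) -> u64 {
    (raw_len + 2) / 3 * 4
}

/// Hypothetical check: resize when the raw file is over the derived threshold.
fn needs_resize(raw_len: u64) -> bool {
    raw_len > MAX_RAW_BYTES
}
```

Under these assumptions a 4 MiB raw image encodes to roughly 5.3 MiB and is flagged for resizing, which matches the example in the commit message.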

38 Commits
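
Note on commit f30f145c85: the message describes replacing byte slices such as `[..50]` with `chars().count()` and `chars().take(N).collect()`. A byte slice panics when the index falls inside a multi-byte character, while iterating over `chars()` always cuts on a character boundary. A minimal sketch of that pattern follows; the helper name `truncate_chars` is hypothetical and does not appear in the log.

```rust
/// Truncate `s` to at most `max_chars` characters (not bytes).
/// Unlike `&s[..n]`, this cannot panic on multi-byte UTF-8 input.
fn truncate_chars(s: &str, max_chars: usize) -> String {
    if s.chars().count() <= max_chars {
        // Already short enough: return an owned copy unchanged.
        s.to_string()
    } else {
        // Take whole characters, so the cut never splits a code point.
        s.chars().take(max_chars).collect()
    }
}
```

For example, `truncate_chars("日本語", 2)` yields `"日本"`, whereas `&"日本語"[..2]` would panic because byte index 2 lies inside the first character.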