alex/g3 - g3 - Millerson GIT hosting

alex/g3

Author	SHA1	Message	Date
Dhanji R. Prasanna	cfd5d69cce	refactor: auto-enable auto-memory in agent mode Simplify auto-memory by always enabling it in agent mode instead of requiring the --auto-memory flag. This makes sense because: - Agent mode is non-interactive, so blocking is acceptable - Agents benefit from automatically saving discoveries to memory - Reduces flag complexity for users The --auto-memory flag still works for other modes if desired.	2026-01-11 15:56:27 +05:30
Dhanji R. Prasanna	1575cafc4b	fix: add --auto-memory support to agent mode The --auto-memory flag was not being passed to run_agent_mode() and send_auto_memory_reminder() was not being called after agent task execution. Changes: - Pass auto_memory parameter to run_agent_mode() - Add auto_memory parameter to run_agent_mode() function signature - Call agent.set_auto_memory(true) when flag is enabled - Call send_auto_memory_reminder() after execute_task() in agent mode	2026-01-11 08:03:46 +08:00
Dhanji R. Prasanna	280ae1fcbb	feat: add --auto-memory flag to prompt LLM to save discoveries Adds a new --auto-memory CLI flag that automatically sends a reminder to the LLM after each turn where tools were called, prompting it to call the remember tool if it discovered any key code locations. Changes: - Add auto_memory field and set_auto_memory() method to Agent - Add tool_calls_this_turn tracking in execute_tool_in_dir() - Add send_auto_memory_reminder() that sends reminder after tool use - Add --auto-memory CLI flag and wire it up in console/machine modes - Call send_auto_memory_reminder() in single-shot and interactive modes - Add visible status messages for auto-memory actions Fixes bug where tool calls were not being tracked when execute_tool_in_dir was called directly with working_dir=None.	2026-01-11 08:00:51 +08:00
Dhanji R. Prasanna	39918cf281	fix: process bold/italic/code formatting inside markdown headers The format_header() function was not calling format_inline_content() to process inline formatting like bold, italic, and `code` within headers. This caused raw markdown markers to appear in output. Added 4 tests to verify the fix: - test_bold_inside_header - test_italic_inside_header - test_code_inside_header - test_mixed_formatting_inside_header	2026-01-11 08:00:34 +08:00
Dhanji R. Prasanna	fc9a2f835a	Fix streaming markdown code fence detection bug The code fence (```) was not being properly detected during streaming, causing it to be rendered as inline code instead of a code block. Root cause: When buffering a code fence after seeing ```, the code was returning early for ALL characters including newlines. This meant handle_newline() was never called and block_state was never set to BlockState::CodeBlock. Fixes: - Don't return early for newlines when buffering code fence, allow them to fall through to handle_newline() - Support indented code fences (up to 3 spaces per CommonMark spec) by using trim_start() when checking for ``` at line start	2026-01-11 07:42:02 +08:00
Dhanji R. Prasanna	e731bc8217	Make remember tool instructions more imperative in system prompts - Change 'call remember' to 'you MUST call remember' in native prompt - Change 'IF you discovered' to 'ALWAYS...when you discovered' - Add explicit list of trigger tools (code_search, rg, grep, find, read_file) - Add reminder to Response Guidelines section - Add remember tool and Project Memory section to non-native prompt - Remove redundant console output from remember tool - Fix test compilation errors (missing summary parameter, temporary borrow)	2026-01-11 06:49:45 +08:00
Dhanji R. Prasanna	33c1aba86e	Show human-readable descriptions in /resume session list - Add description field to SessionContinuation struct - Extract first user message (truncated to ~60 chars at word boundary) - Display as quoted text instead of session ID hash - Fall back to session ID if no description available Example: [2 hours ago] 'when I call /resume it only shows me 2 sessions...'	2026-01-11 06:22:20 +08:00
Dhanji R. Prasanna	3fcef587e8	Fix /resume to show all sessions and use human-readable timestamps - Change run_autonomous to return Agent instead of () so session continuation is properly saved in accumulative mode - Update format_session_time to show relative times ("2 hours ago", "yesterday") for recent sessions and dates for older ones - Handle Ctrl+C cancellation gracefully with informative message	2026-01-11 06:13:27 +08:00
Dhanji R. Prasanna	8926775acb	Add session continuation symlink fix and /resume command Fix session detection: - Add save_session_continuation() calls at all session exit points - Sessions now properly create .g3/session symlink for resume detection - Fixes issue where g3 wasn't offering to resume previous sessions Add /resume command: - New list_sessions_for_directory() to scan available sessions - New switch_to_session() method to safely switch between sessions - Shows numbered list with timestamps, context %, and TODO status - Saves current session before switching (can be resumed later) - Restores full context if <80% used, otherwise uses summary - Machine mode supports /resume and /resume <number> Documentation: - Add /clear and /resume to CONTROL_COMMANDS.md - Update /help output with new commands	2026-01-11 05:30:58 +08:00
Dhanji R. Prasanna	9bef7753bf	Add Chrome headless diagnostic tool Runs automatically when --chrome-headless flag is used, checking: - ChromeDriver installation and PATH - Chrome/Chromium installation - Chrome and ChromeDriver version compatibility - config.toml chrome_binary setting - Chrome for Testing installation - ChromeDriver executable permissions (macOS quarantine) Displays a detailed report with: - Summary of detected versions and paths - Pass/warning/error status for each check - Specific fix suggestions for any issues found Users can then ask g3 to help fix any detected issues.	2026-01-10 20:44:23 +11:00
Dhanji R. Prasanna	ea582766ba	chrome-headless flag	2026-01-10 16:14:14 +11:00
Dhanji R. Prasanna	6be0a03c4c	Fix timing footer being saved to context window The timing footer (e.g., ⏱️ 19.4s | 💭 4.7s) was being saved to the conversation history as a separate assistant message. This happened because stream_completion_with_tools returns the timing footer in TaskResult.response for display, but the caller was also saving it to context. Fix: Strip the timing footer (identified by \n\n⏱️) before saving to context window. The timing footer remains display-only. Also includes: - Research tool blank line fix: only add visual separator for research tool output, not all tools - Research tool webdriver propagation: pass parent's webdriver browser choice (Safari vs Chrome headless) to scout subprocess	2026-01-10 15:55:59 +11:00
Dhanji R. Prasanna	68c9135913	Fix research tool UI: remove duplicate header, add footer spacing, remove spinner, widen command display - Remove duplicate tool header (lib.rs already prints it) - Add newline before timing footer for visual separation - Remove spinner animation (incompatible with update_tool_output_line) - Change shell command format to " > `cmd` ..." with 60 char width	2026-01-10 15:20:40 +11:00
Dhanji R. Prasanna	0aa1287ca6	Remove final_output tool and improve scout report handback final_output removal: - Remove final_output from tool definitions and dispatch - Update system prompts to request summaries as regular text - Remove final_output_called field from StreamingState - Update auto_continue tests to remove final_output_called parameter - Remove final_output test from tool_execution_test.rs - Update planner and flock prompts to not reference final_output - Keep backwards-compat code in feedback_extraction.rs and task_result.rs Scout report handback: - Change from file-based to delimiter-based report extraction - Scout outputs report between ---SCOUT_REPORT_START/END--- markers - Research tool extracts content between markers, strips ANSI codes - Add comprehensive tests for extraction and ANSI stripping 657 tests pass.	2026-01-10 13:43:04 +11:00
Dhanji R. Prasanna	c88ffa2431	Remove final_output tool, improve scout agent - Remove final_output tool to allow LLM responses to stream naturally - Update system prompts to request summaries instead of tool calls - Rename final_output_summary to summary in session continuation - Update tool count tests (12→11 core tools, 27→26 total) - Delete obsolete final_output tests Scout agent improvements: - Simplify WebDriver usage instructions - Prefer DuckDuckGo/Brave/Bing over Google - Support passing task directly to agent mode - Suppress completion message for scout (needs clean output for research tool)	2026-01-09 20:30:00 +11:00
Dhanji R. Prasanna	e301075666	Fix panic on multi-byte chars in filter_json buffer truncation The buffer truncation code was slicing at a raw byte offset which could land in the middle of a multi-byte character (like emojis), causing a panic. Fixed by using char_indices() to find valid character boundaries. Also added stop_reason field to CompletionChunk initializers in tests to complete the stop_reason feature addition. - Fix byte boundary panic in filter_json.rs line 327 - Add test for multi-byte character handling - Update test files with missing stop_reason field	2026-01-09 15:20:57 +11:00
Dhanji R. Prasanna	777191b3cb	Remove final_output tool - let summaries stream naturally - Remove final_output from tool definitions, dispatch, and misc tools - Update system prompts to request summaries as regular markdown text - Remove print_final_output from UiWriter trait and all implementations - Remove final_output handling from agent core logic - Rename final_output_summary → summary in session continuation - Delete final_output test files - Update tool count tests (12→11, 27→26) This allows LLM summaries to stream through the markdown formatter for a more natural, responsive user experience instead of buffering everything into a tool call.	2026-01-09 14:57:24 +11:00
Dhanji R. Prasanna	d96d8c1d90	Rewrite JSON tool call filter with clean state machine Fixes bug where JSON tool calls were printed as text due to chunking issues. Changes: - Complete rewrite of filter_json.rs with 3-state machine: - Streaming: normal pass-through, watches for newline + whitespace + { - Buffering: confirms/denies tool pattern with ~20 char buffer - Suppressing: string-aware brace counting until balanced - Character-by-character processing eliminates chunk boundary issues - Proper handling of } inside JSON strings (was causing premature exit) - Detects truncated JSON followed by complete JSON (LLM retry case) - Removed regex dependency, simpler pattern matching - Added 59 stress tests covering malformed JSON, partial patterns, streaming edge cases, adversarial inputs, and real-world patterns All 86 filter_json tests pass.	2026-01-09 14:05:11 +11:00
Dhanji R. Prasanna	49b27b0cbc	fix: truncate long lines in streaming tool output to prevent terminal wrapping When shell commands output very long lines (e.g., JSON content from tail -c 10000), the lines would wrap in the terminal. The cursor-up escape code (\x1b[1A) only moves up one visual line, not the entire wrapped content, causing the display to fill with uncleared text. This fix truncates lines to 120 characters in update_tool_output_line() before displaying them, preventing the wrapping issue.	2026-01-09 13:35:58 +11:00
Dhanji R. Prasanna	67be0f20c7	fix: remove allow_multiple_tool_calls config and simplify tool execution flow This fixes a bug where the agent would stop responding abruptly without calling final_output. The root cause was the allow_multiple_tool_calls config option (default: false) which caused the agent to break out of the streaming loop mid-stream after executing the first tool, losing any subsequent content. Changes: - Remove allow_multiple_tool_calls config option entirely - Always process all tool calls without breaking mid-stream - Simplify system prompt generation (no longer needs boolean param) - Let the stream complete fully before continuing to next iteration - Change find_last_tool_call_start to find_first_tool_call_start - Remove parser.reset() call on duplicate detection Benefits: - Simpler logic with less conditional branching - No lost content after tool calls - Consistent behavior for all users - Reduced config complexity	2026-01-09 13:28:07 +11:00
Dhanji R. Prasanna	a72d5a650a	Fix two markdown formatting bugs Bug 1: Inline code after list bullets not detected - After emitting a list bullet, at_line_start was not set to false - This caused the next backtick to be treated as a potential code fence - Fixed by setting at_line_start = false after emitting bullet Bug 2: Code block closing on indented backticks - Code blocks containing indented ``` (4+ spaces) were closing prematurely - The .trim() check was too permissive - Fixed by only allowing closing fence with <= 3 spaces indent (CommonMark spec) Added tests for both edge cases.	2026-01-08 20:50:26 +11:00
Dhanji R. Prasanna	19a804e0be	Add syntax highlighting for Racket, Elisp, and Scheme Add language alias mapping in highlight_code() to map: - racket, rkt -> lisp - elisp, emacs-lisp -> lisp - scheme -> lisp - common-lisp, cl -> lisp - shell, sh, zsh, dockerfile -> bash Syntect's built-in Lisp syntax handles all Lisp-family languages well. Added test to verify the aliases work correctly.	2026-01-08 20:35:34 +11:00
Dhanji R. Prasanna	df706308ca	Unify final_output rendering with streaming markdown formatter Replace the separate syntax_highlight module with the streaming markdown formatter for final_output rendering. This: - Removes special buffered rendering logic for final_output - Uses the same StreamingMarkdownFormatter used for agent responses - Removes the spinner animation (content renders immediately) - Deletes the now-unused syntax_highlight.rs module - Updates test to use the streaming formatter Benefits: - Consistent rendering across all markdown output - Less code to maintain (removed ~250 lines) - Same syntax highlighting via syntect (already in streaming formatter)	2026-01-08 20:30:44 +11:00
Dhanji R. Prasanna	347513b04c	Add comprehensive stress tests for streaming markdown formatter Add 10 stress tests covering: - Nested formatting (bold in italic, italic in bold) - Empty/minimal content edge cases - Escape sequences and special characters - Lists with complex inline formatting - Links with various content types - Tables with formatting in cells - Code blocks (should not format contents) - Mixed block elements (headers, quotes, rules) - Nested lists (3+ levels, mixed types) - Pathological/adversarial inputs (unbalanced delimiters, unicode, long lines) All 45 tests pass.	2026-01-08 20:27:28 +11:00
Dhanji R. Prasanna	5bfaee8dd5	use consistent naming for compaction	2026-01-08 12:54:03 +11:00
Dhanji R. Prasanna	4e7aca50fa	feat: royal blue tool names in agent mode + fix README heading display - Add set_agent_mode() to UiWriter trait for visual mode differentiation - ConsoleUiWriter uses royal blue (ANSI 256 color 69) for tool names in agent mode - Fix extract_readme_heading() to search only README section of combined content (was incorrectly showing AGENTS.md heading instead of README heading)	2026-01-07 11:37:51 +11:00
Dhanji R. Prasanna	1056b4193b	chore(g3-cli): remove orphaned retro_tui and tui modules These files were not referenced anywhere in the codebase and appear to be leftover from a previous TUI implementation that was abandoned. Removed: - crates/g3-cli/src/retro_tui.rs (62KB) - crates/g3-cli/src/tui.rs (6KB) Agent: fowler	2026-01-07 10:39:42 +11:00
Dhanji R. Prasanna	c4ae85de72	Add --new-session flag to skip session resumption in agent mode Adds a new CLI flag that allows users to force a new session when running in agent mode, bypassing the automatic detection and resumption of incomplete sessions. Usage: g3 --agent my-agent --new-session	2026-01-07 09:59:15 +11:00
Dhanji R. Prasanna	5d20da2609	Add 54 integration tests for CLI, tools, and message serialization New test files: - crates/g3-cli/tests/cli_integration_test.rs (14 tests) Blackbox CLI tests: help/version flags, argument validation, conflicting modes, flock mode requirements - crates/g3-core/tests/tool_execution_test.rs (20 tests) Tool call structure tests and unified diff application: read_file, write_file, str_replace, shell, background_process, todo, final_output, code_search, take_screenshot - crates/g3-providers/tests/message_serialization_test.rs (20 tests) Round-trip serialization tests for Message, MessageRole, CacheControl, and Tool types. Covers Unicode, special chars, and edge cases. All tests follow blackbox/integration-first principles with documentation of what they protect and intentionally do not assert.	2026-01-07 09:23:34 +11:00
Dhanji R. Prasanna	386176899e	Remove vision tools (except take_screenshot) and macax tools Vision tools removed: - extract_text (OCR from image files) - extract_text_with_boxes (OCR with bounding boxes) - vision_find_text (find text in app windows) - vision_click_text (find and click on text) - vision_click_near_text (click near text labels) macax tools removed: - macax_list_apps - macax_get_frontmost_app - macax_activate_app - macax_press_key - macax_type_text The LLM can now read images directly via read_image tool. take_screenshot is retained for capturing application windows. Files deleted: - crates/g3-core/src/tools/vision.rs - crates/g3-core/src/tools/macax.rs - docs/macax-tools.md Updated tool counts: 12 core + 15 webdriver = 27 total	2026-01-03 17:38:25 +11:00
Dhanji R. Prasanna	76bfb77f84	further fowler fixes and session fixes	2026-01-03 15:47:04 +11:00
Dhanji R. Prasanna	595ad6ad21	agent mode resumption	2026-01-03 14:50:08 +11:00
Dhanji R. Prasanna	8d071d5eed	fix: fowler agent now respects --workspace flag and reads project docs - Fixed run_agent_mode to call std::env::set_current_dir with workspace_dir - Updated fowler.md to read README.md and AGENTS.md as part of Triage & Understanding step	2025-12-26 15:24:20 +11:00
Dhanji R. Prasanna	7e59e181f7	context line ui	2025-12-26 12:58:13 +11:00
Dhanji R. Prasanna	258f9878ff	style: use ◉ symbol for token count in timing footer Changes '227tk | 48% ctx' to '227 ◉ | 48%' for a cleaner look.	2025-12-25 18:40:17 +11:00
Dhanji R. Prasanna	cd64ebbf87	Add tokens consumed and context percentage to per-tool timing footer The per-tool timing line now shows: - Tokens delta (tokens added to context by this tool call) - Context window usage percentage Example: └─ ⚡️ 1ms 523tk | 49% ctx Changes: - Updated UiWriter trait print_tool_timing signature - Track tokens before/after adding tool messages to calculate delta - Updated ConsoleUiWriter, MachineUiWriter, PlannerUiWriter, and test mocks	2025-12-24 15:44:19 +11:00
Dhanji R. Prasanna	923def0ab2	Convert all INFO logs to DEBUG to reduce CLI noise Converted ~77 info! macro calls to debug! across the codebase to prevent log messages from interrupting the CLI experience during normal operation. Users can still see these logs by setting RUST_LOG=debug if needed. Affected crates: - g3-cli - g3-computer-control - g3-console - g3-core - g3-ensembles - g3-execution - g3-providers	2025-12-22 16:27:35 +11:00
Dhanji R. Prasanna	38fcaaf449	Add edge case tests for filter_json_tool_calls - test_brace_inside_json_string_value: braces inside JSON strings - test_multiple_braces_in_string: multiple braces in string values - test_escaped_quotes_with_braces: escaped quotes with braces - test_brace_in_string_across_chunks: streaming with braces in strings - test_complex_nested_with_string_braces: nested JSON with string braces - test_str_replace_with_diff_content: real-world str_replace case - test_tool_call_after_other_content: tool call after other output - test_tool_call_with_nested_tool_pattern_in_string: nested patterns All 27 tests pass.	2025-12-22 13:30:57 +11:00
Dhanji R. Prasanna	3bc254962c	clean up filter_json a bit (more to come)	2025-12-22 12:03:09 +11:00
Dhanji R. Prasanna	01a5284d6d	Move fixed_filter_json from g3-core to g3-cli Properly separates UI display concern from core library: - fixed_filter_json module now lives in g3-cli (UI layer) - UiWriter trait gains filter_json_tool_calls() and reset_json_filter() methods - g3-core delegates filtering to UI layer via trait methods - Different UiWriter implementations can choose their own filtering behavior - ConsoleUiWriter filters JSON tool calls for clean terminal display - MachineUiWriter/NullUiWriter use default pass-through Benefits: - Proper separation of concerns - Core stays clean without display-specific logic - Testability - filter can be tested independently in g3-cli	2025-12-22 10:32:21 +11:00
Dhanji R. Prasanna	fbf31e5f68	Fix continuation errors: auto-continue when final_output not called - Add final_output_called flag to track if LLM properly completed - Auto-continue with prompt if tools executed but final_output missing - Remove unused last_action_was_tool and any_text_response variables - Simplifies previous complex incomplete response detection logic	2025-12-20 15:32:12 +11:00
Dhanji R. Prasanna	e771382bd0	agent mode + fowler bot	2025-12-19 16:14:03 +11:00
Dhanji R. Prasanna	faa6512b1f	Revert to Safari as default WebDriver browser Chrome headless has too many issues: - Session creation hangs when Chrome is already running - Cloudflare and other bot protection blocks headless browsers - Version mismatch issues between Chrome and ChromeDriver Safari is more reliable for web automation on macOS. Chrome headless is still available via --chrome-headless flag.	2025-12-16 12:36:18 +11:00
Dhanji R. Prasanna	3d1b86d24b	Make Chrome headless the default WebDriver browser - Add --safari flag to CLI for explicitly choosing Safari - Update --chrome-headless flag description to indicate it's the default - Update README to reflect Chrome headless as default - Remove broken link to non-existent docs/webdriver-setup.md - Add Safari flag handling in all webdriver config locations The config already had ChromeHeadless as the default, this commit updates the CLI and documentation to match.	2025-12-15 16:51:42 +11:00
Jochen	87bceba54f	Fix planner UI whitespace and workspace logs directory Resolve two critical issues in planner mode that persisted through multiple fix attempts: 1. Remove excessive whitespace between tool call displays by replacing direct println!() calls with ui_writer methods and eliminating redundant newlines in agent response streaming. 2. Ensure all log files (errors, sessions, tool calls, context dumps) are written to <workspace>/logs instead of codepath by properly initializing G3_WORKSPACE_PATH from --workspace argument.	2025-12-10 16:18:49 +11:00
Jochen	75aa2d983e	Refine planner mode UI and error handling Improve planner mode user experience with better error reporting, cleaner tool output, and consistent log file placement. - Propagate and display classified LLM errors to users with appropriate icons and context - Display tool calls on single lines with truncated arguments - Show LLM text responses without overwriting via UiWriter - Ensure all logs write to workspace/logs directory consistently - Set G3_WORKSPACE_PATH early in planning mode initialization	2025-12-09 22:44:00 +11:00
Jochen	ff8b3e7c7b	Implement planning mode	2025-12-09 17:03:53 +11:00
Dhanji R. Prasanna	678403da35	add a force thinnify cmd	2025-12-05 15:32:13 +11:00
Jochen	0327a6dfdf	make sure coach feedback is extracted.	2025-12-02 22:00:58 +11:00
Jochen	928f2bfa9d	actually record coach feedback and use it	2025-12-02 21:23:50 +11:00
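
Commit e301075666 above describes a panic caused by slicing a buffer at a raw byte offset that fell inside a multi-byte character, fixed by using char_indices() to find a valid boundary. The following is a minimal sketch of that technique only; the function name truncate_at_char_boundary and its signature are hypothetical and are not taken from filter_json.rs.

```rust
/// Hypothetical helper (an assumption, not the code in filter_json.rs):
/// return the longest prefix of `s` that is at most `max_bytes` bytes long
/// and ends on a character boundary.
fn truncate_at_char_boundary(s: &str, max_bytes: usize) -> &str {
    if s.len() <= max_bytes {
        return s;
    }
    // char_indices() yields the byte offset at which each character starts,
    // so `idx + ch.len_utf8()` is always a valid place to cut the string.
    let mut end = 0;
    for (idx, ch) in s.char_indices() {
        let next = idx + ch.len_utf8();
        if next > max_bytes {
            break;
        }
        end = next;
    }
    // Slicing here cannot panic because `end` is a character boundary.
    &s[..end]
}
```

Slicing with `&s[..max_bytes]` directly would panic whenever `max_bytes` lands inside a character such as an emoji, which is the failure mode the commit message reports.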


240 Commits
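
Commit d96d8c1d90 in the listing above mentions "string-aware brace counting until balanced" and the earlier bug where a } inside a JSON string caused a premature exit. A minimal sketch of that idea follows, assuming a single complete input string rather than the character-by-character streaming the commit describes; find_balanced_end is a hypothetical name, not a function from the repository.

```rust
/// Hypothetical sketch (an assumption, not the repository's filter):
/// return the byte index just past the `}` that balances the first `{`
/// in `input`, ignoring any braces that appear inside JSON string literals.
/// Returns None if the braces never balance (for example, truncated JSON).
fn find_balanced_end(input: &str) -> Option<usize> {
    let mut depth: usize = 0;
    let mut in_string = false;
    let mut escaped = false;
    for (idx, ch) in input.char_indices() {
        if in_string {
            if escaped {
                // The character after a backslash never ends the string.
                escaped = false;
            } else if ch == '\\' {
                escaped = true;
            } else if ch == '"' {
                in_string = false;
            }
            continue;
        }
        match ch {
            // Only track strings once we are inside the object.
            '"' if depth > 0 => in_string = true,
            '{' => depth += 1,
            '}' if depth > 0 => {
                depth -= 1;
                if depth == 0 {
                    return Some(idx + 1);
                }
            }
            _ => {}
        }
    }
    None
}
```

Counting braces without the in_string and escaped flags would stop at the } inside a value like "}" and leave the rest of the tool call on screen, which matches the symptom the commit message describes.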