- Extract check_duplicate_in_previous_message() helper to reduce nesting
from 6+ levels to 2 levels in stream_completion_with_tools
- Create do_thin_context() and do_thin_context_all() helpers to centralize
context thinning with event tracking
- Use provider_config::parse_provider_ref() in additional call sites
- All 295 tests pass
This continues the refactoring to eliminate code-path aliasing and
reduce cyclomatic complexity in the Agent implementation.
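The nesting reduction presumably relies on early returns in the extracted helper. A minimal sketch, assuming a hypothetical signature and matching rule (only the helper name comes from this commit):

```rust
// Hypothetical sketch: early returns replace deep conditional nesting.
// The real signature and duplicate-matching rules may differ.
fn check_duplicate_in_previous_message(previous: Option<&str>, call: &str) -> bool {
    // No previous assistant message: nothing to compare against.
    let Some(prev) = previous else { return false };
    // Only the last tool-call line of the previous message can be a duplicate.
    let Some(last) = prev
        .lines()
        .rev()
        .find(|l| l.trim_start().starts_with("{\"tool\""))
    else {
        return false;
    };
    last.trim() == call.trim()
}
```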
The blanket *.json ignore is not canonical for Rust projects.
JSON files that need ignoring are already covered by:
- .g3/ for session logs
- logs/ for error logs
- .build for Swift build artifacts
The Euler agent must now update AGENTS.md after generating artifacts:
- Add/update 'Dependency Analysis Artifacts' section
- Table listing each file in analysis/deps/ with one-line descriptions
- No findings, metrics, or recommendations in AGENTS.md
macOS uses U+202F (Narrow No-Break Space) in screenshot filenames
between the time and AM/PM. When users type or paste these paths, the
U+202F is usually replaced with a regular space, causing file-not-found
errors.
Changes:
- Add resolve_path_with_unicode_fallback() to try U+202F variants
- Add resolve_paths_in_shell_command() for shell command paths
- Apply fix to read_file, read_image, and shell tools
- Fix read_image prompt docs: file_path -> file_paths (array)
- Add 6 unit tests for Unicode space normalization
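The fallback idea can be sketched as follows. The function name matches the commit; the body is an assumption (the real version may try more variants):

```rust
use std::path::{Path, PathBuf};

// Sketch: if the literal path doesn't exist, retry with the regular space
// before the AM/PM marker swapped for U+202F, as macOS screenshots use.
fn resolve_path_with_unicode_fallback(path: &str) -> Option<PathBuf> {
    let p = Path::new(path);
    if p.exists() {
        return Some(p.to_path_buf());
    }
    for ampm in ["AM", "PM", "am", "pm"] {
        let candidate = path.replace(&format!(" {ampm}"), &format!("\u{202F}{ampm}"));
        let c = Path::new(&candidate);
        if c.exists() {
            return Some(c.to_path_buf());
        }
    }
    None
}
```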
- Add TODO completion check to final_output tool in autonomous mode only
- When incomplete TODO items exist, reject final_output and prompt LLM to continue
- Non-autonomous modes (interactive, chat) are unaffected
- Add 6 tests verifying behavior in both autonomous and non-autonomous modes
Fixes an issue where the LLM would call final_output after completing the
first phase, causing the agent to stop prematurely instead of continuing
with the remaining phases.
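The gate described above might look like this sketch (function name, TODO markers, and error text are assumptions; only the autonomous-only behavior comes from the commit):

```rust
// Sketch: in autonomous mode, final_output is rejected while unchecked
// TODO items ("- [ ]") remain; other modes pass through unchanged.
fn check_final_output_allowed(autonomous: bool, todo_md: &str) -> Result<(), String> {
    if !autonomous {
        return Ok(()); // interactive/chat modes are unaffected
    }
    let incomplete = todo_md
        .lines()
        .filter(|l| l.trim_start().starts_with("- [ ]"))
        .count();
    if incomplete > 0 {
        Err(format!(
            "{incomplete} TODO item(s) incomplete; continue working before calling final_output"
        ))
    } else {
        Ok(())
    }
}
```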
- Fixed run_agent_mode to call std::env::set_current_dir with workspace_dir
- Updated fowler.md to read README.md and AGENTS.md as part of Triage & Understanding step
- Move tool_executed = true after duplicate check to prevent auto-continue
from triggering when only duplicate tools were detected
- Reset parser state when duplicate detected to clear any partial/polluted
state from LLM stuttering or example tool calls in markdown blocks
- Print └─ before images to break out of tool output box
- Print ┌─ after images to resume tool output box
- Remove │ prefix from image preview and info lines
- Info line uses single space prefix, dimmed text
- Only include error messages in tool result (success info printed via imgcat)
- Remove │ prefix before image preview, use single space instead
- Keep info line on its own line with │ prefix
- Keep blank line spacing between images
- Remove unused assignment to final_output_called (returns immediately after)
- Mark cache_config field as #[allow(dead_code)] (reserved for future use)
- Mark print_status_line method as #[allow(dead_code)] (reserved for future use)
TODO lists are now stored in .g3/sessions/<session_id>/todo.g3.md instead
of the workspace root. This prevents different g3 sessions from accidentally
picking up or overwriting each other's TODOs.
Changes:
- Add get_session_todo_path() function in paths.rs
- Update todo_read/todo_write handlers to use session-specific paths
- Remove TODO loading at Agent initialization (sessions start fresh)
- Update prompts to reflect session-scoped behavior
Fallback behavior preserved for planner mode (G3_TODO_PATH env var).
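A minimal sketch of the path helper, assuming a signature (the function name, layout, and G3_TODO_PATH fallback come from this commit; everything else is illustrative):

```rust
use std::path::PathBuf;

// Sketch: session-scoped TODO path with the planner-mode env override.
fn get_session_todo_path(session_id: &str) -> PathBuf {
    if let Ok(p) = std::env::var("G3_TODO_PATH") {
        return PathBuf::from(p); // planner mode fallback
    }
    PathBuf::from(".g3")
        .join("sessions")
        .join(session_id)
        .join("todo.g3.md")
}
```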
Adds a new tool that allows launching processes (like game servers) in the
background while g3 continues to operate. The process runs independently
with stdout/stderr captured to a log file.
Features:
- Named process tracking for easy reference
- Automatic log capture to logs/background_processes/
- Returns PID and log file path for use with shell tool
- Automatic cleanup on agent shutdown via Drop trait
Usage: Use the shell tool to interact with the process:
- Read logs: tail -100 <logfile>
- Check status: ps -p <pid>
- Stop process: kill <pid>
Files:
- New: crates/g3-core/src/background_process.rs
- New: crates/g3-core/tests/background_process_demo_test.rs
- Modified: crates/g3-core/src/lib.rs (tool definition + handler)
- Modified: crates/g3-core/src/prompts.rs (documentation)
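The launch step can be sketched as follows; the real module also tracks processes by name and cleans up via Drop, and the log-file naming here is an assumption:

```rust
use std::fs::File;
use std::process::{Child, Command, Stdio};

// Sketch: spawn a detached child with stdout/stderr redirected to a log
// file, returning the PID and log path for use with the shell tool.
fn spawn_background(name: &str, cmd: &str, args: &[&str]) -> std::io::Result<(u32, String)> {
    std::fs::create_dir_all("logs/background_processes")?;
    let log_path = format!("logs/background_processes/{name}.log");
    let log = File::create(&log_path)?;
    let child: Child = Command::new(cmd)
        .args(args)
        .stdout(Stdio::from(log.try_clone()?))
        .stderr(Stdio::from(log))
        .spawn()?;
    Ok((child.id(), log_path))
}
```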
When the LLM outputs text containing tool call patterns (e.g., reading
log files, showing examples, or discussing tool calls), the parser's
has_unexecuted_tool_call() would detect these as real tool calls and
trigger auto-continue, leading to repeated empty responses.
The fix: mark the parser buffer as consumed when content is displayed.
This prevents tool-call-like patterns in displayed text from triggering
false positives later. The fix is safe because:
1. Only runs when no tool was detected (inside 'if !tool_executed')
2. Legitimate tool calls are detected first by process_chunk()
3. Matches existing pattern of calling mark_tool_calls_consumed()
after tool execution
The auto-continue logic was adding User continue prompts without first
adding an Assistant message when the LLM returned an empty response.
This caused consecutive User messages in the conversation history,
which confused the LLM and caused it to return more empty responses.
The fix ensures an Assistant message is always added before the continue
prompt, using '[empty response]' as a placeholder when the LLM returned
nothing substantive. This maintains proper User/Assistant alternation.
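The invariant can be sketched like this (types and the continue-prompt text are illustrative; the '[empty response]' placeholder comes from the commit):

```rust
#[derive(Debug, PartialEq)]
enum Role {
    User,
    Assistant,
}

// Sketch: always push an Assistant turn before the User continue prompt,
// substituting a placeholder when the LLM returned nothing substantive,
// so the history keeps strict User/Assistant alternation.
fn push_continue_prompt(history: &mut Vec<(Role, String)>, llm_text: &str) {
    if llm_text.trim().is_empty() {
        history.push((Role::Assistant, "[empty response]".to_string()));
    } else {
        history.push((Role::Assistant, llm_text.to_string()));
    }
    history.push((Role::User, "Please continue.".to_string()));
}
```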
Reduce lib.rs from 7481 to 6557 lines (-12.4%) by extracting:
- paths.rs: Session/workspace path utilities (get_todo_path, get_logs_dir, etc.)
- streaming_parser.rs: StreamingToolParser for LLM response parsing
- utils.rs: Diff parsing and shell escaping utilities
- webdriver_session.rs: Unified Safari/Chrome WebDriver abstraction
All public APIs preserved via re-exports for backward compatibility.
Added 13 new unit tests across extracted modules.
All 225 tests pass.
Added a quality-of-life feature that displays:
- Tokens used in the current turn (from LLM response, not estimated)
- Current context window usage percentage
These are displayed dimmed after the timing info:
⏱️ 1.2s | 💭 0.3s 1234tk | 45% ctx
The token count comes directly from the LLM's usage response data,
not from any estimation. If no usage data is available from the LLM,
only the context percentage is shown.
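The formatting logic might look like this sketch (field names and the exact fallback layout are assumptions; the sample line matches the one above):

```rust
// Sketch: build the dimmed status line; when the LLM reports no usage
// data, the token count is omitted and only the context percentage shows.
fn format_status(turn_s: f64, think_s: f64, tokens: Option<u32>, ctx_pct: u8) -> String {
    match tokens {
        Some(tk) => format!("⏱️ {turn_s:.1}s | 💭 {think_s:.1}s {tk}tk | {ctx_pct}% ctx"),
        None => format!("⏱️ {turn_s:.1}s | 💭 {think_s:.1}s | {ctx_pct}% ctx"),
    }
}
```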
Added 13 tests to verify that duplicate detection only catches
IMMEDIATELY SEQUENTIAL duplicates:
- test_find_complete_json_object_end_* - Tests for JSON parsing helper
- test_same_tool_with_text_between_not_duplicate - Key test ensuring
tool calls separated by text are NOT duplicates
- test_different_tools_back_to_back_not_duplicate
- test_same_tool_different_args_not_duplicate
- test_identical_tool_calls_back_to_back_are_duplicates
- test_has_text_after_tool_call - Tests text detection logic
- test_tool_call_with_newlines_between
- test_tool_call_with_whitespace_text_between
- test_tool_call_in_middle_of_text
- test_multiple_different_tool_calls_with_text
Also made find_complete_json_object_end public for testing.
1. Set tool_executed=true when a tool call is detected, even if skipped
as a duplicate. This prevents the raw JSON from being printed to screen
when a tool call is detected but not executed.
2. Remove session-level duplicate detection entirely. All tools should be
allowed to be called multiple times in a session.
3. Fix sequential duplicate detection to only catch IMMEDIATELY sequential
duplicates:
- DUP IN CHUNK: Now only checks if the PREVIOUS tool call in the chunk
is the same (not any tool call in the chunk)
- DUP IN MSG: Now only checks if the LAST tool call in the previous
message matches AND there's no text after it. If there's any
non-whitespace text between tool calls, they're not considered
duplicates.
This allows legitimate re-use of tools while still catching cases where
the LLM stutters and outputs the same tool call twice in a row.
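The "immediately sequential" rule can be sketched over a stream of text and tool-call events (the event model here is illustrative, not the parser's actual representation):

```rust
enum Event {
    Text(String),
    Call(String),
}

// Sketch: a call is dropped as a stutter only when the identical call
// precedes it with nothing but whitespace in between; any non-whitespace
// text breaks the chain and the repeat is kept as a legitimate re-use.
fn dedup_immediate(events: &[Event]) -> Vec<String> {
    let mut kept = Vec::new();
    let mut last_call: Option<&str> = None;
    for e in events {
        match e {
            Event::Text(t) => {
                if !t.trim().is_empty() {
                    last_call = None; // text between calls: not a duplicate
                }
            }
            Event::Call(c) => {
                if last_call == Some(c.as_str()) {
                    continue; // identical, immediately sequential: skip
                }
                kept.push(c.clone());
                last_call = Some(c.as_str());
            }
        }
    }
    kept
}
```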
The bug was in the chunk.finished block inside stream_completion_with_tools.
When no tool was executed in the CURRENT iteration (!tool_executed), the code
would return early without checking whether tools had been executed in
PREVIOUS iterations (any_tool_executed) while final_output was never called.
This caused the agent to terminate prematurely after executing tools like
todo_read when the LLM responded with text instead of calling final_output.
The fix adds a check: if any_tool_executed && !final_output_called, we break
to let the outer loop's auto-continue logic prompt the LLM to continue.
Also fixed missing debug! import in g3-console/src/main.rs.
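The fixed decision can be sketched as a pure function over the three flags (the Action enum is purely illustrative; the flag names come from the commit):

```rust
#[derive(Debug, PartialEq)]
enum Action {
    KeepStreaming,
    BreakToAutoContinue,
    ReturnToCaller,
}

// Sketch of the chunk.finished decision after the fix.
fn on_finished(tool_executed: bool, any_tool_executed: bool, final_output_called: bool) -> Action {
    if tool_executed {
        // A tool ran this iteration; keep going as before.
        return Action::KeepStreaming;
    }
    if any_tool_executed && !final_output_called {
        // The added check: tools ran in earlier iterations but the turn
        // never concluded, so break and let auto-continue prompt the LLM.
        return Action::BreakToAutoContinue;
    }
    Action::ReturnToCaller
}
```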
Converted ~77 info! macro calls to debug! across the codebase to prevent
log messages from interrupting the CLI experience during normal operation.
Users can still see these logs by setting RUST_LOG=debug if needed.
Affected crates:
- g3-cli
- g3-computer-control
- g3-console
- g3-core
- g3-ensembles
- g3-execution
- g3-providers
The bug: When the LLM emitted multiple tool calls in one response (e.g.,
str_replace followed by shell), only the first tool was executed. The
remaining tools were lost because mark_tool_calls_consumed() was called
BEFORE processing, marking ALL tools as consumed even when only ONE was
being processed.
This caused has_unexecuted_tool_call() to return false after executing
the first tool, so the parser was reset and the remaining tool calls
were discarded. The auto-continue logic never triggered because it
thought all tools had been handled.
The fix: Remove the premature mark_tool_calls_consumed() call. The
existing logic at line 4696-4699 already handles marking tools as
consumed AFTER execution, and correctly checks for remaining unexecuted
tools before deciding whether to reset the parser.
- Add last_consumed_position tracking to StreamingToolParser to prevent
re-detecting already-executed tool calls
- Add mark_tool_calls_consumed() method to mark tool calls as processed
- Add find_first_tool_call_start() for forward scanning of tool patterns
- Replace try_parse_json_tool_call_from_buffer() with
try_parse_all_json_tool_calls_from_buffer() to find ALL tool calls
- Update has_incomplete_tool_call() and has_unexecuted_tool_call() to
only check unconsumed portion of buffer
- Fix tool execution loop to not reset parser when unexecuted tools remain
- Simplify should_auto_continue logic (remove redundant condition)
- Add comprehensive tests for auto-continue condition logic
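The consumed-position idea can be sketched as follows; the struct, field, and method names come from this commit, but the detection pattern and internals are assumptions:

```rust
// Sketch: detection only scans the unconsumed tail of the buffer, so
// already-executed tool calls are never re-detected.
struct StreamingToolParser {
    buffer: String,
    last_consumed_position: usize,
}

impl StreamingToolParser {
    fn new() -> Self {
        Self { buffer: String::new(), last_consumed_position: 0 }
    }

    fn push_chunk(&mut self, chunk: &str) {
        self.buffer.push_str(chunk);
    }

    // Only the portion after the consumed marker counts as unexecuted.
    fn has_unexecuted_tool_call(&self) -> bool {
        self.buffer[self.last_consumed_position..].contains("{\"tool\":")
    }

    // After executing tools, advance the marker past everything seen so far.
    fn mark_tool_calls_consumed(&mut self) {
        self.last_consumed_position = self.buffer.len();
    }
}
```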