alex/g3 - g3 - Millerson GIT hosting

Author	SHA1	Message	Date
Dhanji R. Prasanna	d96d8c1d90	Rewrite JSON tool call filter with clean state machine Fixes bug where JSON tool calls were printed as text due to chunking issues. Changes: - Complete rewrite of filter_json.rs with 3-state machine: - Streaming: normal pass-through, watches for newline + whitespace + { - Buffering: confirms/denies tool pattern with ~20 char buffer - Suppressing: string-aware brace counting until balanced - Character-by-character processing eliminates chunk boundary issues - Proper handling of } inside JSON strings (was causing premature exit) - Detects truncated JSON followed by complete JSON (LLM retry case) - Removed regex dependency, simpler pattern matching - Added 59 stress tests covering malformed JSON, partial patterns, streaming edge cases, adversarial inputs, and real-world patterns All 86 filter_json tests pass.	2026-01-09 14:05:11 +11:00
Dhanji R. Prasanna	49b27b0cbc	fix: truncate long lines in streaming tool output to prevent terminal wrapping When shell commands output very long lines (e.g., JSON content from tail -c 10000), the lines would wrap in the terminal. The cursor-up escape code (\x1b[1A) only moves up one visual line, not the entire wrapped content, causing the display to fill with uncleared text. This fix truncates lines to 120 characters in update_tool_output_line() before displaying them, preventing the wrapping issue.	2026-01-09 13:35:58 +11:00
Dhanji R. Prasanna	67be0f20c7	fix: remove allow_multiple_tool_calls config and simplify tool execution flow This fixes a bug where the agent would stop responding abruptly without calling final_output. The root cause was the allow_multiple_tool_calls config option (default: false) which caused the agent to break out of the streaming loop mid-stream after executing the first tool, losing any subsequent content. Changes: - Remove allow_multiple_tool_calls config option entirely - Always process all tool calls without breaking mid-stream - Simplify system prompt generation (no longer needs boolean param) - Let the stream complete fully before continuing to next iteration - Change find_last_tool_call_start to find_first_tool_call_start - Remove parser.reset() call on duplicate detection Benefits: - Simpler logic with less conditional branching - No lost content after tool calls - Consistent behavior for all users - Reduced config complexity	2026-01-09 13:28:07 +11:00
Dhanji R. Prasanna	a72d5a650a	Fix two markdown formatting bugs Bug 1: Inline code after list bullets not detected - After emitting a list bullet, at_line_start was not set to false - This caused the next backtick to be treated as a potential code fence - Fixed by setting at_line_start = false after emitting bullet Bug 2: Code block closing on indented backticks - Code blocks containing indented ``` (4+ spaces) were closing prematurely - The .trim() check was too permissive - Fixed by only allowing closing fence with <= 3 spaces indent (CommonMark spec) Added tests for both edge cases.	2026-01-08 20:50:26 +11:00
Dhanji R. Prasanna	19a804e0be	Add syntax highlighting for Racket, Elisp, and Scheme Add language alias mapping in highlight_code() to map: - racket, rkt -> lisp - elisp, emacs-lisp -> lisp - scheme -> lisp - common-lisp, cl -> lisp - shell, sh, zsh, dockerfile -> bash Syntect's built-in Lisp syntax handles all Lisp-family languages well. Added test to verify the aliases work correctly.	2026-01-08 20:35:34 +11:00
Dhanji R. Prasanna	df706308ca	Unify final_output rendering with streaming markdown formatter Replace the separate syntax_highlight module with the streaming markdown formatter for final_output rendering. This: - Removes special buffered rendering logic for final_output - Uses the same StreamingMarkdownFormatter used for agent responses - Removes the spinner animation (content renders immediately) - Deletes the now-unused syntax_highlight.rs module - Updates test to use the streaming formatter Benefits: - Consistent rendering across all markdown output - Less code to maintain (removed ~250 lines) - Same syntax highlighting via syntect (already in streaming formatter)	2026-01-08 20:30:44 +11:00
Dhanji R. Prasanna	347513b04c	Add comprehensive stress tests for streaming markdown formatter Add 10 stress tests covering: - Nested formatting (bold in italic, italic in bold) - Empty/minimal content edge cases - Escape sequences and special characters - Lists with complex inline formatting - Links with various content types - Tables with formatting in cells - Code blocks (should not format contents) - Mixed block elements (headers, quotes, rules) - Nested lists (3+ levels, mixed types) - Pathological/adversarial inputs (unbalanced delimiters, unicode, long lines) All 45 tests pass.	2026-01-08 20:27:28 +11:00
Dhanji R. Prasanna	5bfaee8dd5	use consistent naming for compaction	2026-01-08 12:54:03 +11:00
Dhanji R. Prasanna	4e7aca50fa	feat: royal blue tool names in agent mode + fix README heading display - Add set_agent_mode() to UiWriter trait for visual mode differentiation - ConsoleUiWriter uses royal blue (ANSI 256 color 69) for tool names in agent mode - Fix extract_readme_heading() to search only README section of combined content (was incorrectly showing AGENTS.md heading instead of README heading)	2026-01-07 11:37:51 +11:00
Dhanji R. Prasanna	1056b4193b	chore(g3-cli): remove orphaned retro_tui and tui modules These files were not referenced anywhere in the codebase and appear to be leftover from a previous TUI implementation that was abandoned. Removed: - crates/g3-cli/src/retro_tui.rs (62KB) - crates/g3-cli/src/tui.rs (6KB) Agent: fowler	2026-01-07 10:39:42 +11:00
Dhanji R. Prasanna	c4ae85de72	Add --new-session flag to skip session resumption in agent mode Adds a new CLI flag that allows users to force a new session when running in agent mode, bypassing the automatic detection and resumption of incomplete sessions. Usage: g3 --agent my-agent --new-session	2026-01-07 09:59:15 +11:00
Dhanji R. Prasanna	5d20da2609	Add 54 integration tests for CLI, tools, and message serialization New test files: - crates/g3-cli/tests/cli_integration_test.rs (14 tests) Blackbox CLI tests: help/version flags, argument validation, conflicting modes, flock mode requirements - crates/g3-core/tests/tool_execution_test.rs (20 tests) Tool call structure tests and unified diff application: read_file, write_file, str_replace, shell, background_process, todo, final_output, code_search, take_screenshot - crates/g3-providers/tests/message_serialization_test.rs (20 tests) Round-trip serialization tests for Message, MessageRole, CacheControl, and Tool types. Covers Unicode, special chars, and edge cases. All tests follow blackbox/integration-first principles with documentation of what they protect and intentionally do not assert.	2026-01-07 09:23:34 +11:00
Dhanji R. Prasanna	386176899e	Remove vision tools (except take_screenshot) and macax tools Vision tools removed: - extract_text (OCR from image files) - extract_text_with_boxes (OCR with bounding boxes) - vision_find_text (find text in app windows) - vision_click_text (find and click on text) - vision_click_near_text (click near text labels) macax tools removed: - macax_list_apps - macax_get_frontmost_app - macax_activate_app - macax_press_key - macax_type_text The LLM can now read images directly via read_image tool. take_screenshot is retained for capturing application windows. Files deleted: - crates/g3-core/src/tools/vision.rs - crates/g3-core/src/tools/macax.rs - docs/macax-tools.md Updated tool counts: 12 core + 15 webdriver = 27 total	2026-01-03 17:38:25 +11:00
Dhanji R. Prasanna	76bfb77f84	further fowler fixes and session fixes	2026-01-03 15:47:04 +11:00
Dhanji R. Prasanna	595ad6ad21	agent mode resumption	2026-01-03 14:50:08 +11:00
Dhanji R. Prasanna	8d071d5eed	fix: fowler agent now respects --workspace flag and reads project docs - Fixed run_agent_mode to call std::env::set_current_dir with workspace_dir - Updated fowler.md to read README.md and AGENTS.md as part of Triage & Understanding step	2025-12-26 15:24:20 +11:00
Dhanji R. Prasanna	7e59e181f7	context line ui	2025-12-26 12:58:13 +11:00
Dhanji R. Prasanna	258f9878ff	style: use ◉ symbol for token count in timing footer Changes '227tk | 48% ctx' to '227 ◉ | 48%' for a cleaner look.	2025-12-25 18:40:17 +11:00
Dhanji R. Prasanna	cd64ebbf87	Add tokens consumed and context percentage to per-tool timing footer The per-tool timing line now shows: - Tokens delta (tokens added to context by this tool call) - Context window usage percentage Example: └─ ⚡️ 1ms 523tk | 49% ctx Changes: - Updated UiWriter trait print_tool_timing signature - Track tokens before/after adding tool messages to calculate delta - Updated ConsoleUiWriter, MachineUiWriter, PlannerUiWriter, and test mocks	2025-12-24 15:44:19 +11:00
Dhanji R. Prasanna	923def0ab2	Convert all INFO logs to DEBUG to reduce CLI noise Converted ~77 info! macro calls to debug! across the codebase to prevent log messages from interrupting the CLI experience during normal operation. Users can still see these logs by setting RUST_LOG=debug if needed. Affected crates: - g3-cli - g3-computer-control - g3-console - g3-core - g3-ensembles - g3-execution - g3-providers	2025-12-22 16:27:35 +11:00
Dhanji R. Prasanna	38fcaaf449	Add edge case tests for filter_json_tool_calls - test_brace_inside_json_string_value: braces inside JSON strings - test_multiple_braces_in_string: multiple braces in string values - test_escaped_quotes_with_braces: escaped quotes with braces - test_brace_in_string_across_chunks: streaming with braces in strings - test_complex_nested_with_string_braces: nested JSON with string braces - test_str_replace_with_diff_content: real-world str_replace case - test_tool_call_after_other_content: tool call after other output - test_tool_call_with_nested_tool_pattern_in_string: nested patterns All 27 tests pass.	2025-12-22 13:30:57 +11:00
Dhanji R. Prasanna	3bc254962c	clean up filter_json a bit (more to come)	2025-12-22 12:03:09 +11:00
Dhanji R. Prasanna	01a5284d6d	Move fixed_filter_json from g3-core to g3-cli Properly separates UI display concern from core library: - fixed_filter_json module now lives in g3-cli (UI layer) - UiWriter trait gains filter_json_tool_calls() and reset_json_filter() methods - g3-core delegates filtering to UI layer via trait methods - Different UiWriter implementations can choose their own filtering behavior - ConsoleUiWriter filters JSON tool calls for clean terminal display - MachineUiWriter/NullUiWriter use default pass-through Benefits: - Proper separation of concerns - Core stays clean without display-specific logic - Testability - filter can be tested independently in g3-cli	2025-12-22 10:32:21 +11:00
Dhanji R. Prasanna	fbf31e5f68	Fix continuation errors: auto-continue when final_output not called - Add final_output_called flag to track if LLM properly completed - Auto-continue with prompt if tools executed but final_output missing - Remove unused last_action_was_tool and any_text_response variables - Simplifies previous complex incomplete response detection logic	2025-12-20 15:32:12 +11:00
Dhanji R. Prasanna	e771382bd0	agent mode + fowler bot	2025-12-19 16:14:03 +11:00
Dhanji R. Prasanna	faa6512b1f	Revert to Safari as default WebDriver browser Chrome headless has too many issues: - Session creation hangs when Chrome is already running - Cloudflare and other bot protection blocks headless browsers - Version mismatch issues between Chrome and ChromeDriver Safari is more reliable for web automation on macOS. Chrome headless is still available via --chrome-headless flag.	2025-12-16 12:36:18 +11:00
Dhanji R. Prasanna	3d1b86d24b	Make Chrome headless the default WebDriver browser - Add --safari flag to CLI for explicitly choosing Safari - Update --chrome-headless flag description to indicate it's the default - Update README to reflect Chrome headless as default - Remove broken link to non-existent docs/webdriver-setup.md - Add Safari flag handling in all webdriver config locations The config already had ChromeHeadless as the default, this commit updates the CLI and documentation to match.	2025-12-15 16:51:42 +11:00
Jochen	87bceba54f	Fix planner UI whitespace and workspace logs directory Resolve two critical issues in planner mode that persisted through multiple fix attempts: 1. Remove excessive whitespace between tool call displays by replacing direct println!() calls with ui_writer methods and eliminating redundant newlines in agent response streaming. 2. Ensure all log files (errors, sessions, tool calls, context dumps) are written to <workspace>/logs instead of codepath by properly initializing G3_WORKSPACE_PATH from --workspace argument.	2025-12-10 16:18:49 +11:00
Jochen	75aa2d983e	Refine planner mode UI and error handling Improve planner mode user experience with better error reporting, cleaner tool output, and consistent log file placement. - Propagate and display classified LLM errors to users with appropriate icons and context - Display tool calls on single lines with truncated arguments - Show LLM text responses without overwriting via UiWriter - Ensure all logs write to workspace/logs directory consistently - Set G3_WORKSPACE_PATH early in planning mode initialization	2025-12-09 22:44:00 +11:00
Jochen	ff8b3e7c7b	Implement planning mode	2025-12-09 17:03:53 +11:00
Dhanji R. Prasanna	678403da35	add a force thinnify cmd	2025-12-05 15:32:13 +11:00
Jochen	0327a6dfdf	make sure coach feedback is extracted.	2025-12-02 22:00:58 +11:00
Jochen	928f2bfa9d	actually record coach feedback and use it	2025-12-02 21:23:50 +11:00
Dhanji R. Prasanna	d9ad244197	add markdown format only to final_output and fix todo duplication	2025-12-02 14:26:22 +11:00
Dhanji R. Prasanna	0e4c935a70	clean up TODO output	2025-12-02 06:48:58 +11:00
Jochen	6dcae1e3f4	fix use import	2025-11-28 10:21:06 +11:00
Jochen	0d504d6422	temporarily disable codebase_fast_start it seems the llm gets "lazy" and assumes all the tool calls meant it's done most of the work. I need to revise this approach.	2025-11-27 21:02:01 +11:00
Jochen	52f78653b4	add context window monitor Writes the current context window to logs/current_context_window (uses a symlink to a session ID). This PR was unfortunately generated by a different LLM and did a ton of superficial reformatting; it's actually a fairly small and benign change, but I don't want to roll back everything. Hope that's ok.	2025-11-27 21:00:02 +11:00
Jochen	7e1ce36a4b	Merge pull request #35 from dhanji/jochen_write_existing_file remove check for whether a file exists in the workspace	2025-11-27 13:44:45 +11:00
Jochen	9f6592efc2	remove redundant 'if'	2025-11-27 13:34:54 +11:00
Jochen	99125fc39e	completely remove the skipping first player logic	2025-11-27 13:21:40 +11:00
Jochen	c58aa80932	explain what file was found in workspace	2025-11-26 21:43:59 +11:00
Dhanji Prasanna	4cfa0147ca	first cut of horizontal partitioning # Conflicts: # Cargo.lock # Conflicts: # Cargo.lock # crates/g3-cli/src/lib.rs	2025-11-26 17:12:07 +11:00
Jochen	c19127f809	make sure user requirements are included	2025-11-26 10:26:52 +11:00
Jochen	2e252cd298	added timer	2025-11-25 22:51:33 +11:00
Jochen	ad198a8501	add code exploration fast start This tries to short-circuit multiple round-trips to llm for reading code. It's a precursor to trying to context engineer tailored to specific tasks. In initial experiments, it's only marginally faster than regular mode, and burns more tokens.	2025-11-25 22:51:32 +11:00
Jochen	551a577ee1	changed user choice for TODO stale check user can ignore, mark stale or quit.	2025-11-21 12:35:14 +11:00
Jochen	28a83d2dcf	check for stale TODOs on by default, can be disabled	2025-11-21 12:09:01 +11:00
Jochen	1069664e16	fix bad max_tokens and context_window logic for non-databricks code	2025-11-19 13:51:16 +11:00
Dhanji Prasanna	aaf918828f	g3 console initial cut + error doesn't kill auto	2025-11-07 09:27:13 +11:00
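
Note on d96d8c1d90: the "string-aware brace counting until balanced" step in that commit message can be sketched roughly as below. This is an illustrative sketch only, not the code in filter_json.rs; the function name find_balanced_end and its signature are hypothetical, and it assumes the whole input is available at once rather than arriving in chunks.

```rust
/// Returns the byte index just past the `}` that balances the first `{`
/// in `input`, or `None` if that object is never closed.
/// Hypothetical helper for illustration; not taken from filter_json.rs.
fn find_balanced_end(input: &str) -> Option<usize> {
    let mut depth: usize = 0; // number of currently open braces
    let mut in_string = false; // inside a JSON string literal?
    let mut escaped = false; // previous char was a backslash inside a string?
    let mut started = false; // have we seen the opening `{` yet?

    for (i, ch) in input.char_indices() {
        if in_string {
            // Inside a string, braces are plain data and must not be counted.
            if escaped {
                escaped = false;
            } else if ch == '\\' {
                escaped = true;
            } else if ch == '"' {
                in_string = false;
            }
            continue;
        }
        match ch {
            '"' if started => in_string = true,
            '{' => {
                depth += 1;
                started = true;
            }
            '}' if started => {
                depth -= 1;
                if depth == 0 {
                    return Some(i + ch.len_utf8());
                }
            }
            _ => {}
        }
    }
    None // truncated or unbalanced input
}
```

Tracking in_string and escaped is what keeps a } inside a string value from ending the scan early, which is the premature-exit case the commit message mentions. A None result corresponds to truncated JSON.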

273 Commits