alex/g3 - g3 - Millerson GIT hosting

Author	SHA1	Message	Date
Dhanji R. Prasanna	bebf04c7bd	Tighten system prompt	2026-01-09 14:11:19 +11:00
Dhanji R. Prasanna	d96d8c1d90	Rewrite JSON tool call filter with clean state machine Fixes bug where JSON tool calls were printed as text due to chunking issues. Changes: - Complete rewrite of filter_json.rs with 3-state machine: - Streaming: normal pass-through, watches for newline + whitespace + { - Buffering: confirms/denies tool pattern with ~20 char buffer - Suppressing: string-aware brace counting until balanced - Character-by-character processing eliminates chunk boundary issues - Proper handling of } inside JSON strings (was causing premature exit) - Detects truncated JSON followed by complete JSON (LLM retry case) - Removed regex dependency, simpler pattern matching - Added 59 stress tests covering malformed JSON, partial patterns, streaming edge cases, adversarial inputs, and real-world patterns All 86 filter_json tests pass.	2026-01-09 14:05:11 +11:00
Dhanji R. Prasanna	49b27b0cbc	fix: truncate long lines in streaming tool output to prevent terminal wrapping When shell commands output very long lines (e.g., JSON content from tail -c 10000), the lines would wrap in the terminal. The cursor-up escape code (\x1b[1A) only moves up one visual line, not the entire wrapped content, causing the display to fill with uncleared text. This fix truncates lines to 120 characters in update_tool_output_line() before displaying them, preventing the wrapping issue.	2026-01-09 13:35:58 +11:00
Dhanji R. Prasanna	67be0f20c7	fix: remove allow_multiple_tool_calls config and simplify tool execution flow This fixes a bug where the agent would stop responding abruptly without calling final_output. The root cause was the allow_multiple_tool_calls config option (default: false) which caused the agent to break out of the streaming loop mid-stream after executing the first tool, losing any subsequent content. Changes: - Remove allow_multiple_tool_calls config option entirely - Always process all tool calls without breaking mid-stream - Simplify system prompt generation (no longer needs boolean param) - Let the stream complete fully before continuing to next iteration - Change find_last_tool_call_start to find_first_tool_call_start - Remove parser.reset() call on duplicate detection Benefits: - Simpler logic with less conditional branching - No lost content after tool calls - Consistent behavior for all users - Reduced config complexity	2026-01-09 13:28:07 +11:00
Dhanji R. Prasanna	a72d5a650a	Fix two markdown formatting bugs Bug 1: Inline code after list bullets not detected - After emitting a list bullet, at_line_start was not set to false - This caused the next backtick to be treated as a potential code fence - Fixed by setting at_line_start = false after emitting bullet Bug 2: Code block closing on indented backticks - Code blocks containing indented ``` (4+ spaces) were closing prematurely - The .trim() check was too permissive - Fixed by only allowing closing fence with <= 3 spaces indent (CommonMark spec) Added tests for both edge cases.	2026-01-08 20:50:26 +11:00
Dhanji R. Prasanna	19a804e0be	Add syntax highlighting for Racket, Elisp, and Scheme Add language alias mapping in highlight_code() to map: - racket, rkt -> lisp - elisp, emacs-lisp -> lisp - scheme -> lisp - common-lisp, cl -> lisp - shell, sh, zsh, dockerfile -> bash Syntect's built-in Lisp syntax handles all Lisp-family languages well. Added test to verify the aliases work correctly.	2026-01-08 20:35:34 +11:00
Dhanji R. Prasanna	df706308ca	Unify final_output rendering with streaming markdown formatter Replace the separate syntax_highlight module with the streaming markdown formatter for final_output rendering. This: - Removes special buffered rendering logic for final_output - Uses the same StreamingMarkdownFormatter used for agent responses - Removes the spinner animation (content renders immediately) - Deletes the now-unused syntax_highlight.rs module - Updates test to use the streaming formatter Benefits: - Consistent rendering across all markdown output - Less code to maintain (removed ~250 lines) - Same syntax highlighting via syntect (already in streaming formatter)	2026-01-08 20:30:44 +11:00
Dhanji R. Prasanna	347513b04c	Add comprehensive stress tests for streaming markdown formatter Add 10 stress tests covering: - Nested formatting (bold in italic, italic in bold) - Empty/minimal content edge cases - Escape sequences and special characters - Lists with complex inline formatting - Links with various content types - Tables with formatting in cells - Code blocks (should not format contents) - Mixed block elements (headers, quotes, rules) - Nested lists (3+ levels, mixed types) - Pathological/adversarial inputs (unbalanced delimiters, unicode, long lines) All 45 tests pass.	2026-01-08 20:27:28 +11:00
Dhanji R. Prasanna	fadfaee040	update gitignore	2026-01-08 13:50:03 +11:00
Dhanji R. Prasanna	381b852869	refactor(g3-core): Extract streaming utilities into dedicated module Extract reusable utilities from the massive stream_completion_with_tools function into a new streaming.rs module for improved readability: - format_duration, format_timing_footer: timing display helpers - clean_llm_tokens: consolidates 4 duplicate token-cleaning call sites - log_stream_error: extracts 70+ lines of error logging - is_empty_response, is_connection_error: predicate helpers - truncate_for_display, truncate_line: string truncation utilities - StreamingState, IterationState: state structs for future refactoring Results: - lib.rs reduced from 2978 to 2840 lines (138 lines, ~5%) - New streaming.rs: 309 lines with 5 unit tests - All 98+ tests pass Agent: carmack	2026-01-08 13:20:11 +11:00
Dhanji R. Prasanna	267ef00848	refactor: extract session helper in webdriver.rs to reduce boilerplate Agent: carmack Add get_session() helper function that: - Checks if webdriver is enabled - Acquires the session read lock - Returns the cloned session or an error message Refactored 12 webdriver tool functions to use this helper: - execute_webdriver_navigate - execute_webdriver_get_url - execute_webdriver_get_title - execute_webdriver_find_element - execute_webdriver_find_elements - execute_webdriver_click - execute_webdriver_send_keys - execute_webdriver_execute_script - execute_webdriver_get_page_source - execute_webdriver_screenshot - execute_webdriver_back - execute_webdriver_forward - execute_webdriver_refresh Each function previously had ~10 lines of identical boilerplate. Now reduced to 4 lines using the helper. Net reduction: 68 lines (678 -> 610) All tests pass. Behavior unchanged.	2026-01-08 13:05:44 +11:00
Dhanji R. Prasanna	5bfaee8dd5	use consistent naming for compaction	2026-01-08 12:54:03 +11:00
Dhanji R. Prasanna	3776ed847e	refactor: use shared streaming helpers in openai and embedded providers Agent: carmack openai.rs: - Use make_text_chunk() for streaming text content - Use make_final_chunk() for final completion chunk - Simplify tool_calls conversion logic embedded.rs: - Use make_text_chunk() for all 4 streaming text chunks - Use make_final_chunk() for final completion chunk - Remove unused CompletionChunk import Net reduction: 35 lines removed All tests pass. Behavior unchanged.	2026-01-07 13:01:03 +11:00
Dhanji R. Prasanna	2bf475960c	refactor: extract shared streaming utilities module Agent: carmack Create crates/g3-providers/src/streaming.rs with shared helpers: - decode_utf8_streaming(): Handle incomplete UTF-8 sequences in SSE streams - is_incomplete_json_error(): Detect incomplete vs malformed JSON - make_final_chunk(): Create finished completion chunks - make_text_chunk(): Create text content chunks - make_tool_chunk(): Create tool call chunks Refactor anthropic.rs: - Use shared decode_utf8_streaming (removes 15 lines of inline UTF-8 handling) - Use make_final_chunk, make_text_chunk, make_tool_chunk helpers - Reduces verbose CompletionChunk constructions throughout Refactor databricks.rs: - Remove local copies of streaming helpers (now uses shared module) - Reduces duplication between providers Net reduction: 118 lines removed, 16 lines added (including new module) All tests pass. Behavior unchanged.	2026-01-07 12:48:07 +11:00
Dhanji R. Prasanna	bb63050779	refactor: improve readability of streaming and file ops code Agent: carmack databricks.rs: - Extract ToolCallAccumulator struct to replace opaque (String, String, String) tuple - Add decode_utf8_streaming() helper for cleaner UTF-8 handling - Add is_incomplete_json_error() helper for JSON parse error detection - Add make_final_chunk() helper to reduce duplication - Add finalize_tool_calls() to convert accumulators to final format - Refactor parse_streaming_response from ~270 lines to ~100 lines - Reduce nesting depth from 8+ levels to 4 levels - Use early returns and let-else for cleaner control flow file_ops.rs: - Replace repetitive if-let chains with declarative PATH_CONTENT_KEYS table - Use match expression instead of nested if-else - Reduce extract_path_and_content from 44 lines to 20 lines All tests pass. Behavior unchanged.	2026-01-07 12:39:05 +11:00
Dhanji R. Prasanna	532ed132f7	Few shot prompts for carmack	2026-01-07 12:33:11 +11:00
Dhanji R. Prasanna	4e7aca50fa	feat: royal blue tool names in agent mode + fix README heading display - Add set_agent_mode() to UiWriter trait for visual mode differentiation - ConsoleUiWriter uses royal blue (ANSI 256 color 69) for tool names in agent mode - Fix extract_readme_heading() to search only README section of combined content (was incorrectly showing AGENTS.md heading instead of README heading)	2026-01-07 11:37:51 +11:00
Dhanji R. Prasanna	189fdec006	Carmack agent	2026-01-07 11:18:27 +11:00
Dhanji R. Prasanna	1980e62511	Improve code readability in g3-core - streaming_parser.rs: Rename has_message_like_keys to args_contain_prose_fragments with improved documentation explaining the heuristic for detecting malformed tool calls where LLM prose leaked into JSON keys - context_window.rs: Simplify build_thin_result_message using early return pattern and match expression for cleaner control flow Agent: carmack	2026-01-07 11:16:42 +11:00
Dhanji R. Prasanna	2e9535974d	removed testing craft	2026-01-07 10:46:37 +11:00
Dhanji R. Prasanna	775bcd10a5	chore: remove g3-console crate entirely The g3-console crate was not referenced by any other crate in the workspace and appears to be an abandoned web console implementation. Removed: - crates/g3-console/ (entire directory) - Workspace member entry in Cargo.toml Agent: fowler	2026-01-07 10:41:46 +11:00
Dhanji R. Prasanna	1056b4193b	chore(g3-cli): remove orphaned retro_tui and tui modules These files were not referenced anywhere in the codebase and appear to be leftover from a previous TUI implementation that was abandoned. Removed: - crates/g3-cli/src/retro_tui.rs (62KB) - crates/g3-cli/src/tui.rs (6KB) Agent: fowler	2026-01-07 10:39:42 +11:00
Dhanji R. Prasanna	48036d01e3	fix(g3-core): disable auto-continue in interactive mode Auto-continue was incorrectly triggering when the LLM asked questions in interactive/chat mode. Now auto-continue only activates when is_autonomous is true, allowing proper back-and-forth conversation in interactive mode. Agent: fowler	2026-01-07 10:37:30 +11:00
Dhanji R. Prasanna	a553764e93	docs(agents): add git authorship rule to all agent prompts Ensure agents never override git author/email and instead put their identity in the commit message body. Agent: fowler	2026-01-07 10:27:44 +11:00
Dhanji R. Prasanna	b73dfacb7a	refactor(g3-core): extract provider_registration and session modules Extract two focused modules from the monolithic lib.rs (3372 lines): 1. provider_registration.rs (233 lines) - Consolidates duplicated provider registration patterns - Single determine_providers_to_register() function for mode-based selection - Unified register_providers() async function for all provider types - Includes unit tests for registration logic 2. session.rs (394 lines) - Session ID generation (generate_session_id) - Context window persistence (save_context_window, write_context_window_summary) - Error logging (log_error_to_session) - Utility functions (format_token_count, token_indicator) - Session restoration helper (restore_from_session_log) - Includes comprehensive unit tests Also fixes: - Removed redundant tool_executed assignment that triggered unused warning - Removed unused Message import in session.rs Results: - lib.rs reduced from 3372 to 2976 lines (-396 lines, -11.7%) - All tests pass, no warnings - Behavior preserved (pure mechanical extraction) Agent: fowler	2026-01-07 10:20:28 +11:00
Dhanji R. Prasanna	c4ae85de72	Add --new-session flag to skip session resumption in agent mode Adds a new CLI flag that allows users to force a new session when running in agent mode, bypassing the automatic detection and resumption of incomplete sessions. Usage: g3 --agent my-agent --new-session	2026-01-07 09:59:15 +11:00
Dhanji R. Prasanna	f0bd7959b1	chore(analysis): update dependency analysis artifacts Authored by: Structural Analysis Agent (Euler) Updated all dependency analysis artifacts with fresh extraction: - graph.json: Canonical dependency graph with 10 crates, 139 files, 16 crate edges, 72 file edges - graph.summary.md: Overview with fan-in/fan-out rankings and crate inventory - sccs.md: SCC analysis confirming no cycles at crate or file level (clean DAG) - layers.observed.md: 5-layer architecture diagram derived from dependencies - hotspots.md: Coupling hotspots (g3-config highest fan-in, g3-cli highest fan-out) - limitations.md: Documented extraction limitations (conditional compilation, macros, etc.) Key findings: - All 10 workspace crates form a directed acyclic graph - g3-core/src/ui_writer.rs has highest file-level fan-in (10 dependents) - g3-console is standalone with no workspace dependencies - Clean layered architecture with no violations detected	2026-01-07 09:36:52 +11:00
Dhanji R. Prasanna	ff08a622eb	ask all agents to commit their work	2026-01-07 09:31:02 +11:00
Dhanji R. Prasanna	5d20da2609	Add 54 integration tests for CLI, tools, and message serialization New test files: - crates/g3-cli/tests/cli_integration_test.rs (14 tests) Blackbox CLI tests: help/version flags, argument validation, conflicting modes, flock mode requirements - crates/g3-core/tests/tool_execution_test.rs (20 tests) Tool call structure tests and unified diff application: read_file, write_file, str_replace, shell, background_process, todo, final_output, code_search, take_screenshot - crates/g3-providers/tests/message_serialization_test.rs (20 tests) Round-trip serialization tests for Message, MessageRole, CacheControl, and Tool types. Covers Unicode, special chars, and edge cases. All tests follow blackbox/integration-first principles with documentation of what they protect and intentionally do not assert.	2026-01-07 09:23:34 +11:00
Dhanji R. Prasanna	9cb6282719	update lamport	2026-01-07 09:07:29 +11:00
Dhanji R. Prasanna	311b3bd75a	added hopper testing agent and updated fowler to use euler	2026-01-07 09:06:46 +11:00
Dhanji R. Prasanna	e2445a5d22	refactor(g3-core): extract duplicate detection helper and consolidate thinning - Extract check_duplicate_in_previous_message() helper to reduce nesting from 6+ levels to 2 levels in stream_completion_with_tools - Create do_thin_context() and do_thin_context_all() helpers to centralize context thinning with event tracking - Use provider_config::parse_provider_ref() in additional call sites - All 295 tests pass This continues the refactoring to eliminate code-path aliasing and reduce cyclomatic complexity in the Agent implementation.	2026-01-07 08:45:51 +11:00
Dhanji R. Prasanna	a87928661d	Remove overly broad .json from .gitignore The blanket .json ignore is not canonical for Rust projects. JSON files that need ignoring are already covered by: - .g3/ for session logs - logs/ for error logs - .build for Swift build artifacts	2026-01-06 13:54:27 +11:00
Dhanji R. Prasanna	2d8e733820	Add dependency graph JSON data Add exception to .gitignore for analysis/deps/graph.json	2026-01-06 13:24:01 +11:00
Dhanji R. Prasanna	6d6aed563d	Add structural dependency analysis artifacts - graph.json: Canonical dependency graph (10 crates, 16 edges, 76 files) - graph.summary.md: One-page overview with fan-in/fan-out rankings - sccs.md: Strongly Connected Components analysis (no cycles) - layers.observed.md: 5-layer architecture diagram - hotspots.md: Coupling hotspots (g3-config, g3-cli) - limitations.md: Extraction limitations and validity conditions	2026-01-06 13:23:24 +11:00
Dhanji R. Prasanna	764d1bf67e	Add ./tmp/ to .gitignore	2026-01-06 12:50:14 +11:00
Dhanji R. Prasanna	2592fee5d5	Generalize lamport.md examples to be language-agnostic - Changed Rust-specific examples to generic ones: - 'Tool calls must be valid JSON' → 'API responses must be valid JSON' - 'Never block the async runtime' → 'Never block the event loop' - 'Crate/module' → 'Module/package' - 'run cargo test' → 'basic commands'	2026-01-06 12:49:00 +11:00
Dhanji R. Prasanna	e2fffaab94	Slim down AGENTS.md and update lamport.md for machine-specific output AGENTS.md changes: - Removed redundant sections that duplicated README.md: - System Overview (crate table) - File Structure Quick Reference - Testing Strategy - Pointers to Documentation - Architecture Decisions - Kept unique machine-specific sections: - Critical Invariants (merged Performance Constraints) - Recommended Entry Points - Dangerous/Subtle Code Paths - Do's and Don'ts for Automated Changes - Common Incorrect Assumptions - Dependency Analysis Artifacts - Reduced from ~220 lines to ~116 lines lamport.md changes: - Rewrote AGENTS.md section with explicit instructions - Added REQUIRED sections list (5 sections only) - Added DO NOT include list to prevent README duplication - AGENTS.md now points to README for architecture/usage	2026-01-06 12:46:40 +11:00
Dhanji R. Prasanna	6d2cab93f5	Extend euler.md to require AGENTS.md updates The Euler agent must now update AGENTS.md after generating artifacts: - Add/update 'Dependency Analysis Artifacts' section - Table listing each file in analysis/deps/ with one-line descriptions - No findings, metrics, or recommendations in AGENTS.md	2026-01-06 12:35:12 +11:00
Dhanji R. Prasanna	9132c441f1	Remove Key findings section from dependency analysis docs	2026-01-06 12:33:48 +11:00
Dhanji R. Prasanna	d695f10604	Document dependency analysis artifacts in AGENTS.md Added section explaining the analysis/deps/ directory contents: - graph.json: Raw dependency graph data - graph.summary.md: Overview metrics and rankings - sccs.md: Cycle detection results - layers.observed.md: Layer diagrams - hotspots.md: Coupling hotspots - limitations.md: Analysis limitations Includes key findings from the Euler agent's static analysis.	2026-01-06 12:31:17 +11:00
Dhanji R. Prasanna	386176899e	Remove vision tools (except take_screenshot) and macax tools Vision tools removed: - extract_text (OCR from image files) - extract_text_with_boxes (OCR with bounding boxes) - vision_find_text (find text in app windows) - vision_click_text (find and click on text) - vision_click_near_text (click near text labels) macax tools removed: - macax_list_apps - macax_get_frontmost_app - macax_activate_app - macax_press_key - macax_type_text The LLM can now read images directly via read_image tool. take_screenshot is retained for capturing application windows. Files deleted: - crates/g3-core/src/tools/vision.rs - crates/g3-core/src/tools/macax.rs - docs/macax-tools.md Updated tool counts: 12 core + 15 webdriver = 27 total	2026-01-03 17:38:25 +11:00
Dhanji R. Prasanna	29e263ac49	Fix Unicode space handling in macOS screenshot filenames macOS uses U+202F (Narrow No-Break Space) in screenshot filenames between the time and am/pm. When users type or paste these paths, they use regular spaces, causing file-not-found errors. Changes: - Add resolve_path_with_unicode_fallback() to try U+202F variants - Add resolve_paths_in_shell_command() for shell command paths - Apply fix to read_file, read_image, and shell tools - Fix read_image prompt docs: file_path -> file_paths (array) - Add 6 unit tests for Unicode space normalization	2026-01-03 17:17:08 +11:00
Dhanji R. Prasanna	f7e2f38fe9	lamport run	2026-01-03 16:48:30 +11:00
Dhanji R. Prasanna	f4a1bf5e93	fix agent-mode session resumption bug	2026-01-03 16:44:58 +11:00
Dhanji R. Prasanna	76bfb77f84	further fowler fixes and session fixes	2026-01-03 15:47:04 +11:00
Dhanji R. Prasanna	65867e7f96	refactor tools out of lib.rs	2026-01-03 15:06:34 +11:00
Dhanji R. Prasanna	595ad6ad21	agent mode resumption	2026-01-03 14:50:08 +11:00
Dhanji R. Prasanna	016efc1db6	Prevent agent mode from stopping after first TODO phase - Add TODO completion check to final_output tool in autonomous mode only - When incomplete TODO items exist, reject final_output and prompt LLM to continue - Non-autonomous modes (interactive, chat) are unaffected - Add 6 tests verifying behavior in both autonomous and non-autonomous modes Fixes issue where LLM would call final_output after completing first phase, causing agent to stop prematurely instead of continuing with remaining phases.	2025-12-27 12:35:31 +11:00
Dhanji R. Prasanna	8d071d5eed	fix: fowler agent now respects --workspace flag and reads project docs - Fixed run_agent_mode to call std::env::set_current_dir with workspace_dir - Updated fowler.md to read README.md and AGENTS.md as part of Triage & Understanding step	2025-12-26 15:24:20 +11:00
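
Commit d96d8c1d90 above describes a Suppressing state that uses "string-aware brace counting until balanced", so that a } inside a JSON string no longer ends suppression early. The following is a minimal sketch of that idea only, not the code in filter_json.rs: the function name find_balanced_end and its signature are hypothetical, and it scans a complete string slice rather than a character stream.

```rust
/// Hypothetical helper (not taken from the g3 repository).
/// Returns the byte index just past the `}` that balances the first `{`
/// in `input`, or `None` if the object is still incomplete.
/// Braces that appear inside JSON string literals are not counted.
fn find_balanced_end(input: &str) -> Option<usize> {
    let mut depth: usize = 0; // current nesting depth of `{ ... }`
    let mut started = false; // true once the first `{` has been seen
    let mut in_string = false; // true while inside a "..." literal
    let mut escaped = false; // true right after a backslash in a string

    for (i, ch) in input.char_indices() {
        if in_string {
            if escaped {
                // The character after a backslash never closes the string.
                escaped = false;
            } else if ch == '\\' {
                escaped = true;
            } else if ch == '"' {
                in_string = false;
            }
            continue;
        }
        match ch {
            '"' if started => in_string = true,
            '{' => {
                depth += 1;
                started = true;
            }
            '}' if started => {
                // depth is at least 1 here, so this cannot underflow.
                depth -= 1;
                if depth == 0 {
                    return Some(i + ch.len_utf8());
                }
            }
            _ => {}
        }
    }
    None // ran out of input before the braces balanced
}
```

A caller could drop everything up to the returned index and resume normal pass-through from there. A `None` result corresponds to the truncated-JSON case the commit mentions, where more input is needed before a decision can be made.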

427 Commits