Commit Graph

137 Commits

Author SHA1 Message Date
Dhanji R. Prasanna
08747595a1 Add fancy ASCII art header for agent mode
The agent mode header now shows:
- Agent name in uppercase with box art
- Working directory (truncated if too long)
- Status indicators for README, AGENTS.md, and Memory loading
- Task preview if provided

Also exports truncate_for_display and adds truncate_path_for_display
helper functions in project_files module.
2026-01-11 17:11:14 +05:30
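The truncation helpers named in the commit above could look roughly like the sketch below. Only the names truncate_for_display and truncate_path_for_display come from the commit message; the signatures, the "..." marker, and the choice to keep the tail of a path are assumptions for illustration, not the actual project_files code.

```rust
/// Shorten `text` to at most `max_chars` characters, ending in "..." when cut.
/// Hypothetical signature; the real helper may differ.
pub fn truncate_for_display(text: &str, max_chars: usize) -> String {
    // Count chars rather than bytes so multi-byte UTF-8 text is never split.
    if text.chars().count() <= max_chars {
        return text.to_string();
    }
    // Reserve three characters for the ellipsis.
    let keep = max_chars.saturating_sub(3);
    let mut out: String = text.chars().take(keep).collect();
    out.push_str("...");
    out
}

/// Shorten a path from the left so the most specific components stay visible.
/// Hypothetical signature; keeping the tail is an assumption.
pub fn truncate_path_for_display(path: &str, max_chars: usize) -> String {
    let len = path.chars().count();
    if len <= max_chars {
        return path.to_string();
    }
    let keep = max_chars.saturating_sub(3);
    // Skip the leading characters and keep only the last `keep` of them.
    let tail: String = path.chars().skip(len - keep).collect();
    format!("...{}", tail)
}
```

With these assumptions, truncate_for_display("hello world", 8) gives "hello..." and truncate_path_for_display("/home/user/projects/g3", 10) gives "...ects/g3". For limits below three characters the sketch still emits the full "..." marker.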
Dhanji R. Prasanna
cf3727f50d refactor(g3-cli): Extract focused modules from lib.rs for improved readability
Extract three cohesive modules from the monolithic lib.rs (3188 -> 2785 lines):

- metrics.rs (147 lines): Turn metrics tracking and histogram generation
  - TurnMetrics struct
  - format_elapsed_time() for human-readable durations
  - generate_turn_histogram() for performance visualization
  - Added unit tests for core functions

- project_files.rs (181 lines): Project file reading utilities
  - read_agents_config() for AGENTS.md loading
  - read_project_readme() for README detection
  - read_project_memory() for .g3/memory.md
  - extract_readme_heading() for display
  - Added unit tests

- coach_feedback.rs (129 lines): Coach feedback extraction from session logs
  - extract_from_logs() main entry point
  - Helper functions for log parsing and text extraction

All modules have clear single responsibilities, improved documentation,
and maintain identical behavior to the original inline functions.

Agent: carmack
2026-01-11 16:41:41 +05:30
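As an illustration of the human-readable durations mentioned for metrics.rs above, here is a minimal sketch. The function name format_elapsed_time is from the commit message; taking a std::time::Duration and the exact output format ("250ms", "1.5s", "2m 5s") are assumptions, and the real implementation may format differently.

```rust
use std::time::Duration;

/// Render a duration compactly for terminal output.
/// Hypothetical signature and format; only the name appears in the commit.
pub fn format_elapsed_time(elapsed: Duration) -> String {
    let total_ms = elapsed.as_millis();
    if total_ms < 1_000 {
        // Sub-second: whole milliseconds.
        return format!("{}ms", total_ms);
    }
    let total_secs = elapsed.as_secs();
    if total_secs < 60 {
        // Under a minute: seconds with one decimal place.
        return format!("{:.1}s", elapsed.as_secs_f64());
    }
    let hours = total_secs / 3600;
    let minutes = (total_secs % 3600) / 60;
    let seconds = total_secs % 60;
    if hours > 0 {
        format!("{}h {}m {}s", hours, minutes, seconds)
    } else {
        format!("{}m {}s", minutes, seconds)
    }
}
```

Under these assumptions, 125 seconds renders as "2m 5s" and 3725 seconds as "1h 2m 5s".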
Dhanji R. Prasanna
9c71d12561 style: change agent mode tool color from royal blue to light gray 2026-01-11 16:26:20 +05:30
Dhanji R. Prasanna
74a18794a0 fix: load AGENTS.md and memory in agent mode
Agent mode was only loading README.md but not AGENTS.md or project
memory (.g3/memory.md). This meant agents were missing important
context that normal mode had access to.

Now agent mode uses the same read_agents_config(), read_project_readme(),
and read_project_memory() functions as normal mode, combining all three
into the agent context.
2026-01-11 16:15:58 +05:30
Dhanji R. Prasanna
1d884251cb refactor(cli): remove duplicate agent mode check in run()
The same if-let block checking for agent mode was duplicated,
causing dead code on the second check. Removed the duplicate.

Agent: fowler
2026-01-11 16:14:50 +05:30
Dhanji R. Prasanna
cfd5d69cce refactor: auto-enable auto-memory in agent mode
Simplify auto-memory by always enabling it in agent mode instead of
requiring the --auto-memory flag. This makes sense because:
- Agent mode is non-interactive, so blocking is acceptable
- Agents benefit from automatically saving discoveries to memory
- Reduces flag complexity for users

The --auto-memory flag still works for other modes if desired.
2026-01-11 15:56:27 +05:30
Dhanji R. Prasanna
1575cafc4b fix: add --auto-memory support to agent mode
The --auto-memory flag was not being passed to run_agent_mode() and
send_auto_memory_reminder() was not being called after agent task
execution.

Changes:
- Pass auto_memory parameter to run_agent_mode()
- Add auto_memory parameter to run_agent_mode() function signature
- Call agent.set_auto_memory(true) when flag is enabled
- Call send_auto_memory_reminder() after execute_task() in agent mode
2026-01-11 08:03:46 +08:00
Dhanji R. Prasanna
280ae1fcbb feat: add --auto-memory flag to prompt LLM to save discoveries
Adds a new --auto-memory CLI flag that automatically sends a reminder
to the LLM after each turn where tools were called, prompting it to
call the remember tool if it discovered any key code locations.

Changes:
- Add auto_memory field and set_auto_memory() method to Agent
- Add tool_calls_this_turn tracking in execute_tool_in_dir()
- Add send_auto_memory_reminder() that sends reminder after tool use
- Add --auto-memory CLI flag and wire it up in console/machine modes
- Call send_auto_memory_reminder() in single-shot and interactive modes
- Add visible status messages for auto-memory actions

Fixes bug where tool calls were not being tracked when execute_tool_in_dir
was called directly with working_dir=None.
2026-01-11 08:00:51 +08:00
Dhanji R. Prasanna
33c1aba86e Show human-readable descriptions in /resume session list
- Add description field to SessionContinuation struct
- Extract first user message (truncated to ~60 chars at word boundary)
- Display as quoted text instead of session ID hash
- Fall back to session ID if no description available

Example: [2 hours ago] 'when I call /resume it only shows me 2 sessions...'
2026-01-11 06:22:20 +08:00
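The word-boundary truncation described in the commit above can be sketched as follows. The function name describe_session and its parameters are hypothetical; the commit only states that the first user message is cut to roughly 60 characters at a word boundary.

```rust
/// Build a short description from the first user message.
/// Hypothetical helper; the name and signature are not from the commit.
pub fn describe_session(first_message: &str, max_chars: usize) -> String {
    let trimmed = first_message.trim();
    if trimmed.chars().count() <= max_chars {
        return trimmed.to_string();
    }
    // Take the first `max_chars` characters (char-based, so UTF-8 safe).
    let head: String = trimmed.chars().take(max_chars).collect();
    // Cut back to the last whitespace inside that window so no word is split;
    // if the window holds a single long word, fall back to a hard cut.
    let cut = match head.rfind(char::is_whitespace) {
        Some(idx) if idx > 0 => &head[..idx],
        _ => head.as_str(),
    };
    format!("{}...", cut.trim_end())
}
```

For example, describe_session("hello wonderful world", 12) yields "hello..." because the 12-character window "hello wonder" is cut back to the last complete word.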
Dhanji R. Prasanna
3fcef587e8 Fix /resume to show all sessions and use human-readable timestamps
- Change run_autonomous to return Agent instead of () so session
  continuation is properly saved in accumulative mode
- Update format_session_time to show relative times ("2 hours ago",
  "yesterday") for recent sessions and dates for older ones
- Handle Ctrl+C cancellation gracefully with informative message
2026-01-11 06:13:27 +08:00
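A rough sketch of the relative-time formatting described above, assuming the caller has already computed the session age in seconds. The name format_relative, the thresholds, and treating 24 to 48 hours as "yesterday" are assumptions; the actual format_session_time may use calendar days and different cutoffs. Returning None signals that the session is old enough that the caller should print an absolute date instead.

```rust
/// Format "N units ago" with simple pluralisation.
fn ago(n: u64, unit: &str) -> String {
    if n == 1 {
        format!("1 {} ago", unit)
    } else {
        format!("{} {}s ago", n, unit)
    }
}

/// Hypothetical helper: relative label for recent sessions, None for older ones.
pub fn format_relative(age_secs: u64) -> Option<String> {
    const MINUTE: u64 = 60;
    const HOUR: u64 = 60 * MINUTE;
    const DAY: u64 = 24 * HOUR;
    match age_secs {
        s if s < MINUTE => Some("just now".to_string()),
        s if s < HOUR => Some(ago(s / MINUTE, "minute")),
        s if s < DAY => Some(ago(s / HOUR, "hour")),
        // Approximation: anything between 24 and 48 hours old.
        s if s < 2 * DAY => Some("yesterday".to_string()),
        s if s < 7 * DAY => Some(ago(s / DAY, "day")),
        // Older than a week: let the caller format a date.
        _ => None,
    }
}
```

A session that is 7200 seconds old is labelled "2 hours ago", matching the style of the example in the commit message.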
Dhanji R. Prasanna
8926775acb Add session continuation symlink fix and /resume command
Fix session detection:
- Add save_session_continuation() calls at all session exit points
- Sessions now properly create .g3/session symlink for resume detection
- Fixes issue where g3 wasn't offering to resume previous sessions

Add /resume command:
- New list_sessions_for_directory() to scan available sessions
- New switch_to_session() method to safely switch between sessions
- Shows numbered list with timestamps, context %, and TODO status
- Saves current session before switching (can be resumed later)
- Restores full context if <80% used, otherwise uses summary
- Machine mode supports /resume and /resume <number>

Documentation:
- Add /clear and /resume to CONTROL_COMMANDS.md
- Update /help output with new commands
2026-01-11 05:30:58 +08:00
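The restore rule in the commit above (full context below 80% usage, summary otherwise) amounts to a small decision function. The sketch below is illustrative only: RestoreMode, choose_restore_mode, and the token-count parameters are hypothetical names, and only the 80% threshold comes from the commit message.

```rust
/// How a resumed session's history is brought back (hypothetical type).
#[derive(Debug, PartialEq)]
pub enum RestoreMode {
    FullContext,
    Summary,
}

/// Pick the restore strategy from context-window usage.
/// Only the 80% threshold is taken from the commit message.
pub fn choose_restore_mode(used_tokens: u32, total_tokens: u32) -> RestoreMode {
    if total_tokens == 0 {
        // No usable window size: fall back to the cheaper option.
        return RestoreMode::Summary;
    }
    let percent_used = f64::from(used_tokens) / f64::from(total_tokens) * 100.0;
    if percent_used < 80.0 {
        RestoreMode::FullContext
    } else {
        RestoreMode::Summary
    }
}
```

Note that exactly 80% falls on the summary side, since the commit says "<80%".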
Dhanji R. Prasanna
9bef7753bf Add Chrome headless diagnostic tool
Runs automatically when --chrome-headless flag is used, checking:
- ChromeDriver installation and PATH
- Chrome/Chromium installation
- Chrome and ChromeDriver version compatibility
- config.toml chrome_binary setting
- Chrome for Testing installation
- ChromeDriver executable permissions (macOS quarantine)

Displays a detailed report with:
- Summary of detected versions and paths
- Pass/warning/error status for each check
- Specific fix suggestions for any issues found

Users can then ask g3 to help fix any detected issues.
2026-01-10 20:44:23 +11:00
Dhanji R. Prasanna
ea582766ba chrome-headless flag 2026-01-10 16:14:14 +11:00
Dhanji R. Prasanna
0aa1287ca6 Remove final_output tool and improve scout report handback
final_output removal:
- Remove final_output from tool definitions and dispatch
- Update system prompts to request summaries as regular text
- Remove final_output_called field from StreamingState
- Update auto_continue tests to remove final_output_called parameter
- Remove final_output test from tool_execution_test.rs
- Update planner and flock prompts to not reference final_output
- Keep backwards-compat code in feedback_extraction.rs and task_result.rs

Scout report handback:
- Change from file-based to delimiter-based report extraction
- Scout outputs report between ---SCOUT_REPORT_START/END--- markers
- Research tool extracts content between markers, strips ANSI codes
- Add comprehensive tests for extraction and ANSI stripping

657 tests pass.
2026-01-10 13:43:04 +11:00
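The delimiter-based handback described above can be sketched like this. The marker strings are read from the commit message as ---SCOUT_REPORT_START--- and ---SCOUT_REPORT_END---; the function names and the ANSI handling (only CSI sequences of the form ESC [ ... final byte) are assumptions, not the research tool's actual code.

```rust
// Marker strings as implied by the commit message (assumed exact spelling).
const REPORT_START: &str = "---SCOUT_REPORT_START---";
const REPORT_END: &str = "---SCOUT_REPORT_END---";

/// Remove ANSI CSI escape sequences such as colour codes.
/// Other escape types (for example OSC) are left untouched in this sketch.
pub fn strip_ansi(input: &str) -> String {
    let mut out = String::with_capacity(input.len());
    let mut chars = input.chars().peekable();
    while let Some(c) = chars.next() {
        if c == '\u{1b}' && chars.peek() == Some(&'[') {
            chars.next(); // consume '['
            // Skip parameter bytes until the final byte (0x40..=0x7E).
            for n in chars.by_ref() {
                if ('\u{40}'..='\u{7e}').contains(&n) {
                    break;
                }
            }
        } else {
            out.push(c);
        }
    }
    out
}

/// Return the trimmed text between the two markers, or None if either is missing.
pub fn extract_scout_report(output: &str) -> Option<String> {
    let clean = strip_ansi(output);
    let start = clean.find(REPORT_START)? + REPORT_START.len();
    let end = start + clean[start..].find(REPORT_END)?;
    Some(clean[start..end].trim().to_string())
}
```

Stripping escape codes before searching means a marker that happens to be wrapped in colour codes is still found.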
Dhanji R. Prasanna
c88ffa2431 Remove final_output tool, improve scout agent
- Remove final_output tool to allow LLM responses to stream naturally
- Update system prompts to request summaries instead of tool calls
- Rename final_output_summary to summary in session continuation
- Update tool count tests (12→11 core tools, 27→26 total)
- Delete obsolete final_output tests

Scout agent improvements:
- Simplify WebDriver usage instructions
- Prefer DuckDuckGo/Brave/Bing over Google
- Support passing task directly to agent mode
- Suppress completion message for scout (needs clean output for research tool)
2026-01-09 20:30:00 +11:00
Dhanji R. Prasanna
777191b3cb Remove final_output tool - let summaries stream naturally
- Remove final_output from tool definitions, dispatch, and misc tools
- Update system prompts to request summaries as regular markdown text
- Remove print_final_output from UiWriter trait and all implementations
- Remove final_output handling from agent core logic
- Rename final_output_summary → summary in session continuation
- Delete final_output test files
- Update tool count tests (12→11, 27→26)

This allows LLM summaries to stream through the markdown formatter
for a more natural, responsive user experience instead of buffering
everything into a tool call.
2026-01-09 14:57:24 +11:00
Dhanji R. Prasanna
67be0f20c7 fix: remove allow_multiple_tool_calls config and simplify tool execution flow
This fixes a bug where the agent would stop responding abruptly without
calling final_output. The root cause was the allow_multiple_tool_calls
config option (default: false) which caused the agent to break out of
the streaming loop mid-stream after executing the first tool, losing
any subsequent content.

Changes:
- Remove allow_multiple_tool_calls config option entirely
- Always process all tool calls without breaking mid-stream
- Simplify system prompt generation (no longer needs boolean param)
- Let the stream complete fully before continuing to next iteration
- Change find_last_tool_call_start to find_first_tool_call_start
- Remove parser.reset() call on duplicate detection

Benefits:
- Simpler logic with less conditional branching
- No lost content after tool calls
- Consistent behavior for all users
- Reduced config complexity
2026-01-09 13:28:07 +11:00
Dhanji R. Prasanna
df706308ca Unify final_output rendering with streaming markdown formatter
Replace the separate syntax_highlight module with the streaming markdown
formatter for final_output rendering. This:

- Removes special buffered rendering logic for final_output
- Uses the same StreamingMarkdownFormatter used for agent responses
- Removes the spinner animation (content renders immediately)
- Deletes the now-unused syntax_highlight.rs module
- Updates test to use the streaming formatter

Benefits:
- Consistent rendering across all markdown output
- Less code to maintain (removed ~250 lines)
- Same syntax highlighting via syntect (already in streaming formatter)
2026-01-08 20:30:44 +11:00
Dhanji R. Prasanna
347513b04c Add comprehensive stress tests for streaming markdown formatter
Add 10 stress tests covering:
- Nested formatting (bold in italic, italic in bold)
- Empty/minimal content edge cases
- Escape sequences and special characters
- Lists with complex inline formatting
- Links with various content types
- Tables with formatting in cells
- Code blocks (should not format contents)
- Mixed block elements (headers, quotes, rules)
- Nested lists (3+ levels, mixed types)
- Pathological/adversarial inputs (unbalanced delimiters, unicode, long lines)

All 45 tests pass.
2026-01-08 20:27:28 +11:00
Dhanji R. Prasanna
5bfaee8dd5 use consistent naming for compaction 2026-01-08 12:54:03 +11:00
Dhanji R. Prasanna
4e7aca50fa feat: royal blue tool names in agent mode + fix README heading display
- Add set_agent_mode() to UiWriter trait for visual mode differentiation
- ConsoleUiWriter uses royal blue (ANSI 256 color 69) for tool names in agent mode
- Fix extract_readme_heading() to search only README section of combined content
  (was incorrectly showing AGENTS.md heading instead of README heading)
2026-01-07 11:37:51 +11:00
Dhanji R. Prasanna
c4ae85de72 Add --new-session flag to skip session resumption in agent mode
Adds a new CLI flag that allows users to force a new session when running
in agent mode, bypassing the automatic detection and resumption of
incomplete sessions.

Usage: g3 --agent my-agent --new-session
2026-01-07 09:59:15 +11:00
Dhanji R. Prasanna
386176899e Remove vision tools (except take_screenshot) and macax tools
Vision tools removed:
- extract_text (OCR from image files)
- extract_text_with_boxes (OCR with bounding boxes)
- vision_find_text (find text in app windows)
- vision_click_text (find and click on text)
- vision_click_near_text (click near text labels)

macax tools removed:
- macax_list_apps
- macax_get_frontmost_app
- macax_activate_app
- macax_press_key
- macax_type_text

The LLM can now read images directly via read_image tool.
take_screenshot is retained for capturing application windows.

Files deleted:
- crates/g3-core/src/tools/vision.rs
- crates/g3-core/src/tools/macax.rs
- docs/macax-tools.md

Updated tool counts: 12 core + 15 webdriver = 27 total
2026-01-03 17:38:25 +11:00
Dhanji R. Prasanna
76bfb77f84 further fowler fixes and session fixes 2026-01-03 15:47:04 +11:00
Dhanji R. Prasanna
595ad6ad21 agent mode resumption 2026-01-03 14:50:08 +11:00
Dhanji R. Prasanna
8d071d5eed fix: fowler agent now respects --workspace flag and reads project docs
- Fixed run_agent_mode to call std::env::set_current_dir with workspace_dir
- Updated fowler.md to read README.md and AGENTS.md as part of Triage & Understanding step
2025-12-26 15:24:20 +11:00
Dhanji R. Prasanna
7e59e181f7 context line ui 2025-12-26 12:58:13 +11:00
Dhanji R. Prasanna
923def0ab2 Convert all INFO logs to DEBUG to reduce CLI noise
Converted ~77 info! macro calls to debug! across the codebase to prevent
log messages from interrupting the CLI experience during normal operation.
Users can still see these logs by setting RUST_LOG=debug if needed.

Affected crates:
- g3-cli
- g3-computer-control
- g3-console
- g3-core
- g3-ensembles
- g3-execution
- g3-providers
2025-12-22 16:27:35 +11:00
Dhanji R. Prasanna
3bc254962c clean up filter_json a bit (more to come) 2025-12-22 12:03:09 +11:00
Dhanji R. Prasanna
01a5284d6d Move fixed_filter_json from g3-core to g3-cli
Properly separates UI display concern from core library:
- fixed_filter_json module now lives in g3-cli (UI layer)
- UiWriter trait gains filter_json_tool_calls() and reset_json_filter() methods
- g3-core delegates filtering to UI layer via trait methods
- Different UiWriter implementations can choose their own filtering behavior
- ConsoleUiWriter filters JSON tool calls for clean terminal display
- MachineUiWriter/NullUiWriter use default pass-through

Benefits:
- Proper separation of concerns
- Core stays clean without display-specific logic
- Testability - filter can be tested independently in g3-cli
2025-12-22 10:32:21 +11:00
Dhanji R. Prasanna
fbf31e5f68 Fix continuation errors: auto-continue when final_output not called
- Add final_output_called flag to track if LLM properly completed
- Auto-continue with prompt if tools executed but final_output missing
- Remove unused last_action_was_tool and any_text_response variables
- Simplifies previous complex incomplete response detection logic
2025-12-20 15:32:12 +11:00
Dhanji R. Prasanna
e771382bd0 agent mode + fowler bot 2025-12-19 16:14:03 +11:00
Dhanji R. Prasanna
faa6512b1f Revert to Safari as default WebDriver browser
Chrome headless has too many issues:
- Session creation hangs when Chrome is already running
- Cloudflare and other bot protection blocks headless browsers
- Version mismatch issues between Chrome and ChromeDriver

Safari is more reliable for web automation on macOS.
Chrome headless is still available via --chrome-headless flag.
2025-12-16 12:36:18 +11:00
Dhanji R. Prasanna
3d1b86d24b Make Chrome headless the default WebDriver browser
- Add --safari flag to CLI for explicitly choosing Safari
- Update --chrome-headless flag description to indicate it's the default
- Update README to reflect Chrome headless as default
- Remove broken link to non-existent docs/webdriver-setup.md
- Add Safari flag handling in all webdriver config locations

The config already had ChromeHeadless as the default, this commit
updates the CLI and documentation to match.
2025-12-15 16:51:42 +11:00
Jochen
87bceba54f Fix planner UI whitespace and workspace logs directory
Resolve two critical issues in planner mode that persisted through
multiple fix attempts:

1. Remove excessive whitespace between tool call displays by replacing
   direct println!() calls with ui_writer methods and eliminating
   redundant newlines in agent response streaming.

2. Ensure all log files (errors, sessions, tool calls, context dumps)
   are written to <workspace>/logs instead of codepath by properly
   initializing G3_WORKSPACE_PATH from --workspace argument.
2025-12-10 16:18:49 +11:00
Jochen
ff8b3e7c7b Implement planning mode 2025-12-09 17:03:53 +11:00
Dhanji R. Prasanna
678403da35 add a force thinnify cmd 2025-12-05 15:32:13 +11:00
Jochen
0327a6dfdf make sure coach feedback is extracted. 2025-12-02 22:00:58 +11:00
Jochen
928f2bfa9d actually record coach feedback and use it 2025-12-02 21:23:50 +11:00
Jochen
6dcae1e3f4 fix use import 2025-11-28 10:21:06 +11:00
Jochen
0d504d6422 temporarily disable codebase_fast_start
It seems the LLM gets "lazy" and assumes all the tool
calls mean it has done most of the work.
I need to revise this approach.
2025-11-27 21:02:01 +11:00
Jochen
52f78653b4 add context window monitor
Writes the current context window to logs/current_context_window (uses a symlink to a session ID).

This PR was unfortunately generated by a different LLM, which did a ton of superficial reformatting. It's actually a fairly small and benign change, but I don't want to roll back everything. Hope that's ok.
2025-11-27 21:00:02 +11:00
Jochen
7e1ce36a4b Merge pull request #35 from dhanji/jochen_write_existing_file
remove check for whether a file exists in the workspace
2025-11-27 13:44:45 +11:00
Jochen
9f6592efc2 remove redundant 'if' 2025-11-27 13:34:54 +11:00
Jochen
99125fc39e completely remove the skipping first player logic 2025-11-27 13:21:40 +11:00
Jochen
c58aa80932 explain what file was found in workspace 2025-11-26 21:43:59 +11:00
Dhanji Prasanna
4cfa0147ca first cut of horizontal partitioning
# Conflicts:
#	Cargo.lock
#	crates/g3-cli/src/lib.rs
2025-11-26 17:12:07 +11:00
Jochen
c19127f809 make sure user requirements are included 2025-11-26 10:26:52 +11:00
Jochen
2e252cd298 added timer 2025-11-25 22:51:33 +11:00
Jochen
ad198a8501 add code exploration fast start
This tries to short-circuit multiple round-trips to the LLM for reading code.
It's a precursor to context engineering tailored to specific tasks.
In initial experiments, it's only marginally faster than regular mode, and burns more tokens.
2025-11-25 22:51:32 +11:00