Commit Graph

16 Commits

Author SHA1 Message Date
Dhanji R. Prasanna
e731bc8217 Make remember tool instructions more imperative in system prompts
- Change 'call remember' to 'you MUST call remember' in native prompt
- Change 'IF you discovered' to 'ALWAYS...when you discovered'
- Add explicit list of trigger tools (code_search, rg, grep, find, read_file)
- Add reminder to Response Guidelines section
- Add remember tool and Project Memory section to non-native prompt
- Remove redundant console output from remember tool
- Fix test compilation errors (missing summary parameter, temporary borrow)
2026-01-11 06:49:45 +08:00
Dhanji R. Prasanna
1090e30d6c Simplify system prompt: remove coding style and parallel tool call sections
- Remove IMPORTANT FOR CODING section (~1,500 chars of coding guidelines)
- Remove <use_parallel_tool_calls> block (~500 chars)
- Remove unused const_format dependency from g3-core
- Simplify get_system_prompt_for_native() to just return base prompt
- Response Guidelines now cleanly ends the static prompt

Prompt reduced from ~8,500 to ~6,500 characters.
2026-01-11 06:35:18 +08:00
Dhanji R. Prasanna
86709834e2 Improve research tool error reporting for scout agent failures
When the scout agent fails (e.g., context window exhaustion), the research tool now:
- Captures both stdout and stderr from the scout process
- Detects context window exhaustion errors with specific patterns
- Provides detailed, actionable error messages to the user
- Shows suggestions for how to work around the issue
- Includes technical details (exit code, error output) for debugging

Handles two failure modes:
1. Scout agent exits with non-zero status
2. Scout agent exits successfully but doesn't produce valid report markers

Both cases now surface clear error messages instead of cryptic failures.
2026-01-10 20:50:43 +11:00
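The two failure modes listed in 86709834e2 could be told apart roughly as in the sketch below. It is a minimal illustration and not the project's code: the `ScoutFailure` enum, the `classify_scout_run` function, and the exhaustion patterns are hypothetical, and the marker spelling is assumed from the "---SCOUT_REPORT_START/END---" shorthand used in 0aa1287ca6.

```rust
/// Hypothetical classification of a failed scout run.
#[derive(Debug, PartialEq)]
pub enum ScoutFailure {
    /// Non-zero exit whose output matches a context-exhaustion pattern.
    ContextExhausted,
    /// Non-zero exit for any other reason.
    NonZeroExit,
    /// Exit status was zero, but the report markers were not both present.
    MissingReport,
}

// Assumed patterns; the commit message does not list the real ones.
const EXHAUSTION_PATTERNS: [&str; 3] = ["context window", "context length", "too many tokens"];

// Assumed expansion of the marker shorthand.
const START: &str = "---SCOUT_REPORT_START---";
const END: &str = "---SCOUT_REPORT_END---";

/// Returns `None` when the run looks successful, otherwise the failure kind.
pub fn classify_scout_run(
    exit_code: Option<i32>,
    stdout: &str,
    stderr: &str,
) -> Option<ScoutFailure> {
    if exit_code != Some(0) {
        // Failure mode 1: inspect both streams, since either may carry the error.
        let combined = format!("{stdout}\n{stderr}").to_lowercase();
        if EXHAUSTION_PATTERNS.iter().any(|p| combined.contains(p)) {
            return Some(ScoutFailure::ContextExhausted);
        }
        return Some(ScoutFailure::NonZeroExit);
    }
    // Failure mode 2: clean exit without a usable report.
    if !(stdout.contains(START) && stdout.contains(END)) {
        return Some(ScoutFailure::MissingReport);
    }
    None
}
```

A caller could map each variant to its own user-facing message and attach the exit code and captured stderr as technical details.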
Dhanji R. Prasanna
60aeb67c56 Add stealth mode for Chrome headless to evade bot detection
Implements comprehensive anti-detection measures:
- Override navigator.webdriver to return undefined
- Inject fake chrome.runtime, chrome.loadTimes, chrome.csi objects
- Add realistic plugins and mimeTypes arrays
- Patch permissions API to hide automation
- Set realistic navigator properties (languages, hardwareConcurrency, deviceMemory)
- Remove ChromeDriver-specific window properties (cdc_*)
- Patch Function.prototype.toString to hide modifications
- Add Chrome flags: --disable-blink-features=AutomationControlled
- Set realistic user-agent without HeadlessChrome identifier
- Exclude 'enable-automation' switch

Tested against bot detection sites:
- bot.sannysoft.com: All major tests pass
- Search engines: Works with DuckDuckGo, Yahoo, Brave, Startpage
- Still detected by: Google reCAPTCHA, Cloudflare Turnstile, Bing
2026-01-10 20:34:14 +11:00
Dhanji R. Prasanna
6be0a03c4c Fix timing footer being saved to context window
The timing footer (e.g., ⏱️ 19.4s | 💭 4.7s) was being saved to the
conversation history as a separate assistant message. This happened
because stream_completion_with_tools returns the timing footer in
TaskResult.response for display, but the caller was also saving it
to context.

Fix: Strip the timing footer (identified by \n\n⏱️) before saving
to context window. The timing footer remains display-only.

Also includes:
- Research tool blank line fix: only add visual separator for research
  tool output, not all tools
- Research tool webdriver propagation: pass parent's webdriver browser
  choice (Safari vs Chrome headless) to scout subprocess
2026-01-10 15:55:59 +11:00
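The footer stripping described in 6be0a03c4c could look roughly like the sketch below. The function name `strip_timing_footer` is hypothetical. Matching only on the base stopwatch code point (U+23F1) is an assumption made here so that the emoji is recognized with or without a trailing variation selector.

```rust
/// Start of the display-only timing footer: a blank line followed by the stopwatch emoji.
const TIMING_FOOTER_MARKER: &str = "\n\n\u{23F1}";

/// Returns `response` without a trailing timing footer, if one is present.
/// The last occurrence of the marker is used, so earlier text is left untouched.
pub fn strip_timing_footer(response: &str) -> &str {
    match response.rfind(TIMING_FOOTER_MARKER) {
        Some(idx) => &response[..idx],
        None => response,
    }
}
```

The caller would keep printing the full response and pass only the stripped text to whatever saves the assistant message to the context window.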
Dhanji R. Prasanna
68c9135913 Fix research tool UI: remove duplicate header, add footer spacing, remove spinner, widen command display
- Remove duplicate tool header (lib.rs already prints it)
- Add newline before timing footer for visual separation
- Remove spinner animation (incompatible with update_tool_output_line)
- Change shell command format to " > `cmd` ..." with 60 char width
2026-01-10 15:20:40 +11:00
Dhanji R. Prasanna
0aa1287ca6 Remove final_output tool and improve scout report handback
final_output removal:
- Remove final_output from tool definitions and dispatch
- Update system prompts to request summaries as regular text
- Remove final_output_called field from StreamingState
- Update auto_continue tests to remove final_output_called parameter
- Remove final_output test from tool_execution_test.rs
- Update planner and flock prompts to not reference final_output
- Keep backwards-compat code in feedback_extraction.rs and task_result.rs

Scout report handback:
- Change from file-based to delimiter-based report extraction
- Scout outputs report between ---SCOUT_REPORT_START/END--- markers
- Research tool extracts content between markers, strips ANSI codes
- Add comprehensive tests for extraction and ANSI stripping

657 tests pass.
2026-01-10 13:43:04 +11:00
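The delimiter-based handback from 0aa1287ca6 could be sketched as follows. This is an illustration under assumptions: the marker strings are expanded from the "---SCOUT_REPORT_START/END---" shorthand, the function names are hypothetical, and only CSI-style escape sequences (such as color codes) are stripped.

```rust
// Assumed expansion of the marker shorthand in the commit message.
const START: &str = "---SCOUT_REPORT_START---";
const END: &str = "---SCOUT_REPORT_END---";

/// Removes CSI escape sequences (ESC '[' ... final byte), e.g. color codes.
pub fn strip_ansi(input: &str) -> String {
    let mut out = String::with_capacity(input.len());
    let mut chars = input.chars().peekable();
    while let Some(c) = chars.next() {
        if c == '\u{1b}' && chars.peek() == Some(&'[') {
            chars.next(); // consume '['
            // Skip parameter bytes until the final byte (0x40..=0x7E) ends the sequence.
            for d in chars.by_ref() {
                if ('\u{40}'..='\u{7e}').contains(&d) {
                    break;
                }
            }
        } else {
            out.push(c);
        }
    }
    out
}

/// Returns the trimmed text between the first START marker and the next END marker.
pub fn extract_report(stdout: &str) -> Option<String> {
    let clean = strip_ansi(stdout);
    let start = clean.find(START)? + START.len();
    let end = start + clean[start..].find(END)?;
    Some(clean[start..end].trim().to_string())
}
```

Stripping escape codes before searching means a marker that the terminal formatter wrapped in color codes is still found.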
Dhanji R. Prasanna
cab2fb187a Stream scout agent output to CLI during research
The research tool now streams the underlying scout agent's output
to the CLI in real-time for visual indication of progress. This
output is displayed but not added to the conversation context.
2026-01-09 20:39:53 +11:00
Dhanji R. Prasanna
33e5705fc3 Add research tool for web-based research via scout agent
New tool that spawns a scout agent to perform web research and return
a structured research brief. The scout agent uses webdriver to browse
the web and returns a decision-ready report.

Changes:
- Added 'research' tool definition (12 core tools total)
- Added research tool dispatch in tool_dispatch.rs
- Created tools/research.rs implementation:
  - Spawns 'g3 --agent scout <query>' as subprocess
  - Captures stdout and extracts last line (report file path)
  - Reads and returns the report file contents
- Added exclude_research flag to ToolConfig
- Scout agent (agent_name == 'scout') does NOT have access to research
  tool to prevent infinite recursion
- Updated system prompts to describe when to use research tool
- Added scout.md agent prompt with research brief output contract

The research tool is preferred for complex research tasks (APIs, SDKs,
libraries, approaches, bugs). WebDriver can still be used directly for
simple lookups or fine-grained control.
2026-01-09 15:59:19 +11:00
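The recursion guard from 33e5705fc3 could be sketched like this. Only `exclude_research`, `ToolConfig`, and the 'scout' agent name come from the commit message. The `for_agent` constructor, the `tool_names` function, and the shortened tool list are hypothetical.

```rust
/// Minimal stand-in for the real ToolConfig; only the flag named in the commit is modeled.
#[derive(Debug, Default, Clone, Copy)]
pub struct ToolConfig {
    pub exclude_research: bool,
}

impl ToolConfig {
    /// Hypothetical constructor: the scout agent never gets the research tool.
    pub fn for_agent(agent_name: Option<&str>) -> Self {
        Self {
            exclude_research: agent_name == Some("scout"),
        }
    }
}

/// Returns the tool names offered to the model (shortened, illustrative list).
pub fn tool_names(config: &ToolConfig) -> Vec<&'static str> {
    let mut tools = vec!["shell", "read_file"];
    if !config.exclude_research {
        // Without this guard, a scout could spawn another scout indefinitely.
        tools.push("research");
    }
    tools
}
```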
Dhanji R. Prasanna
777191b3cb Remove final_output tool - let summaries stream naturally
- Remove final_output from tool definitions, dispatch, and misc tools
- Update system prompts to request summaries as regular markdown text
- Remove print_final_output from UiWriter trait and all implementations
- Remove final_output handling from agent core logic
- Rename final_output_summary → summary in session continuation
- Delete final_output test files
- Update tool count tests (12→11, 27→26)

This allows LLM summaries to stream through the markdown formatter
for a more natural, responsive user experience instead of buffering
everything into a tool call.
2026-01-09 14:57:24 +11:00
Dhanji R. Prasanna
67be0f20c7 fix: remove allow_multiple_tool_calls config and simplify tool execution flow
This fixes a bug where the agent would stop responding abruptly without
calling final_output. The root cause was the allow_multiple_tool_calls
config option (default: false) which caused the agent to break out of
the streaming loop mid-stream after executing the first tool, losing
any subsequent content.

Changes:
- Remove allow_multiple_tool_calls config option entirely
- Always process all tool calls without breaking mid-stream
- Simplify system prompt generation (no longer needs boolean param)
- Let the stream complete fully before continuing to next iteration
- Change find_last_tool_call_start to find_first_tool_call_start
- Remove parser.reset() call on duplicate detection

Benefits:
- Simpler logic with less conditional branching
- No lost content after tool calls
- Consistent behavior for all users
- Reduced config complexity
2026-01-09 13:28:07 +11:00
Dhanji R. Prasanna
267ef00848 refactor: extract session helper in webdriver.rs to reduce boilerplate
Agent: carmack

Add get_session() helper function that:
- Checks if webdriver is enabled
- Acquires the session read lock
- Returns the cloned session or an error message

Refactored 13 webdriver tool functions to use this helper:
- execute_webdriver_navigate
- execute_webdriver_get_url
- execute_webdriver_get_title
- execute_webdriver_find_element
- execute_webdriver_find_elements
- execute_webdriver_click
- execute_webdriver_send_keys
- execute_webdriver_execute_script
- execute_webdriver_get_page_source
- execute_webdriver_screenshot
- execute_webdriver_back
- execute_webdriver_forward
- execute_webdriver_refresh

Each function previously had ~10 lines of identical boilerplate.
Now reduced to 4 lines using the helper.

Net reduction: 68 lines (678 -> 610)
All tests pass. Behavior unchanged.
2026-01-08 13:05:44 +11:00
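The three steps of the get_session() helper in 267ef00848 could be sketched as below. The `Session` and `WebDriverState` types and the error strings are hypothetical stand-ins. `std::sync::RwLock` is used here only to keep the example self-contained, and the real code may use an async lock instead.

```rust
use std::sync::{Arc, RwLock};

/// Stand-in for a live browser session.
#[derive(Debug)]
pub struct Session {
    pub url: String,
}

/// Hypothetical shared state holding the enabled flag and the current session.
pub struct WebDriverState {
    pub enabled: bool,
    pub session: RwLock<Option<Arc<Session>>>,
}

/// Checks that webdriver is enabled, takes the read lock, and clones the session handle.
pub fn get_session(state: &WebDriverState) -> Result<Arc<Session>, String> {
    if !state.enabled {
        return Err("WebDriver is not enabled".to_string());
    }
    let guard = state
        .session
        .read()
        .map_err(|_| "WebDriver session lock is poisoned".to_string())?;
    (*guard)
        .clone()
        .ok_or_else(|| "No active WebDriver session".to_string())
}

/// Each tool function then starts with one call instead of repeating the checks.
pub fn execute_webdriver_get_url(state: &WebDriverState) -> String {
    match get_session(state) {
        Ok(session) => session.url.clone(),
        Err(msg) => msg,
    }
}
```

Returning a cloned `Arc` releases the read lock as soon as the helper returns, so a slow browser call does not hold the lock.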
Dhanji R. Prasanna
bb63050779 refactor: improve readability of streaming and file ops code
Agent: carmack

databricks.rs:
- Extract ToolCallAccumulator struct to replace opaque (String, String, String) tuple
- Add decode_utf8_streaming() helper for cleaner UTF-8 handling
- Add is_incomplete_json_error() helper for JSON parse error detection
- Add make_final_chunk() helper to reduce duplication
- Add finalize_tool_calls() to convert accumulators to final format
- Refactor parse_streaming_response from ~270 lines to ~100 lines
- Reduce nesting depth from 8+ levels to 4 levels
- Use early returns and let-else for cleaner control flow

file_ops.rs:
- Replace repetitive if-let chains with declarative PATH_CONTENT_KEYS table
- Use match expression instead of nested if-else
- Reduce extract_path_and_content from 44 lines to 20 lines

All tests pass. Behavior unchanged.
2026-01-07 12:39:05 +11:00
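The move from an opaque tuple to a named struct in bb63050779 could look roughly like this. Only the name `ToolCallAccumulator` comes from the commit message. The field names and the `apply_delta` method are assumptions about what the three strings hold.

```rust
/// Named replacement for an opaque (String, String, String) tuple.
/// Field meanings are assumed: call id, tool name, and JSON arguments so far.
#[derive(Debug, Default, Clone, PartialEq)]
pub struct ToolCallAccumulator {
    pub id: String,
    pub name: String,
    pub arguments: String,
}

impl ToolCallAccumulator {
    /// Merges one streamed delta: id and name overwrite, argument fragments append.
    pub fn apply_delta(
        &mut self,
        id: Option<&str>,
        name: Option<&str>,
        args_fragment: Option<&str>,
    ) {
        if let Some(id) = id {
            self.id = id.to_string();
        }
        if let Some(name) = name {
            self.name = name.to_string();
        }
        if let Some(fragment) = args_fragment {
            self.arguments.push_str(fragment);
        }
    }
}
```

With named fields, a reader of the streaming parser sees `acc.arguments.push_str(...)` where the tuple version would have shown a positional access such as `.2`.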
Dhanji R. Prasanna
386176899e Remove vision tools (except take_screenshot) and macax tools
Vision tools removed:
- extract_text (OCR from image files)
- extract_text_with_boxes (OCR with bounding boxes)
- vision_find_text (find text in app windows)
- vision_click_text (find and click on text)
- vision_click_near_text (click near text labels)

macax tools removed:
- macax_list_apps
- macax_get_frontmost_app
- macax_activate_app
- macax_press_key
- macax_type_text

The LLM can now read images directly via read_image tool.
take_screenshot is retained for capturing application windows.

Files deleted:
- crates/g3-core/src/tools/vision.rs
- crates/g3-core/src/tools/macax.rs
- docs/macax-tools.md

Updated tool counts: 12 core + 15 webdriver = 27 total
2026-01-03 17:38:25 +11:00
Dhanji R. Prasanna
29e263ac49 Fix Unicode space handling in macOS screenshot filenames
macOS uses U+202F (Narrow No-Break Space) in screenshot filenames
between the time and am/pm. When users type or paste these paths,
they use regular spaces, causing file-not-found errors.

Changes:
- Add resolve_path_with_unicode_fallback() to try U+202F variants
- Add resolve_paths_in_shell_command() for shell command paths
- Apply fix to read_file, read_image, and shell tools
- Fix read_image prompt docs: file_path -> file_paths (array)
- Add 6 unit tests for Unicode space normalization
2026-01-03 17:17:08 +11:00
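The U+202F fallback from 29e263ac49 could be sketched as follows. The function name `resolve_path_with_unicode_fallback` comes from the commit message, but this body is an assumption about its behavior. `unicode_space_candidates` is a hypothetical helper, and it swaps only a regular space directly before "am" or "pm".

```rust
use std::path::PathBuf;

/// Narrow No-Break Space, used by macOS between the time and am/pm.
const NARROW_NBSP: char = '\u{202F}';

/// Returns the path as typed, followed by variants where a regular space
/// directly before "am"/"pm" (any case) is replaced with U+202F.
pub fn unicode_space_candidates(path: &str) -> Vec<String> {
    let mut candidates = vec![path.to_string()];
    for (idx, _) in path.match_indices(' ') {
        let rest = &path[idx + 1..];
        let lower = rest.to_ascii_lowercase();
        if lower.starts_with("am") || lower.starts_with("pm") {
            let mut variant = String::with_capacity(path.len() + 2);
            variant.push_str(&path[..idx]);
            variant.push(NARROW_NBSP);
            variant.push_str(rest);
            candidates.push(variant);
        }
    }
    candidates
}

/// Returns the first candidate that exists on disk, trying the typed path first.
pub fn resolve_path_with_unicode_fallback(path: &str) -> Option<PathBuf> {
    unicode_space_candidates(path)
        .into_iter()
        .map(PathBuf::from)
        .find(|p| p.exists())
}
```

Trying the path exactly as typed first keeps the behavior unchanged for files whose names really do contain regular spaces.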
Dhanji R. Prasanna
595ad6ad21 agent mode resumption
2026-01-03 14:50:08 +11:00