G3 Tools Reference
Last updated: January 2025
Source of truth: crates/g3-core/src/tool_definitions.rs, crates/g3-core/src/tools/
Purpose
This document describes all tools available to the G3 agent. Tools are the primary mechanism by which G3 interacts with the filesystem, executes commands, and automates tasks.
Tool Categories
| Category | Tools | Enabled By |
|---|---|---|
| Core | shell, read_file, write_file, str_replace, final_output, background_process | Always |
| Images | read_image, take_screenshot, extract_text | Always |
| Task Management | todo_read, todo_write | Always |
| Code Intelligence | code_search, code_coverage | Always |
| WebDriver | webdriver_* (15 tools) | --webdriver or --chrome-headless |
| Vision | vision_find_text, vision_click_text, vision_click_near_text | Always (macOS) |
| macOS Accessibility | macax_* (9 tools) | --macax |
| Computer Control | mouse_click, type_text, find_element, list_windows | computer_control.enabled = true |
Core Tools
shell
Execute shell commands.
Parameters:
- command (string, required): The shell command to execute
Example:
{"tool": "shell", "args": {"command": "ls -la"}}
Notes:
- Commands run in the current working directory
- Output is streamed in real-time
- Both stdout and stderr are captured
- Exit code is reported
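As a rough illustration of the behavior listed above (not G3's actual Rust implementation), the following Python sketch runs a command in the current working directory and collects stdout, stderr, and the exit code. The function name run_shell and the shape of the returned dictionary are assumptions made for this example, and the sketch collects output after the command finishes rather than streaming it.

```python
import subprocess

def run_shell(command: str) -> dict:
    """Run a shell command and collect stdout, stderr, and the exit code."""
    # shell=True hands the string to the system shell, like the "command" argument.
    completed = subprocess.run(command, shell=True, capture_output=True, text=True)
    return {
        "stdout": completed.stdout,
        "stderr": completed.stderr,
        "exit_code": completed.returncode,
    }

# Demo: one line on stdout, one on stderr, and a non-zero exit code.
result = run_shell("echo hello; echo oops 1>&2; exit 3")
print(result["exit_code"], result["stdout"].strip(), result["stderr"].strip())
```

A non-zero exit code is returned to the caller here rather than raised as an exception, which mirrors the note that the exit code is reported.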
background_process
Launch a long-running process in the background.
Parameters:
- name (string, required): Unique name for the process (e.g., "game_server")
- command (string, required): Shell command to execute
- working_dir (string, optional): Working directory
Example:
{"tool": "background_process", "args": {"name": "dev_server", "command": "npm run dev"}}
Returns: PID and log file path
Notes:
- Process runs independently of the agent
- Logs are captured to a file
- Use the shell tool to read logs (tail), check status (ps), or stop the process (kill)
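The pattern described above (start a process, redirect its output to a log file, return the PID and log path) can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not G3's implementation: the function name start_background, the log location in the system temp directory, and the returned dictionary are all hypothetical.

```python
import os
import subprocess
import tempfile
from typing import Optional

def start_background(name: str, command: str, working_dir: Optional[str] = None) -> dict:
    """Start a process whose combined output is written to a log file."""
    # Hypothetical log location; the real tool reports its own path.
    log_path = os.path.join(tempfile.gettempdir(), f"{name}.log")
    log_file = open(log_path, "w")
    proc = subprocess.Popen(
        command,
        shell=True,
        cwd=working_dir,
        stdout=log_file,
        stderr=subprocess.STDOUT,   # merge stderr into the same log
        start_new_session=True,     # detach from the caller's session (POSIX)
    )
    log_file.close()  # the child process keeps its own handle to the log
    return {"pid": proc.pid, "log_path": log_path, "process": proc}

info = start_background("g3_demo_echo", "echo started")
info["process"].wait()  # only for this demo; a real server would keep running
with open(info["log_path"]) as log:
    log_text = log.read()
print(info["pid"], info["log_path"])
```

With a long-running command, the caller would skip the wait and later inspect the log file or signal the PID through the shell tool.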
read_file
Read file contents with optional character range.
Parameters:
- file_path (string, required): Path to the file
- start (integer, optional): Starting character position (0-indexed, inclusive)
- end (integer, optional): Ending character position (0-indexed, exclusive)
Example:
{"tool": "read_file", "args": {"file_path": "src/main.rs", "start": 0, "end": 1000}}
Notes:
- For image files (png, jpg, gif, etc.), automatically extracts text using OCR
- Supports tilde expansion (~)
- Reports file size and line count
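The start/end semantics (0-indexed, start inclusive, end exclusive) match ordinary half-open slicing. The Python sketch below illustrates that reading of the parameters; read_file_range is a hypothetical helper, and the sketch ignores the OCR and reporting behavior mentioned in the notes.

```python
import os
import tempfile
from typing import Optional

def read_file_range(file_path: str, start: Optional[int] = None,
                    end: Optional[int] = None) -> str:
    """Return the characters in [start, end) of a text file."""
    path = os.path.expanduser(file_path)  # tilde expansion
    with open(path, encoding="utf-8") as handle:
        content = handle.read()
    lo = 0 if start is None else start
    hi = len(content) if end is None else end
    return content[lo:hi]

# Demo on a throwaway file.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False,
                                 encoding="utf-8") as tmp:
    tmp.write("Hello, world!")
snippet = read_file_range(tmp.name, start=7, end=12)
whole = read_file_range(tmp.name)
print(snippet)  # world
os.unlink(tmp.name)
```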
read_image
Read image files for visual analysis by the LLM.
Parameters:
- file_paths (array of strings, required): Paths to image files
Example:
{"tool": "read_image", "args": {"file_paths": ["screenshot.png", "diagram.jpg"]}}
Supported formats: PNG, JPEG, GIF, WebP
Notes:
- Images are sent to the LLM for visual analysis
- Use for inspecting sprites, UI screenshots, diagrams, etc.
- Different from extract_text, which only does OCR
write_file
Create or overwrite a file.
Parameters:
- file_path (string, required): Path to the file
- content (string, required): Content to write
Example:
{"tool": "write_file", "args": {"file_path": "hello.txt", "content": "Hello, world!"}}
Notes:
- Creates parent directories if needed
- Overwrites existing files
- Reports bytes written
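The three notes above can be sketched in a few lines of Python. This is an illustration rather than G3's implementation; the helper name write_file_sketch and the choice of UTF-8 encoding are assumptions.

```python
import os
import tempfile
from pathlib import Path

def write_file_sketch(file_path: str, content: str) -> int:
    """Create or overwrite a file and return the number of bytes written."""
    path = Path(file_path).expanduser()
    path.parent.mkdir(parents=True, exist_ok=True)  # create parent directories
    data = content.encode("utf-8")                  # assumed encoding
    path.write_bytes(data)                          # overwrites existing files
    return len(data)

# Demo: the nested directories do not exist before the call.
target = os.path.join(tempfile.mkdtemp(), "nested", "dir", "hello.txt")
written = write_file_sketch(target, "Hello, world!")
print(written)  # 13
```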
str_replace
Apply a unified diff to a file.
Parameters:
- file_path (string, required): Path to the file
- diff (string, required): Unified diff with context lines
- start (integer, optional): Starting character position to constrain search
- end (integer, optional): Ending character position to constrain search
Example:
{"tool": "str_replace", "args": {
"file_path": "src/main.rs",
"diff": "@@ -10,3 +10,4 @@\n fn main() {\n println!(\"Hello\");\n+ println!(\"World\");\n }"
}}
Notes:
- Supports multiple hunks
- Context lines help locate the correct position
- Use start/end to disambiguate when multiple matches exist
- The ---/+++ headers are optional for minimal diffs
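To show how context lines can locate an edit, here is a simplified Python sketch of applying a single hunk by matching its context and removed lines against the file text. It is not G3's algorithm: apply_hunk is a hypothetical helper, it handles only one hunk, it ignores the line numbers in the @@ header, and it would misread a removed line whose content itself begins with "--".

```python
from typing import Optional

def apply_hunk(text: str, diff: str, start: int = 0,
               end: Optional[int] = None) -> str:
    """Apply one unified-diff hunk to text by matching its old lines."""
    old_lines, new_lines = [], []
    for line in diff.splitlines():
        if line.startswith(("@@", "---", "+++")):
            continue  # headers are not needed for context matching
        if line.startswith("+"):
            new_lines.append(line[1:])
        elif line.startswith("-"):
            old_lines.append(line[1:])
        else:
            # Context line: drop the single leading space if present.
            body = line[1:] if line.startswith(" ") else line
            old_lines.append(body)
            new_lines.append(body)
    old_block = "\n".join(old_lines)
    new_block = "\n".join(new_lines)
    stop = len(text) if end is None else end
    # The whole match must lie inside the [start, stop) character window.
    pos = text.find(old_block, start, stop)
    if pos == -1:
        raise ValueError("hunk context not found in the given range")
    return text[:pos] + new_block + text[pos + len(old_block):]

source = 'fn main() {\n    println!("Hello");\n}\n'
hunk = ' fn main() {\n     println!("Hello");\n+    println!("World");\n }'
patched = apply_hunk(source, hunk)

# start skips the first of two identical lines.
second_only = apply_hunk("x = 1\nx = 1\n", "-x = 1\n+x = 2", start=6)
print(patched)
print(second_only)
```

The second call shows why start and end matter: without them the first "x = 1" would be the one replaced.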
final_output
Signal task completion with a summary.
Parameters:
- summary (string, required): Markdown summary of what was accomplished
Example:
{"tool": "final_output", "args": {"summary": "## Completed\n\n- Created user authentication module\n- Added unit tests\n- Updated documentation"}}
Notes:
- Ends the current task
- Summary is displayed to the user
- In autonomous mode, triggers coach review
Image & Screenshot Tools
take_screenshot
Capture a screenshot of an application window.
Parameters:
- path (string, required): Filename for the screenshot
- window_id (string, required): Application name (e.g., "Safari", "Terminal")
- region (object, optional): {x, y, width, height} to capture a region
Example:
{"tool": "take_screenshot", "args": {"path": "safari.png", "window_id": "Safari"}}
Notes:
- Use list_windows first to identify available windows
- Relative paths save to ~/tmp or $TMPDIR
- Uses native screencapture on macOS
extract_text
Extract text from an image using OCR.
Parameters:
- path (string, optional): Path to image file
Example:
{"tool": "extract_text", "args": {"path": "screenshot.png"}}
Notes:
- Uses Tesseract OCR or Apple Vision framework
- For window-based OCR, use vision_find_text instead
Task Management Tools
todo_read
Read the current TODO list.
Parameters: None
Example:
{"tool": "todo_read", "args": {}}
Notes:
- TODO lists are session-scoped
- Stored in .g3/sessions/<session_id>/todo.g3.md
- Call at start of multi-step tasks to check for existing plans
todo_write
Create or update the TODO list.
Parameters:
- content (string, required): TODO list content in markdown checkbox format
Example:
{"tool": "todo_write", "args": {"content": "- [ ] Implement feature\n - [ ] Write tests\n - [ ] Update docs\n- [x] Setup project"}}
Notes:
- Replaces entire file content
- Always call todo_read first to preserve existing content
- Use "- [ ]" for incomplete, "- [x]" for complete
- Supports nested tasks with indentation
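For readers who want to inspect a TODO list programmatically, the checkbox format can be parsed with a short regular expression. The Python sketch below is an assumption-laden illustration: parse_todo is a hypothetical helper, nesting depth is taken to be the number of leading spaces, and the two-space indent in the sample is chosen for the example.

```python
import re

# Optional indentation, then "- [ ] " or "- [x] ", then the task text.
CHECKBOX = re.compile(r"^(\s*)- \[( |x)\] (.*)$")

def parse_todo(content: str) -> list:
    """Parse markdown checkbox lines into dictionaries."""
    items = []
    for line in content.splitlines():
        match = CHECKBOX.match(line)
        if match:
            indent, mark, text = match.groups()
            items.append({"indent": len(indent), "done": mark == "x", "text": text})
    return items

todo = "- [ ] Implement feature\n  - [ ] Write tests\n  - [ ] Update docs\n- [x] Setup project"
items = parse_todo(todo)
pending = [item["text"] for item in items if not item["done"]]
print(pending)
```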
Code Intelligence Tools
code_search
Syntax-aware code search using tree-sitter.
Parameters:
- searches (array, required): Array of search objects:
  - name (string): Label for this search
  - query (string): Tree-sitter query in S-expression format
  - language (string): Programming language
  - paths (array, optional): Paths to search
  - context_lines (integer, optional): Lines of context (0-20)
- max_concurrency (integer, optional): Parallel searches (default: 4)
- max_matches_per_search (integer, optional): Max matches (default: 500)
Supported languages: rust, python, javascript, typescript, go, java, c, cpp, kotlin
Example:
{"tool": "code_search", "args": {
"searches": [{
"name": "functions",
"query": "(function_item name: (identifier) @name)",
"language": "rust",
"context_lines": 2
}]
}}
See Code Search Guide for detailed query patterns.
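When assembling a code_search call from a script, it can help to check the documented constraints before sending it. The following Python sketch builds the argument object and validates the language list and the 0-20 range for context_lines; build_code_search is a hypothetical client-side helper, and whether G3 itself rejects such values is not stated here.

```python
import json

SUPPORTED_LANGUAGES = {
    "rust", "python", "javascript", "typescript",
    "go", "java", "c", "cpp", "kotlin",
}

def build_code_search(searches: list, max_concurrency: int = 4,
                      max_matches_per_search: int = 500) -> dict:
    """Validate search objects and wrap them in a code_search tool call."""
    for search in searches:
        if search["language"] not in SUPPORTED_LANGUAGES:
            raise ValueError(f"unsupported language: {search['language']}")
        if not 0 <= search.get("context_lines", 0) <= 20:
            raise ValueError("context_lines must be between 0 and 20")
    return {
        "tool": "code_search",
        "args": {
            "searches": searches,
            "max_concurrency": max_concurrency,
            "max_matches_per_search": max_matches_per_search,
        },
    }

call = build_code_search([{
    "name": "functions",
    "query": "(function_item name: (identifier) @name)",
    "language": "rust",
    "context_lines": 2,
}])
print(json.dumps(call, indent=2))
```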
code_coverage
Generate code coverage report using cargo llvm-cov.
Parameters: None
Example:
{"tool": "code_coverage", "args": {}}
Notes:
- Runs all tests with coverage instrumentation
- Auto-installs llvm-tools-preview and cargo-llvm-cov if missing
- Returns coverage statistics summary
WebDriver Tools
Enabled with --webdriver (Safari) or --chrome-headless (Chrome).
webdriver_start
Start a browser session.
Example:
{"tool": "webdriver_start", "args": {}}
webdriver_navigate
Navigate to a URL.
Parameters:
- url (string, required): URL with protocol (e.g., https://)
webdriver_get_url / webdriver_get_title
Get current URL or page title.
webdriver_find_element / webdriver_find_elements
Find element(s) by CSS selector.
Parameters:
- selector (string, required): CSS selector
webdriver_click
Click an element.
Parameters:
- selector (string, required): CSS selector
webdriver_send_keys
Type text into an input.
Parameters:
- selector (string, required): CSS selector
- text (string, required): Text to type
- clear_first (boolean, optional): Clear before typing (default: true)
webdriver_execute_script
Execute JavaScript.
Parameters:
- script (string, required): JavaScript code (use return to return values)
webdriver_get_page_source
Get rendered HTML.
Parameters:
- max_length (integer, optional): Max chars to return (default: 10000, 0 for no limit)
- save_to_file (string, optional): Save to file instead of returning inline
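The max_length rule (a default cap of 10000 characters, with 0 meaning no limit) can be expressed as a small Python function. This only illustrates the parameter semantics described above; truncate_source is a hypothetical name and the sketch leaves out save_to_file.

```python
def truncate_source(html: str, max_length: int = 10000) -> str:
    """Return at most max_length characters; 0 disables the limit."""
    if max_length == 0:
        return html
    return html[:max_length]

page = "<p>hi</p>" * 2000                 # 18000 characters
capped = truncate_source(page)            # default cap applies
full = truncate_source(page, max_length=0)
print(len(capped), len(full))  # 10000 18000
```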
webdriver_screenshot
Take browser screenshot.
Parameters:
- path (string, required): Save path
webdriver_back / webdriver_forward / webdriver_refresh
Navigation controls.
webdriver_quit
Close browser and end session.
Vision Tools (macOS)
Use Apple Vision framework for text recognition.
vision_find_text
Find text in an application window.
Parameters:
- app_name (string, required): Application name
- text (string, required): Text to search for
Returns: Bounding box coordinates and confidence score
vision_click_text
Find and click on text.
Parameters:
- app_name (string, required): Application name
- text (string, required): Text to click
vision_click_near_text
Click near a text label (useful for form fields).
Parameters:
- app_name (string, required): Application name
- text (string, required): Label text to find
- direction (string, optional): "right", "below", "left", "above" (default: "right")
- distance (integer, optional): Pixels from text (default: 50)
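One plausible way to turn a label's bounding box, a direction, and a distance into a click point is sketched below in Python. Everything about the geometry is an assumption made for illustration: the box is taken to be (x, y, width, height) with the origin at the top-left and y increasing downward, and the distance is measured from the nearest edge of the box. The actual tool may compute the point differently.

```python
def click_point(box: tuple, direction: str = "right", distance: int = 50) -> tuple:
    """Compute a click position offset from a text bounding box."""
    x, y, width, height = box          # assumed layout: top-left origin
    center_x = x + width / 2
    center_y = y + height / 2
    if direction == "right":
        return (x + width + distance, center_y)
    if direction == "left":
        return (x - distance, center_y)
    if direction == "below":
        return (center_x, y + height + distance)
    if direction == "above":
        return (center_x, y - distance)
    raise ValueError(f"unknown direction: {direction}")

label_box = (100, 200, 80, 20)         # e.g. a "Username" label
right_of_label = click_point(label_box)
below_label = click_point(label_box, direction="below", distance=30)
print(right_of_label, below_label)
```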
macOS Accessibility Tools
Enabled with --macax. See macOS Accessibility Tools Guide.
macax_list_apps
List running applications.
macax_get_frontmost_app
Get the frontmost application.
macax_activate_app
Bring an application to front.
Parameters:
- app_name (string, required): Application name
macax_get_ui_tree
Get UI element hierarchy.
Parameters:
- app_name (string, required): Application name
- max_depth (integer, optional): Tree depth limit
macax_find_elements
Find UI elements by criteria.
Parameters:
- app_name (string, required): Application name
- role (string, optional): Element role (button, textField, etc.)
- title (string, optional): Element title
- identifier (string, optional): Accessibility identifier
macax_click
Click a UI element.
Parameters:
- app_name (string, required): Application name
- identifier, title, or role: Element selector
macax_set_value / macax_get_value
Set or get element value.
macax_press_key
Simulate key press.
Parameters:
- key (string, required): Key to press
- modifiers (array, optional): ["command", "shift", "option", "control"]
Computer Control Tools
Enabled with computer_control.enabled = true in config.
mouse_click
Click at coordinates.
Parameters:
- x (integer, required): X coordinate
- y (integer, required): Y coordinate
- button (string, optional): "left", "right", "middle"
type_text
Type text at cursor.
Parameters:
- text (string, required): Text to type
find_element
Find UI element by text, role, or attributes.
list_windows
List all open windows with IDs and titles.
Tool Execution Notes
Duplicate Detection
G3 prevents accidental duplicate tool calls:
- Only immediately sequential identical calls are blocked
- Text between tool calls resets detection
- Tools can be reused throughout a session
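The three rules above can be modeled with a detector that remembers only the most recent tool call. The Python sketch below is an illustration of that logic, not G3's code; the class name DuplicateDetector and its methods are hypothetical.

```python
import json

class DuplicateDetector:
    """Block a tool call only if it repeats the immediately preceding one."""

    def __init__(self):
        self._last_call = None

    def note_text(self) -> None:
        # Any text emitted between tool calls resets detection.
        self._last_call = None

    def should_block(self, tool: str, args: dict) -> bool:
        # sort_keys makes the comparison independent of argument order.
        key = (tool, json.dumps(args, sort_keys=True))
        is_duplicate = key == self._last_call
        self._last_call = key
        return is_duplicate

detector = DuplicateDetector()
ls_call = {"command": "ls"}
outcomes = [
    detector.should_block("shell", ls_call),   # first call: allowed
    detector.should_block("shell", ls_call),   # immediate repeat: blocked
]
detector.note_text()                           # text in between resets
outcomes.append(detector.should_block("shell", ls_call))
outcomes.append(detector.should_block("todo_read", {}))
outcomes.append(detector.should_block("shell", ls_call))  # not sequential
print(outcomes)  # [False, True, False, False, False]
```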
Error Handling
Tool errors are reported back to the agent, which can:
- Retry with different parameters
- Try an alternative approach
- Report the issue to the user
Working Directory
Tools execute in:
- Directory specified by --codebase-fast-start if provided
- Current working directory otherwise
File Paths
- Tilde expansion (~) is supported
- Relative paths are relative to the working directory
- Screenshots default to ~/tmp or $TMPDIR
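The first two path rules can be sketched in Python as follows. resolve_path is a hypothetical helper written for this illustration, the sample paths assume a POSIX-style filesystem, and the screenshot default is not modeled.

```python
import os

def resolve_path(file_path: str, working_dir: str) -> str:
    """Expand a leading tilde and resolve relative paths against working_dir."""
    expanded = os.path.expanduser(file_path)
    if os.path.isabs(expanded):
        return os.path.normpath(expanded)
    return os.path.normpath(os.path.join(working_dir, expanded))

relative = resolve_path("src/main.rs", "/work/project")
absolute = resolve_path("/etc/hosts", "/work/project")
home_file = resolve_path("~/notes.txt", "/work/project")
print(relative)
print(absolute)
print(home_file)
```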