alex/g3

Files

Dhanji R. Prasanna f7e2f38fe9 lamport run

2026-01-03 16:48:30 +11:00

12 KiB

Raw Blame History

G3 Tools Reference

Last updated: January 2025
Source of truth: crates/g3-core/src/tool_definitions.rs, crates/g3-core/src/tools/

Purpose

This document describes all tools available to the G3 agent. Tools are the primary mechanism by which G3 interacts with the filesystem, executes commands, and automates tasks.

Tool Categories

Category	Tools	Enabled By
Core	shell, read_file, write_file, str_replace, final_output, background_process	Always
Images	read_image, take_screenshot, extract_text	Always
Task Management	todo_read, todo_write	Always
Code Intelligence	code_search, code_coverage	Always
WebDriver	webdriver_* (12 tools)	`--webdriver` or `--chrome-headless`
Vision	vision_find_text, vision_click_text, vision_click_near_text	Always (macOS)
macOS Accessibility	macax_* (9 tools)	`--macax`
Computer Control	mouse_click, type_text, find_element, list_windows	`computer_control.enabled = true`

Core Tools

shell

Execute shell commands.

Parameters:

command (string, required): The shell command to execute

Example:

{"tool": "shell", "args": {"command": "ls -la"}}

Notes:

Commands run in the current working directory
Output is streamed in real-time
Both stdout and stderr are captured
Exit code is reported

background_process

Launch a long-running process in the background.

Parameters:

name (string, required): Unique name for the process (e.g., "game_server")
command (string, required): Shell command to execute
working_dir (string, optional): Working directory

Example:

{"tool": "background_process", "args": {"name": "dev_server", "command": "npm run dev"}}

Returns: PID and log file path

Notes:

Process runs independently of the agent
Logs are captured to a file
Use shell to read logs (tail), check status (ps), or stop (kill)

read_file

Read file contents with optional character range.

Parameters:

file_path (string, required): Path to the file
start (integer, optional): Starting character position (0-indexed, inclusive)
end (integer, optional): Ending character position (0-indexed, exclusive)

Example:

{"tool": "read_file", "args": {"file_path": "src/main.rs", "start": 0, "end": 1000}}

Notes:

For image files (png, jpg, gif, etc.), automatically extracts text using OCR
Supports tilde expansion (~)
Reports file size and line count

read_image

Read image files for visual analysis by the LLM.

Parameters:

file_paths (array of strings, required): Paths to image files

Example:

{"tool": "read_image", "args": {"file_paths": ["screenshot.png", "diagram.jpg"]}}

Supported formats: PNG, JPEG, GIF, WebP

Notes:

Images are sent to the LLM for visual analysis
Use for inspecting sprites, UI screenshots, diagrams, etc.
Different from extract_text which only does OCR

write_file

Create or overwrite a file.

Parameters:

file_path (string, required): Path to the file
content (string, required): Content to write

Example:

{"tool": "write_file", "args": {"file_path": "hello.txt", "content": "Hello, world!"}}

Notes:

Creates parent directories if needed
Overwrites existing files
Reports bytes written

str_replace

Apply a unified diff to a file.

Parameters:

file_path (string, required): Path to the file
diff (string, required): Unified diff with context lines
start (integer, optional): Starting character position to constrain search
end (integer, optional): Ending character position to constrain search

Example:

{"tool": "str_replace", "args": {
  "file_path": "src/main.rs",
  "diff": "@@ -10,3 +10,4 @@\n fn main() {\n     println!(\"Hello\");\n+    println!(\"World\");\n }"
}}

Notes:

Supports multiple hunks
Context lines help locate the correct position
Use start/end to disambiguate when multiple matches exist
---/+++ headers are optional for minimal diffs

final_output

Signal task completion with a summary.

Parameters:

summary (string, required): Markdown summary of what was accomplished

Example:

{"tool": "final_output", "args": {"summary": "## Completed\n\n- Created user authentication module\n- Added unit tests\n- Updated documentation"}}

Notes:

Ends the current task
Summary is displayed to the user
In autonomous mode, triggers coach review

Image & Screenshot Tools

take_screenshot

Capture a screenshot of an application window.

Parameters:

path (string, required): Filename for the screenshot
window_id (string, required): Application name (e.g., "Safari", "Terminal")
region (object, optional): {x, y, width, height} to capture a region

Example:

{"tool": "take_screenshot", "args": {"path": "safari.png", "window_id": "Safari"}}

Notes:

Use list_windows first to identify available windows
Relative paths save to ~/tmp or $TMPDIR
Uses native screencapture on macOS

extract_text

Extract text from an image using OCR.

Parameters:

path (string, optional): Path to image file

Example:

{"tool": "extract_text", "args": {"path": "screenshot.png"}}

Notes:

Uses Tesseract OCR or Apple Vision framework
For window-based OCR, use vision_find_text instead

Task Management Tools

todo_read

Read the current TODO list.

Parameters: None

Example:

{"tool": "todo_read", "args": {}}

Notes:

TODO lists are session-scoped
Stored in .g3/sessions/<session_id>/todo.g3.md
Call at start of multi-step tasks to check for existing plans

todo_write

Create or update the TODO list.

Parameters:

content (string, required): TODO list content in markdown checkbox format

Example:

{"tool": "todo_write", "args": {"content": "- [ ] Implement feature\n  - [ ] Write tests\n  - [ ] Update docs\n- [x] Setup project"}}

Notes:

Replaces entire file content
Always call todo_read first to preserve existing content
Use - [ ] for incomplete, - [x] for complete
Supports nested tasks with indentation

Code Intelligence Tools

code_search

Syntax-aware code search using tree-sitter.

Parameters:

searches (array, required): Array of search objects:
- name (string): Label for this search
- query (string): Tree-sitter query in S-expression format
- language (string): Programming language
- paths (array, optional): Paths to search
- context_lines (integer, optional): Lines of context (0-20)
max_concurrency (integer, optional): Parallel searches (default: 4)
max_matches_per_search (integer, optional): Max matches (default: 500)

Supported languages: rust, python, javascript, typescript, go, java, c, cpp, kotlin

Example:

{"tool": "code_search", "args": {
  "searches": [{
    "name": "functions",
    "query": "(function_item name: (identifier) @name)",
    "language": "rust",
    "context_lines": 2
  }]
}}

See Code Search Guide for detailed query patterns.

code_coverage

Generate code coverage report using cargo llvm-cov.

Parameters: None

Example:

{"tool": "code_coverage", "args": {}}

Notes:

Runs all tests with coverage instrumentation
Auto-installs llvm-tools-preview and cargo-llvm-cov if missing
Returns coverage statistics summary

WebDriver Tools

Enabled with --webdriver (Safari) or --chrome-headless (Chrome).

webdriver_start

Start a browser session.

Example:

{"tool": "webdriver_start", "args": {}}

webdriver_navigate

Navigate to a URL.

Parameters:

url (string, required): URL with protocol (e.g., https://)

webdriver_get_url / webdriver_get_title

Get current URL or page title.

webdriver_find_element / webdriver_find_elements

Find element(s) by CSS selector.

Parameters:

selector (string, required): CSS selector

webdriver_click

Click an element.

Parameters:

selector (string, required): CSS selector

webdriver_send_keys

Type text into an input.

Parameters:

selector (string, required): CSS selector
text (string, required): Text to type
clear_first (boolean, optional): Clear before typing (default: true)

webdriver_execute_script

Execute JavaScript.

Parameters:

script (string, required): JavaScript code (use return to return values)

webdriver_get_page_source

Get rendered HTML.

Parameters:

max_length (integer, optional): Max chars to return (default: 10000, 0 for no limit)
save_to_file (string, optional): Save to file instead of returning inline

webdriver_screenshot

Take browser screenshot.

Parameters:

path (string, required): Save path

webdriver_back / webdriver_forward / webdriver_refresh

Navigation controls.

webdriver_quit

Close browser and end session.

Vision Tools (macOS)

Use Apple Vision framework for text recognition.

vision_find_text

Find text in an application window.

Parameters:

app_name (string, required): Application name
text (string, required): Text to search for

Returns: Bounding box coordinates and confidence score

vision_click_text

Find and click on text.

Parameters:

app_name (string, required): Application name
text (string, required): Text to click

vision_click_near_text

Click near a text label (useful for form fields).

Parameters:

app_name (string, required): Application name
text (string, required): Label text to find
direction (string, optional): "right", "below", "left", "above" (default: "right")
distance (integer, optional): Pixels from text (default: 50)

macOS Accessibility Tools

Enabled with --macax. See macOS Accessibility Tools Guide.

macax_list_apps

List running applications.

macax_get_frontmost_app

Get the frontmost application.

macax_activate_app

Bring an application to front.

Parameters:

app_name (string, required): Application name

macax_get_ui_tree

Get UI element hierarchy.

Parameters:

app_name (string, required): Application name
max_depth (integer, optional): Tree depth limit

macax_find_elements

Find UI elements by criteria.

Parameters:

app_name (string, required): Application name
role (string, optional): Element role (button, textField, etc.)
title (string, optional): Element title
identifier (string, optional): Accessibility identifier

macax_click

Click a UI element.

Parameters:

app_name (string, required): Application name
identifier or title or role: Element selector

macax_set_value / macax_get_value

Set or get element value.

macax_press_key

Simulate key press.

Parameters:

key (string, required): Key to press
modifiers (array, optional): ["command", "shift", "option", "control"]

Computer Control Tools

Enabled with computer_control.enabled = true in config.

mouse_click

Click at coordinates.

Parameters:

x (integer, required): X coordinate
y (integer, required): Y coordinate
button (string, optional): "left", "right", "middle"

type_text

Type text at cursor.

Parameters:

text (string, required): Text to type

find_element

Find UI element by text, role, or attributes.

list_windows

List all open windows with IDs and titles.

Tool Execution Notes

Duplicate Detection

G3 prevents accidental duplicate tool calls:

Only immediately sequential identical calls are blocked
Text between tool calls resets detection
Tools can be reused throughout a session

Error Handling

Tool errors are reported back to the agent, which can:

Retry with different parameters
Try an alternative approach
Report the issue to the user

Working Directory

Tools execute in:

Directory specified by --codebase-fast-start if provided
Current working directory otherwise

File Paths

Tilde expansion (~) is supported
Relative paths are relative to working directory
Screenshots default to ~/tmp or $TMPDIR

12 KiB Raw Blame History

G3 Tools Reference

Purpose

Tool Categories

Core Tools

shell

background_process

read_file

read_image

write_file

str_replace

final_output

Image & Screenshot Tools

take_screenshot

extract_text

Task Management Tools

todo_read

todo_write

Code Intelligence Tools

code_search

code_coverage

WebDriver Tools

webdriver_start

webdriver_navigate

webdriver_get_url / webdriver_get_title

webdriver_find_element / webdriver_find_elements

webdriver_click

webdriver_send_keys

webdriver_execute_script

webdriver_get_page_source

webdriver_screenshot

webdriver_back / webdriver_forward / webdriver_refresh

webdriver_quit

Vision Tools (macOS)

vision_find_text

vision_click_text

vision_click_near_text

macOS Accessibility Tools

macax_list_apps

macax_get_frontmost_app

macax_activate_app

macax_get_ui_tree

macax_find_elements

macax_click

macax_set_value / macax_get_value

macax_press_key

Computer Control Tools

mouse_click

type_text

find_element

list_windows

Tool Execution Notes

Duplicate Detection

Error Handling

Working Directory

File Paths

12 KiB

Raw Blame History