Remove vision tools (except take_screenshot) and macax tools

Vision tools removed:
- extract_text (OCR from image files)
- extract_text_with_boxes (OCR with bounding boxes)
- vision_find_text (find text in app windows)
- vision_click_text (find and click on text)
- vision_click_near_text (click near text labels)

macax tools removed:
- macax_list_apps
- macax_get_frontmost_app
- macax_activate_app
- macax_press_key
- macax_type_text

The LLM can now read images directly via read_image tool.
take_screenshot is retained for capturing application windows.

Files deleted:
- crates/g3-core/src/tools/vision.rs
- crates/g3-core/src/tools/macax.rs
- docs/macax-tools.md

Updated tool counts: 12 core + 15 webdriver = 27 total
This commit is contained in:
Dhanji R. Prasanna
2026-01-03 17:38:25 +11:00
parent 29e263ac49
commit 386176899e
19 changed files with 15 additions and 1408 deletions

View File

@@ -106,7 +106,6 @@ g3/
- `type_text`: Type text at the current cursor position
- `find_element`: Find UI elements by text, role, or attributes
- `take_screenshot`: Capture screenshots of screen, region, or window
- `extract_text`: Extract text from images or screen regions using OCR
- `find_text_on_screen`: Find text visually on screen and return coordinates
- `list_windows`: List all open windows with IDs and titles