Commit Graph

69 Commits

Author SHA1 Message Date
Dhanji R. Prasanna
a93ce932a3 refactor: Clean up Cargo dependencies - remove unused, update outdated
- Remove unused const_format from g3-planner (never imported)
- Remove unused thiserror from workspace and 5 crates (declared but never used)
- Update termimad 0.31 -> 0.34 in studio (consistency with g3-cli)
- Update indicatif 0.17 -> 0.18 in g3-cli
- Update ratatui 0.29 -> 0.30 in g3-cli
- Update walkdir 2.4 -> 2.5 in g3-core
- Update image 0.24 -> 0.25 in g3-computer-control (macOS + Linux)
- Update config 0.14 -> 0.15 in workspace

Blocked: reqwest 0.11 -> 0.12/0.13 requires breaking API changes to
bytes_stream() used in 4 providers - needs separate migration effort.

All tests pass. No behavior changes.

Agent: fowler
2026-02-06 14:22:59 +11:00
Dhanji R. Prasanna
abfac197ab Add datalog-based invariant verification system
Implement a new datalog verification layer using datafrog that:

- Compiles rulespec to datalog on plan_approve
- Extracts facts from action envelope using selectors
- Executes datalog rules on plan_verify
- Writes evaluation results to datalog_evaluation.txt (shadow mode)

Key components:
- crates/g3-core/src/tools/datalog.rs: Full datalog module with:
  - compile_rulespec(): Validates and compiles rulespec
  - extract_facts(): Extracts facts from envelope YAML
  - execute_rules(): Runs datafrog iteration
  - 23 comprehensive tests

- crates/g3-core/src/tools/plan.rs:
  - execute_plan_approve(): Now compiles rulespec on approval
  - shadow_datalog_verify(): Runs datalog and writes to eval file

Results are written to .g3/sessions/<id>/datalog_evaluation.txt
for inspection, NOT injected into context window (shadow mode).
2026-02-06 13:50:54 +11:00
Dhanji R. Prasanna
571188305a feat: add compact UI output for Plan Mode tools
Plan tools (plan_read, plan_write) now display with elegant tree-style
formatting similar to the old todo_write UI:

- State indicators: □ (todo), ◐ (doing), ■ (done), ⊘ (blocked)
- Tree prefixes (├/└) for items with child details
- Strikethrough for completed items
- Shows touches and all three checks (happy/negative/boundary)
- Displays plan file path link at the end

plan_approve uses compact single-line format like read_file:
- Shows approval status and revision number
- Handles already-approved and error cases

Changes:
- Add print_plan_compact() to UiWriter trait with default impl
- Implement print_plan_compact() in ConsoleUiWriter
- Call print_plan_compact() from execute_plan_read/write
- Add plan_read/plan_write to is_self_handled_tool()
- Add plan_approve to is_compact_tool() with format_plan_approve_summary()
- Add serde_yaml dependency to g3-cli
2026-02-02 15:30:05 +11:00
Dhanji R. Prasanna
a902be1562 Refactor system prompts to eliminate duplication; upgrade embedded provider
- Refactor prompts.rs: extract shared sections (intro, TODO, workspace memory,
  web research, response guidelines) used by both native and non-native prompts
- Fix typo in native prompt: "save them.." -> "save them."
- Fix non-native prompt: add missing closing braces in JSON examples,
  add IMPORTANT steps section, align with native prompt quality
- Add 9 unit tests to verify both prompts contain required sections
- Upgrade llama-cpp-2 dependency and refactor embedded provider
- Update config.example.toml with embedded model examples
- Update workspace memory
2026-01-28 09:56:39 +11:00
Dhanji R. Prasanna
dd3db0227d Add tab completion for commands and file paths
Implement tab completion in interactive mode using rustyline:

- Command completion: /<TAB> shows all commands, /com<TAB> -> /compact
- File path completion: /run <TAB> completes file/directory paths
- Supports tilde expansion for home directory

Architecture is extensible for future semantic completions:
- /resume <TAB> -> session IDs (Phase 2)
- /rehydrate <TAB> -> fragment IDs (Phase 2)

New module: completion.rs with G3Helper struct implementing
rustyline's Completer trait.
2026-01-20 10:57:33 +05:30
Dhanji R. Prasanna
4c6878a63d Set process title to agent name in agent mode
When running g3 --agent butler, the process title is now "g3 [butler]"
which shows up in ps, Activity Monitor, top, etc.

Uses the proctitle crate for cross-platform support.
2026-01-16 14:37:58 +05:30
Dhanji R. Prasanna
65e0217c68 Add unit tests for studio session management
New tests:
- test_new_session_has_short_id
- test_new_interactive_session
- test_branch_name_format
- test_session_save_and_load
- test_session_mark_complete
- test_session_mark_paused
- test_list_empty_sessions
- test_backwards_compatibility_no_session_type

Added tempfile as dev dependency for temp directory tests.
2026-01-16 06:52:23 +05:30
Dhanji R. Prasanna
151b8c4658 Add Racket tree-sitter support, remove Kotlin
- Add tree-sitter-racket dependency (v0.24)
- Initialize Racket parser in code search
- Add .rkt, .rktl, .rktd file extensions
- Add test_racket_search test
- Remove Kotlin from supported languages (was disabled)
- Clean up duplicate test files

Supported languages: Rust, Python, JavaScript, TypeScript, Go, Java, C, C++, Racket
2026-01-13 18:44:59 +05:30
Dhanji R. Prasanna
9a3b03a41f Remove flock mode (superseded by studio)
Flock mode has been superseded by the studio multi-agent workspace manager.

Changes:
- Remove g3-ensembles crate entirely
- Remove --project, --flock-workspace, --segments, --flock-max-turns CLI flags
- Remove run_flock_mode() from autonomous.rs
- Remove flock-related tests from cli_integration_test.rs
- Update README.md, docs/architecture.md, analysis/memory.md
- Delete docs/FLOCK_MODE.md
2026-01-13 15:01:12 +05:30
Dhanji R. Prasanna
f30f145c85 Fix UTF-8 panics and inconsistent retry logic
- Fix 7 UTF-8 byte slicing panics that crash on multi-byte characters:
  - acd.rs: extract_topic_from_text() [..50] slice
  - streaming.rs: log_stream_error() [..500] slice
  - tools/acd.rs: rehydrate message truncation [..2000] slice
  - history.rs: git commit message truncation [..69] slice
  - planner.rs: commit summary/description truncation [..69] slices
  - llm.rs: requirements summary line truncation [..117] slice

- All now use chars().count() and chars().take(N).collect() for
  UTF-8 safe truncation

- Fix inconsistent retry logic in task_execution.rs:
  - Previously only retried on Timeout errors
  - Now retries on ALL recoverable errors (rate limits, network,
    server errors, model busy, token limits, context length)
  - Added error-specific base delays (rate limit: 5s, server: 2s, etc.)
  - Added exponential backoff with ±20% jitter
  - Consistent with autonomous mode retry behavior
2026-01-13 05:49:45 +05:30
Dhanji R. Prasanna
30bb63715e Fix studio status to show full markdown-formatted summary
Changes:
- Fix JSON path for session logs: now reads from context_window.conversation_history
  (with fallback to messages for backwards compatibility)
- Remove 500-character truncation to show full summary
- Add termimad dependency for terminal markdown rendering
- Display summary with proper markdown formatting (headers, bold, code, lists)

The extract_session_summary() function was looking for messages at the wrong
JSON path. Session logs store conversation history at context_window.conversation_history,
not at the top-level messages key.
2026-01-12 10:13:58 +05:30
Dhanji R. Prasanna
6c17f269d7 Add studio tool for multi-agent workspace management
Studio enables running multiple g3 agents concurrently without conflicts
by using git worktrees for isolation.

Features:
- studio run --agent <name> [args...]: Create worktree, spawn g3, tail output
- studio list: Show all active sessions
- studio status <id>: Show session details and summary
- studio accept <id>: Merge session branch to main and cleanup
- studio discard <id>: Delete session without merging

Each session gets:
- Isolated worktree at .worktrees/sessions/<agent>/<session-id>
- Dedicated branch: sessions/<agent>/<session-id>
- Short UUID (8 chars) for easy reference
- Automatic --workspace and --agent flags passed to g3
2026-01-12 07:26:17 +05:30
Dhanji R. Prasanna
1090e30d6c Simplify system prompt: remove coding style and parallel tool call sections
- Remove IMPORTANT FOR CODING section (~1,500 chars of coding guidelines)
- Remove <use_parallel_tool_calls> block (~500 chars)
- Remove unused const_format dependency from g3-core
- Simplify get_system_prompt_for_native() to just return base prompt
- Response Guidelines now cleanly ends the static prompt

Prompt reduced from ~8,500 to ~6,500 characters.
2026-01-11 06:35:18 +08:00
Dhanji R. Prasanna
9bef7753bf Add Chrome headless diagnostic tool
Runs automatically when --chrome-headless flag is used, checking:
- ChromeDriver installation and PATH
- Chrome/Chromium installation
- Chrome and ChromeDriver version compatibility
- config.toml chrome_binary setting
- Chrome for Testing installation
- ChromeDriver executable permissions (macOS quarantine)

Displays a detailed report with:
- Summary of detected versions and paths
- Pass/warning/error status for each check
- Specific fix suggestions for any issues found

Users can then ask g3 to help fix any detected issues.
2026-01-10 20:44:23 +11:00
Dhanji R. Prasanna
347513b04c Add comprehensive stress tests for streaming markdown formatter
Add 10 stress tests covering:
- Nested formatting (bold in italic, italic in bold)
- Empty/minimal content edge cases
- Escape sequences and special characters
- Lists with complex inline formatting
- Links with various content types
- Tables with formatting in cells
- Code blocks (should not format contents)
- Mixed block elements (headers, quotes, rules)
- Nested lists (3+ levels, mixed types)
- Pathological/adversarial inputs (unbalanced delimiters, unicode, long lines)

All 45 tests pass.
2026-01-08 20:27:28 +11:00
Dhanji R. Prasanna
775bcd10a5 chore: remove g3-console crate entirely
The g3-console crate was not referenced by any other crate in the
workspace and appears to be an abandoned web console implementation.

Removed:
- crates/g3-console/ (entire directory)
- Workspace member entry in Cargo.toml

Agent: fowler
2026-01-07 10:41:46 +11:00
Dhanji R. Prasanna
3601cc0547 Enhance read_image tool with magic byte detection and multi-image support
- Fix media type detection using magic bytes instead of file extension
  - Correctly identifies JPEG files with .png extension (and vice versa)
  - Supports PNG, JPEG, GIF, and WebP formats

- Add multi-image support with file_paths array parameter
  - Load multiple images in a single tool call
  - All images queued for LLM analysis

- Enhanced CLI output:
  - Inline image preview via iTerm2 imgcat protocol (height=5)
  - Dimmed info line showing: path | dimensions | media type | file size
  - Proper │ prefix alignment with tool output boxing
  - Human-readable file sizes (bytes, KB, MB)

- Add image dimension extraction from file headers
  - PNG, JPEG, GIF, WebP dimension parsing

- Add comprehensive tests for magic byte detection and dimensions
2025-12-26 11:19:37 +11:00
Dhanji R. Prasanna
01a5284d6d Move fixed_filter_json from g3-core to g3-cli
Properly separates UI display concern from core library:
- fixed_filter_json module now lives in g3-cli (UI layer)
- UiWriter trait gains filter_json_tool_calls() and reset_json_filter() methods
- g3-core delegates filtering to UI layer via trait methods
- Different UiWriter implementations can choose their own filtering behavior
- ConsoleUiWriter filters JSON tool calls for clean terminal display
- MachineUiWriter/NullUiWriter use default pass-through

Benefits:
- Proper separation of concerns
- Core stays clean without display-specific logic
- Testability - filter can be tested independently in g3-cli
2025-12-22 10:32:21 +11:00
Jochen
ff8b3e7c7b Implement planning mode 2025-12-09 17:03:53 +11:00
Jochen
0327a6dfdf make sure coach feedback is extracted. 2025-12-02 22:00:58 +11:00
Jochen
52f78653b4 add context window monitor
Writes the current context window to logs/current_context_window (uses a symlink to a session ID).

This PR was unfortunately generated by a different LLM and did a ton of superficial reformating, it's actually a fairly small and benign change, but I don't want to roll back everything. Hope that's ok.
2025-11-27 21:00:02 +11:00
Jochen
93dc4acf86 generate internal id (debugging only)
NOT set to provider... Anthropic will reject a message with id
2025-11-27 18:30:42 +11:00
Dhanji Prasanna
4cfa0147ca first cut of horizontal partitioning
# Conflicts:
#	Cargo.lock

# Conflicts:
#	Cargo.lock
#	crates/g3-cli/src/lib.rs
2025-11-26 17:12:07 +11:00
Jochen
1e1702001c Add logging for discovery 2025-11-26 10:41:35 +11:00
Jochen
ad198a8501 add code exploration fast start
This tries to short-circuit multiple round-trips to llm for reading code.
It's a precursor to trying to context engineer tailored to specific tasks.
In initial experiments, it's only marginally faster than regular mode, and burns more tokens.
2025-11-25 22:51:32 +11:00
Jochen
28a83d2dcf check for stale TODOs
on by default, can be disabled
2025-11-21 12:09:01 +11:00
Jochen
09dbad2d68 allow multiple tool calls, log warnings if there are duplicate calls.
controlled via a flag to the agent config:
allow_multiple_tool_calls = true
2025-11-21 10:49:15 +11:00
Jochen
7f73b664a3 system prompt now includes code style guide 2025-11-18 18:21:16 +11:00
Dhanji Prasanna
aaf918828f g3 console initial cut + error doesnt kill auto 2025-11-07 09:27:13 +11:00
Dhanji R. Prasanna
8eda691cb1 todo persistence 2025-11-06 15:24:57 +11:00
Dhanji R. Prasanna
53c8245942 fixes for scheme+haskell 2025-11-05 14:33:12 +11:00
Dhanji R. Prasanna
4327c839a9 added scheme and kotlin to code_search 2025-11-05 14:17:15 +11:00
Dhanji R. Prasanna
fa38439a06 adding more languages to tree-sitter (java, go, cpp,..) 2025-11-05 14:07:50 +11:00
Dhanji R. Prasanna
f25a3d5e06 tree-sitter replaces ast-grep 2025-11-05 13:56:23 +11:00
Jochen
ad9ba5e5d8 added ast-grep use
g3 tool use of ast-grep command with batching for faster code exploration.
2025-11-01 14:59:55 +11:00
Michael Neale
aa4a0267ea can interrupt now 2025-10-29 13:29:03 +11:00
Dhanji Prasanna
4b1694b308 machine mode 2025-10-28 14:51:32 +11:00
Dhanji Prasanna
61d748034d replace tesseract with apple vision 2025-10-24 15:35:47 +11:00
Dhanji Prasanna
3ec65e38ee macax tools 2025-10-23 06:53:42 +11:00
Jochen
010a43d203 coach/player provider split + add OpenAI
Allows coach and player LLM providers to be separately specified.
Also adds OpenAI provider
2025-10-21 16:59:13 +11:00
Dhanji Prasanna
393826ae02 webdriver tools 2025-10-21 14:34:41 +11:00
Dhanji Prasanna
9d35449be8 ~ expansion for read_file and str_replace 2025-10-18 16:01:15 +11:00
Dhanji Prasanna
da652bf287 computer control tools 2025-10-18 14:16:50 +11:00
Dhanji Prasanna
627fdcd9bf streaming tool call attempt 1 2025-10-13 20:25:12 +11:00
Dhanji Prasanna
e6cec5ef0f retry on errors 2025-10-07 11:20:19 +11:00
Dhanji Prasanna
9b7c228134 scroll hack 2025-10-04 15:05:06 +10:00
Dhanji Prasanna
57b1b51e65 retro mode ui! 2025-10-02 14:47:19 +10:00
Dhanji Prasanna
e324ddd99d hopefully a bit better tool call detection 2025-10-02 10:27:58 +10:00
Dhanji Prasanna
046b54c49b move embedded provider to a better crate 2025-10-01 15:19:37 +10:00
Dhanji Prasanna
1621d081ec tui lib for nicer cli 2025-10-01 11:19:34 +10:00