Commit Graph

22 Commits

Author SHA1 Message Date
Dhanji R. Prasanna
01cb4f6691 fix: use consistent max_tokens defaults across providers
- Fix aliasing issue where resolve_max_tokens() used fallback_default_max_tokens
  (8192) instead of provider-specific defaults
- Update fallback_default_max_tokens from 8192 to 32000
- Set provider-specific max_tokens defaults:
  - Anthropic: 32000
  - OpenAI: 32000 (was 16000)
  - Databricks: 32000 (was 50000, now matches Anthropic as passthru)
  - Embedded: 2048
- Context window lengths unchanged:
  - OpenAI: 400,000
  - Anthropic: 200,000
  - Databricks (Claude): 200,000

This fixes the 'LLM response was cut off due to max_tokens limit' error
in agent mode that occurred because 8192 was being used instead of 32000.
2026-01-16 07:05:57 +05:30
Dhanji R. Prasanna
0ae1a13cdb feat: real-time tool call streaming indicator with blinking UI
- Add ToolParsingHint enum (Detected/Active/Complete) for UI feedback
- New UiWriter methods: print_tool_streaming_hint(), print_tool_streaming_active()
- Refactor ConsoleUiWriter state to use atomics in ParsingHintState
- Add tool_call_streaming field to CompletionChunk for provider hints
- Anthropic provider sends streaming hints when tool name detected
- New streaming helpers: make_tool_streaming_hint(), make_tool_streaming_active()

Parser improvements:
- Add is_json_invalidated() to detect false positive tool patterns
- Fix tool result poisoning when file contents contain partial JSON
- Unescaped newlines in strings or prose after JSON invalidates detection

User sees ' ● tool_name |' immediately when tool call starts streaming,
with blinking indicator while args are received.
2026-01-15 13:49:29 +05:30
Dhanji R. Prasanna
e301075666 Fix panic on multi-byte chars in filter_json buffer truncation
The buffer truncation code was slicing at a raw byte offset which could
land in the middle of a multi-byte character (like emojis), causing a
panic. Fixed by using char_indices() to find valid character boundaries.

Also added stop_reason field to CompletionChunk initializers in tests
to complete the stop_reason feature addition.

- Fix byte boundary panic in filter_json.rs line 327
- Add test for multi-byte character handling
- Update test files with missing stop_reason field
2026-01-09 15:20:57 +11:00
Dhanji R. Prasanna
2bf475960c refactor: extract shared streaming utilities module
Agent: carmack

Create crates/g3-providers/src/streaming.rs with shared helpers:
- decode_utf8_streaming(): Handle incomplete UTF-8 sequences in SSE streams
- is_incomplete_json_error(): Detect incomplete vs malformed JSON
- make_final_chunk(): Create finished completion chunks
- make_text_chunk(): Create text content chunks
- make_tool_chunk(): Create tool call chunks

Refactor anthropic.rs:
- Use shared decode_utf8_streaming (removes 15 lines of inline UTF-8 handling)
- Use make_final_chunk, make_text_chunk, make_tool_chunk helpers
- Reduces verbose CompletionChunk constructions throughout

Refactor databricks.rs:
- Remove local copies of streaming helpers (now uses shared module)
- Reduces duplication between providers

Net reduction: 118 lines removed, 16 lines added (including new module)
All tests pass. Behavior unchanged.
2026-01-07 12:48:07 +11:00
Dhanji R. Prasanna
bb63050779 refactor: improve readability of streaming and file ops code
Agent: carmack

databricks.rs:
- Extract ToolCallAccumulator struct to replace opaque (String, String, String) tuple
- Add decode_utf8_streaming() helper for cleaner UTF-8 handling
- Add is_incomplete_json_error() helper for JSON parse error detection
- Add make_final_chunk() helper to reduce duplication
- Add finalize_tool_calls() to convert accumulators to final format
- Refactor parse_streaming_response from ~270 lines to ~100 lines
- Reduce nesting depth from 8+ levels to 4 levels
- Use early returns and let-else for cleaner control flow

file_ops.rs:
- Replace repetitive if-let chains with declarative PATH_CONTENT_KEYS table
- Use match expression instead of nested if-else
- Reduce extract_path_and_content from 44 lines to 20 lines

All tests pass. Behavior unchanged.
2026-01-07 12:39:05 +11:00
Dhanji R. Prasanna
923def0ab2 Convert all INFO logs to DEBUG to reduce CLI noise
Converted ~77 info! macro calls to debug! across the codebase to prevent
log messages from interrupting the CLI experience during normal operation.
Users can still see these logs by setting RUST_LOG=debug if needed.

Affected crates:
- g3-cli
- g3-computer-control
- g3-console
- g3-core
- g3-ensembles
- g3-execution
- g3-providers
2025-12-22 16:27:35 +11:00
Jochen
ff8b3e7c7b Implement planning mode 2025-12-09 17:03:53 +11:00
Jochen
4aa84e2144 disable thinking if there is no token budget 2025-12-09 16:45:28 +11:00
Jochen
52f78653b4 add context window monitor
Writes the current context window to logs/current_context_window (uses a symlink to a session ID).

This PR was unfortunately generated by a different LLM and did a ton of superficial reformating, it's actually a fairly small and benign change, but I don't want to roll back everything. Hope that's ok.
2025-11-27 21:00:02 +11:00
Jochen
ad198a8501 add code exploration fast start
This tries to short-circuit multiple round-trips to llm for reading code.
It's a precursor to trying to context engineer tailored to specific tasks.
In initial experiments, it's only marginally faster than regular mode, and burns more tokens.
2025-11-25 22:51:32 +11:00
Jochen
9bffd8b1bf cache_control removed from databricks 2025-11-19 12:15:49 +11:00
Jochen
a150ba6a55 adds ttl to cache control 2025-11-18 23:23:49 +11:00
Jochen
296bf5a449 adds cache_control 2025-11-18 22:38:52 +11:00
Dhanji Prasanna
22a0090cdc fix unexpected EOF on streams 2025-11-04 16:28:41 +11:00
Dhanji Prasanna
d0ac222e2e more macax tooling 2025-10-24 10:45:24 +11:00
Dhanji Prasanna
c5d6fbef08 control commands 2025-10-22 22:14:12 +11:00
Dhanji Prasanna
bb90cc7826 some fixes 2025-10-14 12:44:02 +11:00
Dhanji Prasanna
062e6de63f fix for buffered messages at end, colorized context bars 2025-10-13 13:36:37 +11:00
Dhanji Prasanna
260c949576 token counting fixes 2025-10-09 12:11:21 +11:00
Dhanji Prasanna
bcba99ec6c auto refresh token 2025-10-04 17:32:48 +10:00
Dhanji Prasanna
4e64555008 max tokens fix for databricks 2025-09-29 06:45:53 +10:00
Dhanji Prasanna
c490228824 databricks support 2025-09-27 17:28:02 +10:00