alex/g3 - g3 - Millerson GIT hosting

alex/g3

Author	SHA1	Message	Date
Dhanji R. Prasanna	5b4079e861	Add prompt cache statistics tracking to /stats command - Extend Usage struct with cache_creation_tokens and cache_read_tokens fields - Parse Anthropic cache_creation_input_tokens and cache_read_input_tokens - Parse OpenAI prompt_tokens_details.cached_tokens for automatic prefix caching - Add CacheStats struct to Agent for cumulative tracking across API calls - Add "Prompt Cache Statistics" section to /stats output showing: - API call count and cache hit count - Hit rate percentage - Total input tokens and cache read/creation tokens - Cache efficiency (% of input served from cache) - Update all provider implementations and test files	2026-01-27 11:32:45 +11:00
Dhanji R. Prasanna	01cb4f6691	fix: use consistent max_tokens defaults across providers - Fix aliasing issue where resolve_max_tokens() used fallback_default_max_tokens (8192) instead of provider-specific defaults - Update fallback_default_max_tokens from 8192 to 32000 - Set provider-specific max_tokens defaults: - Anthropic: 32000 - OpenAI: 32000 (was 16000) - Databricks: 32000 (was 50000, now matches Anthropic as passthru) - Embedded: 2048 - Context window lengths unchanged: - OpenAI: 400,000 - Anthropic: 200,000 - Databricks (Claude): 200,000 This fixes the 'LLM response was cut off due to max_tokens limit' error in agent mode that occurred because 8192 was being used instead of 32000.	2026-01-16 07:05:57 +05:30
Dhanji R. Prasanna	0ae1a13cdb	feat: real-time tool call streaming indicator with blinking UI - Add ToolParsingHint enum (Detected/Active/Complete) for UI feedback - New UiWriter methods: print_tool_streaming_hint(), print_tool_streaming_active() - Refactor ConsoleUiWriter state to use atomics in ParsingHintState - Add tool_call_streaming field to CompletionChunk for provider hints - Anthropic provider sends streaming hints when tool name detected - New streaming helpers: make_tool_streaming_hint(), make_tool_streaming_active() Parser improvements: - Add is_json_invalidated() to detect false positive tool patterns - Fix tool result poisoning when file contents contain partial JSON - Unescaped newlines in strings or prose after JSON invalidates detection User sees ' ● tool_name \|' immediately when tool call starts streaming, with blinking indicator while args are received.	2026-01-15 13:49:29 +05:30
Dhanji R. Prasanna	e301075666	Fix panic on multi-byte chars in filter_json buffer truncation The buffer truncation code was slicing at a raw byte offset which could land in the middle of a multi-byte character (like emojis), causing a panic. Fixed by using char_indices() to find valid character boundaries. Also added stop_reason field to CompletionChunk initializers in tests to complete the stop_reason feature addition. - Fix byte boundary panic in filter_json.rs line 327 - Add test for multi-byte character handling - Update test files with missing stop_reason field	2026-01-09 15:20:57 +11:00
Dhanji R. Prasanna	2bf475960c	refactor: extract shared streaming utilities module Agent: carmack Create crates/g3-providers/src/streaming.rs with shared helpers: - decode_utf8_streaming(): Handle incomplete UTF-8 sequences in SSE streams - is_incomplete_json_error(): Detect incomplete vs malformed JSON - make_final_chunk(): Create finished completion chunks - make_text_chunk(): Create text content chunks - make_tool_chunk(): Create tool call chunks Refactor anthropic.rs: - Use shared decode_utf8_streaming (removes 15 lines of inline UTF-8 handling) - Use make_final_chunk, make_text_chunk, make_tool_chunk helpers - Reduces verbose CompletionChunk constructions throughout Refactor databricks.rs: - Remove local copies of streaming helpers (now uses shared module) - Reduces duplication between providers Net reduction: 118 lines removed, 16 lines added (including new module) All tests pass. Behavior unchanged.	2026-01-07 12:48:07 +11:00
Dhanji R. Prasanna	bb63050779	refactor: improve readability of streaming and file ops code Agent: carmack databricks.rs: - Extract ToolCallAccumulator struct to replace opaque (String, String, String) tuple - Add decode_utf8_streaming() helper for cleaner UTF-8 handling - Add is_incomplete_json_error() helper for JSON parse error detection - Add make_final_chunk() helper to reduce duplication - Add finalize_tool_calls() to convert accumulators to final format - Refactor parse_streaming_response from ~270 lines to ~100 lines - Reduce nesting depth from 8+ levels to 4 levels - Use early returns and let-else for cleaner control flow file_ops.rs: - Replace repetitive if-let chains with declarative PATH_CONTENT_KEYS table - Use match expression instead of nested if-else - Reduce extract_path_and_content from 44 lines to 20 lines All tests pass. Behavior unchanged.	2026-01-07 12:39:05 +11:00
Dhanji R. Prasanna	923def0ab2	Convert all INFO logs to DEBUG to reduce CLI noise Converted ~77 info! macro calls to debug! across the codebase to prevent log messages from interrupting the CLI experience during normal operation. Users can still see these logs by setting RUST_LOG=debug if needed. Affected crates: - g3-cli - g3-computer-control - g3-console - g3-core - g3-ensembles - g3-execution - g3-providers	2025-12-22 16:27:35 +11:00
Jochen	ff8b3e7c7b	Implement planning mode	2025-12-09 17:03:53 +11:00
Jochen	4aa84e2144	disable thinking if there is no token budget	2025-12-09 16:45:28 +11:00
Jochen	52f78653b4	add context window monitor Writes the current context window to logs/current_context_window (uses a symlink to a session ID). This PR was unfortunately generated by a different LLM and did a ton of superficial reformating, it's actually a fairly small and benign change, but I don't want to roll back everything. Hope that's ok.	2025-11-27 21:00:02 +11:00
Jochen	ad198a8501	add code exploration fast start This tries to short-circuit multiple round-trips to llm for reading code. It's a precursor to trying to context engineer tailored to specific tasks. In initial experiments, it's only marginally faster than regular mode, and burns more tokens.	2025-11-25 22:51:32 +11:00
Jochen	9bffd8b1bf	cache_control removed from databricks	2025-11-19 12:15:49 +11:00
Jochen	a150ba6a55	adds ttl to cache control	2025-11-18 23:23:49 +11:00
Jochen	296bf5a449	adds cache_control	2025-11-18 22:38:52 +11:00
Dhanji Prasanna	22a0090cdc	fix unexpected EOF on streams	2025-11-04 16:28:41 +11:00
Dhanji Prasanna	d0ac222e2e	more macax tooling	2025-10-24 10:45:24 +11:00
Dhanji Prasanna	c5d6fbef08	control commands	2025-10-22 22:14:12 +11:00
Dhanji Prasanna	bb90cc7826	some fixes	2025-10-14 12:44:02 +11:00
Dhanji Prasanna	062e6de63f	fix for buffered messages at end, colorized context bars	2025-10-13 13:36:37 +11:00
Dhanji Prasanna	260c949576	token counting fixes	2025-10-09 12:11:21 +11:00
Dhanji Prasanna	bcba99ec6c	auto refresh token	2025-10-04 17:32:48 +10:00
Dhanji Prasanna	4e64555008	max tokens fix for databricks	2025-09-29 06:45:53 +10:00
Dhanji Prasanna	c490228824	databricks support	2025-09-27 17:28:02 +10:00

23 Commits