use consistent naming for compaction

commit 5bfaee8dd5
parent 3776ed847e
Author: Dhanji R. Prasanna
Date:   2026-01-08 12:54:03 +11:00

13 changed files with 86 additions and 89 deletions

View File

@@ -14,7 +14,7 @@ The agent follows a **tool-first philosophy**: instead of just providing advice,
 4. **Modularity**: Clear separation of concerns
 5. **Composability**: Components can be combined in different ways
 6. **Performance**: Built in Rust for speed and reliability
-7. **Context Intelligence**: Smart context window management with auto-summarization
+7. **Context Intelligence**: Smart context window management with auto-compaction
 8. **Error Resilience**: Robust error handling with automatic retry logic
 ## Project Structure
@@ -87,7 +87,7 @@ g3/
 - Error handling with automatic retry logic
 **Key Features:**
-- **Context Window Intelligence**: Automatic monitoring with percentage-based tracking (80% capacity triggers auto-summarization)
+- **Context Window Intelligence**: Automatic monitoring with percentage-based tracking (80% capacity triggers auto-compaction)
 - **Tool System**: Built-in tools for file operations (read, write, edit), shell commands, and structured output
 - **Streaming Parser**: Real-time parsing of LLM responses with tool call detection and execution
 - **Session Management**: Automatic session logging with detailed conversation history and token usage
@@ -402,7 +402,7 @@ This design document reflects the current state of G3 as a mature, production-re
 - **Configuration**: TOML-based config with environment overrides
 - **Error Handling**: Comprehensive retry logic and error classification
 - **Session Logging**: Automatic session tracking and JSON logs
-- **Context Management**: Context thinning (50-80%) and auto-summarization at 80% capacity
+- **Context Management**: Context thinning (50-80%) and auto-compaction at 80% capacity
 - **Computer Control**: Cross-platform automation with OCR support
 - **TODO Management**: In-memory TODO list with read/write tools

View File

@@ -11,7 +11,7 @@ G3 follows a modular architecture organized as a Rust workspace with multiple cr
 #### **g3-core**
 The heart of the agent system, containing:
 - **Agent Engine**: Main orchestration logic for handling conversations, tool execution, and task management
-- **Context Window Management**: Intelligent tracking of token usage with context thinning (50-80%) and auto-summarization at 80% capacity
+- **Context Window Management**: Intelligent tracking of token usage with context thinning (50-80%) and auto-compaction at 80% capacity
 - **Tool System**: Built-in tools for file operations, shell commands, computer control, TODO management, and structured output
 - **Streaming Response Parser**: Real-time parsing of LLM responses with tool call detection and execution
 - **Task Execution**: Support for single and iterative task execution with automatic retry logic
@@ -80,14 +80,14 @@ After each response, G3 displays a timing footer showing elapsed time, time to f
 ### Intelligent Context Management
 - Automatic context window monitoring with percentage-based tracking
-- Smart auto-summarization when approaching token limits
+- Smart auto-compaction when approaching token limits
 - **Context thinning** at 50%, 60%, 70%, 80% thresholds - automatically replaces large tool results with file references
 - Conversation history preservation through summaries
 - Dynamic token allocation for different providers (4k to 200k+ tokens)
 ### Interactive Control Commands
 G3's interactive CLI includes control commands for manual context management:
-- **`/compact`**: Manually trigger summarization to compact conversation history
+- **`/compact`**: Manually trigger compaction of the conversation history
 - **`/thinnify`**: Manually trigger context thinning to replace large tool results with file references
 - **`/skinnify`**: Manually trigger full context thinning (like `/thinnify` but processes the entire context window, not just the first third)
 - **`/readme`**: Reload README.md and AGENTS.md from disk without restarting
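Read together, the thinning thresholds and the 80% compaction trigger form a simple escalation policy. A minimal Rust sketch of that policy, using illustrative stand-in types and method bodies rather than g3's actual `ContextWindow` API:

```rust
// Sketch of the escalation policy described above. `Context` and its
// methods are illustrative stand-ins, not g3's real API.
struct Context {
    used_tokens: usize,
    total_tokens: usize,
}

impl Context {
    fn percentage_used(&self) -> f64 {
        self.used_tokens as f64 / self.total_tokens as f64 * 100.0
    }

    fn manage(&mut self) {
        let pct = self.percentage_used();
        if pct >= 80.0 {
            self.compact(); // auto-compaction: summarize and reset history
        } else if pct >= 50.0 {
            self.thin(); // 50/60/70/80% thresholds: swap tool results for file refs
        }
    }

    fn thin(&mut self) {
        // Would replace large tool results with file references.
    }

    fn compact(&mut self) {
        // Would summarize the conversation and reset the window to that
        // summary, preserving history through summaries as noted above.
    }
}
```

The `/compact`, `/thinnify`, and `/skinnify` commands appear to invoke these same paths manually instead of waiting for the thresholds.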

View File

@@ -3,7 +3,7 @@ SYSTEM PROMPT — “Carmack” (In-Code Readability & Craft Agent)
 You are Carmack: a code-aware readability agent, inspired by John Carmack.
 You work **inside source code files only — ever.**
-Your job is to make complex logic understandable to humans and code a joy to read.
+Your job is to simplify, making code easy to understand and a joy to read.
 ------------------------------------------------------------
 PRIME DIRECTIVE
@@ -18,7 +18,7 @@ PRIME DIRECTIVE
 - Non-negotiable nudge:
 **Readable code > commented code.**
-You remain disciplined inside the source. Do NOT touch docs, READMEs, etc.
+Stay inside the source. Do NOT touch docs, READMEs, etc.
 ------------------------------------------------------------
 ALLOWED ACTIVITIES
@@ -26,16 +26,14 @@ ALLOWED ACTIVITIES
 LOCAL REFACTORS (behavior-preserving, BUT aggressively readability improving):
 - Rename private functions/variables for legibility
-- Extract overly long functions into smaller helpers
-- Simplify nested conditionals
-- Clarify data shapes and invariants
-- Replace clever tricks with plain constructs
-- Improve existing explanations
 - Pull out constants, interfaces, structs for readability
+- Simplify nested control flow and conditionals
+- Return well-defined structs over tuples/vectors
+- Extract overly long functions and files into smaller helpers/components
 - If files are larger than 1000 lines, refactor them into smaller pieces
 - If functions are longer than 250 lines, refactor them
-EXPLANATION (only when needed):
+ADD EXPLANATIONS (when needed):
 - Describe non-obvious algorithms in a short header comment sketch
 - Explain macros, protocols, serializers, hotspot systems, briefly
@@ -48,9 +46,9 @@ EXPLICIT BANS
 You MUST NOT:
-- Modify system architecture or layering
+- Modify system architecture
 - Change public APIs, CLI flags, or file formats
-- Add per-line explanatory comments to **obvious** code
+- Add explanatory comments to **obvious** code
 - Introduce mocks or new libraries
 ------------------------------------------------------------
@@ -61,9 +59,8 @@ Your output is successful if:
 - the code is pure joy to read for a skilled programmer
 - Humans can understand complex regions faster
 - A correct file becomes more pleasant to modify
-- Control flow straightens
+- Files get smaller, more modular, composable, easy to trace
 - Behavior is unchanged
-- No architecture or external docs were touched
 ------------------------------------------------------------
 CARMACK PREFLIGHT CHECKLIST

View File

@@ -1666,7 +1666,7 @@ async fn run_interactive<W: UiWriter>(
"/help" => { "/help" => {
output.print(""); output.print("");
output.print("📖 Control Commands:"); output.print("📖 Control Commands:");
output.print(" /compact - Trigger auto-summarization (compacts conversation history)"); output.print(" /compact - Trigger compaction (compacts conversation history)");
output.print(" /thinnify - Trigger context thinning (replaces large tool results with file references)"); output.print(" /thinnify - Trigger context thinning (replaces large tool results with file references)");
output.print(" /skinnify - Trigger full context thinning (like /thinnify but for entire context, not just first third)"); output.print(" /skinnify - Trigger full context thinning (like /thinnify but for entire context, not just first third)");
output.print(" /clear - Clear session and start fresh (discards continuation artifacts)"); output.print(" /clear - Clear session and start fresh (discards continuation artifacts)");
@@ -1680,17 +1680,17 @@ async fn run_interactive<W: UiWriter>(
     continue;
 }
 "/compact" => {
-    output.print("🗜️ Triggering manual summarization...");
-    match agent.force_summarize().await {
+    output.print("🗜️ Triggering manual compaction...");
+    match agent.force_compact().await {
         Ok(true) => {
-            output.print("Summarization completed successfully");
+            output.print("Compaction completed successfully");
         }
         Ok(false) => {
-            output.print("⚠️ Summarization failed");
+            output.print("⚠️ Compaction failed");
         }
         Err(e) => {
             output.print(&format!(
-                "❌ Error during summarization: {}",
+                "❌ Error during compaction: {}",
                 e
             ));
         }
@@ -1909,9 +1909,9 @@ async fn run_interactive_machine(
 match input.as_str() {
     "/compact" => {
         println!("COMMAND: compact");
-        match agent.force_summarize().await {
-            Ok(true) => println!("RESULT: Summarization completed"),
-            Ok(false) => println!("RESULT: Summarization failed"),
+        match agent.force_compact().await {
+            Ok(true) => println!("RESULT: Compaction completed"),
+            Ok(false) => println!("RESULT: Compaction failed"),
             Err(e) => println!("ERROR: {}", e),
         }
         continue;

View File

@@ -321,7 +321,7 @@ impl UiWriter for ConsoleUiWriter {
 fn print_final_output(&self, summary: &str) {
     // Show spinner while "formatting"
     let spinner_frames = ['⠋', '⠙', '⠹', '⠸', '⠼', '⠴', '⠦', '⠧', '⠇', '⠏'];
-    let message = "summarizing work done...";
+    let message = "compacting work done...";
     // Brief spinner animation (about 0.5 seconds)
     for i in 0..5 {

View File

@@ -183,8 +183,8 @@ impl ContextWindow {
     self.total_tokens.saturating_sub(self.used_tokens)
 }
-/// Check if we should trigger summarization (at 80% capacity)
-pub fn should_summarize(&self) -> bool {
+/// Check if we should trigger compaction (at 80% capacity)
+pub fn should_compact(&self) -> bool {
     // Trigger at 80% OR if we're getting close to absolute limits
     // This prevents issues with models that have large contexts but still hit limits
     let percentage_trigger = self.percentage_used() >= 80.0;
@@ -744,19 +744,19 @@ mod tests {
 }
 #[test]
-fn test_should_summarize_at_80_percent() {
+fn test_should_compact_at_80_percent() {
     let mut cw = ContextWindow::new(100);
     cw.used_tokens = 79;
-    assert!(!cw.should_summarize());
+    assert!(!cw.should_compact());
     cw.used_tokens = 80;
-    assert!(cw.should_summarize());
+    assert!(cw.should_compact());
 }
 #[test]
-fn test_should_summarize_at_absolute_limit() {
+fn test_should_compact_at_absolute_limit() {
     let mut cw = ContextWindow::new(1_000_000);
     cw.used_tokens = 150_001;
-    assert!(cw.should_summarize());
+    assert!(cw.should_compact());
 }
 #[test]
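The two tests above pin down both triggers: a percentage trigger at 80% and an absolute cap that fires at 150_001 used tokens even in a 1M-token window. A free-function reconstruction of that logic (the 150_000 constant is inferred from the test, not confirmed elsewhere in the diff):

```rust
// Reconstruction of should_compact from the tests above; the absolute cap
// of 150_000 tokens is inferred from test_should_compact_at_absolute_limit
// and may differ from g3's actual constant.
fn should_compact(used_tokens: usize, total_tokens: usize) -> bool {
    let percentage_used = used_tokens as f64 / total_tokens as f64 * 100.0;
    // Trigger at 80% of the window, OR past an absolute token count so that
    // models with very large windows still compact before provider limits.
    percentage_used >= 80.0 || used_tokens > 150_000
}

fn main() {
    assert!(!should_compact(79, 100)); // below 80%
    assert!(should_compact(80, 100)); // percentage trigger
    assert!(should_compact(150_001, 1_000_000)); // absolute trigger
}
```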

View File

@@ -181,7 +181,7 @@ pub enum RecoverableError {
 ModelBusy,
 /// Timeout
 Timeout,
-/// Token limit exceeded (might be recoverable with summarization)
+/// Token limit exceeded (might be recoverable with compaction)
 TokenLimit,
 /// Context length exceeded (prompt too long) - should end current turn in autonomous mode
 ContextLengthExceeded,
@@ -357,7 +357,7 @@ where
 // Special handling for token limit errors
 if matches!(recoverable_type, RecoverableError::TokenLimit) {
-    debug!("Token limit error detected. Consider triggering summarization.");
+    debug!("Token limit error detected. Consider triggering compaction.");
 }
 tokio::time::sleep(delay).await;
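For orientation, the `debug!` call above sits inside a retry loop that classifies errors and sleeps between attempts. A minimal sketch of that shape, assuming tokio; the enum subset, backoff policy, and function signature are illustrative, not g3's actual retry module:

```rust
use std::time::Duration;

#[derive(Debug, PartialEq)]
enum RecoverableError {
    ModelBusy,
    Timeout,
    TokenLimit,
}

// Sketch: retry an async operation with exponential backoff, logging a
// hint when the failure is a token-limit error (mirroring the hunk above).
async fn with_retry<F, Fut, T>(mut op: F, max_attempts: u32) -> Result<T, RecoverableError>
where
    F: FnMut() -> Fut,
    Fut: std::future::Future<Output = Result<T, RecoverableError>>,
{
    let mut delay = Duration::from_millis(500);
    let mut attempt = 0;
    loop {
        attempt += 1;
        match op().await {
            Ok(value) => return Ok(value),
            Err(e) if attempt < max_attempts => {
                if e == RecoverableError::TokenLimit {
                    eprintln!("token limit hit; consider triggering compaction");
                }
                tokio::time::sleep(delay).await;
                delay *= 2; // exponential backoff between attempts
            }
            Err(e) => return Err(e),
        }
    }
}
```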

View File

@@ -92,9 +92,9 @@ pub struct Agent<W: UiWriter> {
 providers: ProviderRegistry,
 context_window: ContextWindow,
 thinning_events: Vec<usize>, // chars saved per thinning event
-pending_90_summarization: bool, // flag to trigger summarization at 90%
+pending_90_compaction: bool, // flag to trigger compaction at 90%
 auto_compact: bool, // whether to auto-compact at 90% before tool calls
-summarization_events: Vec<usize>, // chars saved per summarization event
+compaction_events: Vec<usize>, // chars saved per compaction event
 first_token_times: Vec<Duration>, // time to first token for each completion
 config: Config,
 session_id: Option<String>,
@@ -267,9 +267,9 @@ impl<W: UiWriter> Agent<W> {
 providers,
 context_window,
 auto_compact: config.agent.auto_compact,
-pending_90_summarization: false,
+pending_90_compaction: false,
 thinning_events: Vec::new(),
-summarization_events: Vec::new(),
+compaction_events: Vec::new(),
 first_token_times: Vec::new(),
 config,
 session_id: None,
@@ -856,15 +856,15 @@ impl<W: UiWriter> Agent<W> {
 self.save_context_window("completed");
 // Check if we need to do 90% auto-compaction
-if self.pending_90_summarization {
+if self.pending_90_compaction {
     self.ui_writer
         .print_context_status("\n⚡ Context window reached 90% - auto-compacting...\n");
-    if let Err(e) = self.force_summarize().await {
+    if let Err(e) = self.force_compact().await {
         warn!("Failed to auto-compact at 90%: {}", e);
     } else {
         self.ui_writer.println("");
     }
-    self.pending_90_summarization = false;
+    self.pending_90_compaction = false;
 }
 // Return the task result which already includes timing if needed
@@ -940,13 +940,13 @@ impl<W: UiWriter> Agent<W> {
     }
 }
-/// Manually trigger context summarization regardless of context window size
-/// Returns Ok(true) if summarization was successful, Ok(false) if it failed
-pub async fn force_summarize(&mut self) -> Result<bool> {
-    debug!("Manual summarization triggered");
+/// Manually trigger context compaction regardless of context window size
+/// Returns Ok(true) if compaction was successful, Ok(false) if it failed
+pub async fn force_compact(&mut self) -> Result<bool> {
+    debug!("Manual compaction triggered");
     self.ui_writer.print_context_status(&format!(
-        "\n🗜️ Manual summarization requested (current usage: {}%)...",
+        "\n🗜️ Manual compaction requested (current usage: {}%)...",
         self.context_window.percentage_used() as u32
     ));
@@ -1048,7 +1048,7 @@ impl<W: UiWriter> Agent<W> {
 let chars_saved = self
     .context_window
     .reset_with_summary(summary_response.content, latest_user_msg);
-self.summarization_events.push(chars_saved);
+self.compaction_events.push(chars_saved);
 Ok(true)
 }
@@ -1238,17 +1238,17 @@ impl<W: UiWriter> Agent<W> {
 }
 stats.push_str(&format!(
-    "Summarizations: {:>10}\n",
-    self.summarization_events.len()
+    "Compactions: {:>10}\n",
+    self.compaction_events.len()
 ));
-if !self.summarization_events.is_empty() {
-    let total_summarized: usize = self.summarization_events.iter().sum();
-    let avg_summarized = total_summarized / self.summarization_events.len();
+if !self.compaction_events.is_empty() {
+    let total_compacted: usize = self.compaction_events.iter().sum();
+    let avg_compacted = total_compacted / self.compaction_events.len();
     stats.push_str(&format!(
         " • Total Chars Saved: {:>10}\n",
-        total_summarized
+        total_compacted
     ));
-    stats.push_str(&format!(" • Avg Chars/Event: {:>10}\n", avg_summarized));
+    stats.push_str(&format!(" • Avg Chars/Event: {:>10}\n", avg_compacted));
 }
 stats.push('\n');
@@ -1604,9 +1604,9 @@ impl<W: UiWriter> Agent<W> {
 // Note: Session-level duplicate tracking was removed - we only prevent sequential duplicates (DUP IN CHUNK, DUP IN MSG)
 let mut turn_accumulated_usage: Option<g3_providers::Usage> = None; // Track token usage for timing footer
-// Check if we need to summarize before starting
-if self.context_window.should_summarize() {
-    // First try thinning if we are at capacity, don't call the LLM for a summary (might fail)
+// Check if we need to compact before starting
+if self.context_window.should_compact() {
+    // First try thinning if we are at capacity; don't call the LLM for compaction (might fail)
     if self.context_window.percentage_used() > 90.0 && self.context_window.should_thin() {
         self.ui_writer.print_context_status(&format!(
             "\n🥒 Context window at {}%. Trying thinning first...",
@@ -1617,23 +1617,23 @@ impl<W: UiWriter> Agent<W> {
 self.ui_writer.print_context_thinning(&thin_summary);
 // Check if thinning was sufficient
-if !self.context_window.should_summarize() {
+if !self.context_window.should_compact() {
     self.ui_writer.print_context_status(
         "✅ Thinning resolved capacity issue. Continuing...\n",
     );
-    // Continue with the original request without summarization
+    // Continue with the original request without compaction
 } else {
     self.ui_writer.print_context_status(
-        "⚠️ Thinning insufficient. Proceeding with summarization...\n",
+        "⚠️ Thinning insufficient. Proceeding with compaction...\n",
     );
 }
 }
-// Only proceed with summarization if still needed after thinning
-if self.context_window.should_summarize() {
-    // Notify user about summarization
+// Only proceed with compaction if still needed after thinning
+if self.context_window.should_compact() {
+    // Notify user about compaction
     self.ui_writer.print_context_status(&format!(
-        "\n🗜️ Context window reaching capacity ({}%). Creating summary...",
+        "\n🗜️ Context window reaching capacity ({}%). Compacting...",
         self.context_window.percentage_used() as u32
     ));
@@ -1735,17 +1735,17 @@ impl<W: UiWriter> Agent<W> {
 let chars_saved = self
     .context_window
     .reset_with_summary(summary_response.content, latest_user_msg);
-self.summarization_events.push(chars_saved);
+self.compaction_events.push(chars_saved);
 // Update the request with new context
 request.messages = self.context_window.conversation_history.clone();
 }
 Err(e) => {
     error!("Failed to create summary: {}", e);
-    self.ui_writer.print_context_status("⚠️ Unable to create summary. Consider starting a new session if you continue to see errors.\n");
-    // Don't continue with the original request if summarization failed
+    self.ui_writer.print_context_status("⚠️ Unable to compact context. Consider starting a new session if you continue to see errors.\n");
+    // Don't continue with the original request if compaction failed
     // as we're likely at token limit
-    return Err(anyhow::anyhow!("Context window at capacity and summarization failed. Please start a new session."));
+    return Err(anyhow::anyhow!("Context window at capacity and compaction failed. Please start a new session."));
 }
 }
 }
@@ -1963,9 +1963,9 @@ impl<W: UiWriter> Agent<W> {
 // Check if we should auto-compact at 90% BEFORE executing the tool
 // We need to do this before any borrows of self
 if self.auto_compact && self.context_window.percentage_used() >= 90.0 {
-    // Set flag to trigger summarization after this turn completes
+    // Set flag to trigger compaction after this turn completes
     // We can't do it now due to borrow checker constraints
-    self.pending_90_summarization = true;
+    self.pending_90_compaction = true;
 }
 // Check if we should thin the context BEFORE executing the tool
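Taken together, these hunks sketch a pre-turn pipeline: past 90%, try cheap thinning first (no LLM call); compact only if the window is still over threshold; and defer auto-compaction triggered mid-turn via a flag, since compacting during tool execution would fight the borrow checker. A condensed sketch of that control flow with simplified stand-ins, not g3's actual `Agent`:

```rust
// Condensed stand-in for the pre-turn flow in the hunks above; the
// numbers mirror the documented thresholds, the method bodies are fakes.
struct Agent {
    pct_used: f64,
    auto_compact: bool,
    pending_90_compaction: bool,
}

impl Agent {
    fn should_compact(&self) -> bool { self.pct_used >= 80.0 }
    fn should_thin(&self) -> bool { self.pct_used >= 50.0 }

    fn thin(&mut self) {
        // Stand-in: replacing large tool results with file references
        // reclaims part of the window.
        self.pct_used *= 0.8;
    }

    fn compact(&mut self) -> Result<(), String> {
        // Stand-in: summarize-and-reset shrinks the history drastically.
        self.pct_used = 10.0;
        Ok(())
    }

    fn before_turn(&mut self) -> Result<(), String> {
        if self.should_compact() {
            // Past 90%, try the cheap option first: thinning needs no LLM call.
            if self.pct_used > 90.0 && self.should_thin() {
                self.thin();
            }
            // Compact only if thinning didn't bring us back under threshold;
            // a failure here ends the session, as the error path above shows.
            if self.should_compact() {
                self.compact()?;
            }
        }
        Ok(())
    }

    fn before_tool_call(&mut self) {
        // Can't compact mid-turn (borrow constraints), so set a flag and
        // compact after the turn completes, as pending_90_compaction does.
        if self.auto_compact && self.pct_used >= 90.0 {
            self.pending_90_compaction = true;
        }
    }
}
```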

View File

@@ -2,7 +2,7 @@ use g3_core::ContextWindow;
 use g3_providers::{Message, MessageRole, Usage};
 /// Test that used_tokens is tracked via add_message, not update_usage_from_response.
-/// This is critical for the 80% summarization threshold to work correctly.
+/// This is critical for the 80% compaction threshold to work correctly.
 #[test]
 fn test_used_tokens_tracked_via_messages() {
     let mut window = ContextWindow::new(10000);
@@ -106,10 +106,10 @@ fn test_percentage_based_on_used_tokens() {
     assert!(window.remaining_tokens() < 1000, "remaining tokens should decrease");
 }
-/// Test that the 80% summarization threshold works correctly.
+/// Test that the 80% compaction threshold works correctly.
 /// This was the original bug - used_tokens was being double/triple counted.
 #[test]
-fn test_should_summarize_threshold() {
+fn test_should_compact_threshold() {
     let mut window = ContextWindow::new(1000);
     // Add messages until we approach 80%
@@ -131,9 +131,9 @@ fn test_should_summarize_threshold() {
     let percentage_after = window.percentage_used();
     println!("After 10 messages: {}% used ({} tokens)", percentage_after, window.used_tokens);
-    // Now should_summarize should return true if we're at 80%+
+    // Now should_compact should return true if we're at 80%+
     if percentage_after >= 80.0 {
-        assert!(window.should_summarize(), "should_summarize should be true at 80%+");
+        assert!(window.should_compact(), "should_compact should be true at 80%+");
     }
 }

View File

@@ -11,7 +11,7 @@ Control commands are special commands you can use during an interactive G3 sessi
 | Command | Description |
 |---------|-------------|
-| `/compact` | Manually trigger conversation summarization |
+| `/compact` | Manually trigger conversation compaction |
 | `/thinnify` | Replace large tool results with file references (first third) |
 | `/skinnify` | Full context thinning (entire context window) |
 | `/readme` | Reload README.md and AGENTS.md from disk |
@@ -22,7 +22,7 @@ Control commands are special commands you can use during an interactive G3 sessi
 ## /compact
-Manually trigger conversation summarization to reduce context size.
+Manually trigger conversation compaction to reduce context size.
 **When to use**:
 - Context usage is getting high (70%+)
@@ -30,7 +30,7 @@ Manually trigger conversation summarization to reduce context size.
 - Conversation has accumulated irrelevant history
 **What it does**:
-1. Sends conversation history to LLM for summarization
+1. Sends the conversation history to the LLM for compaction
 2. Replaces detailed history with a concise summary
 3. Preserves key decisions and context
 4. Significantly reduces token usage
@@ -144,7 +144,7 @@ Show detailed context and performance statistics.
 - Session duration
 - Token usage breakdown
 - Tool call metrics
-- Thinning and summarization events
+- Thinning and compaction events
 - First-token latency statistics
 **Example**:
@@ -198,7 +198,7 @@ When context gets high:
 1. **50-70%**: Consider `/thinnify`
 2. **70-80%**: Use `/compact`
 3. **80-90%**: Use `/skinnify` then `/compact`
-4. **90%+**: Auto-summarization triggers
+4. **90%+**: Auto-compaction triggers
 ### Best Practices
@@ -218,7 +218,7 @@ G3 performs automatic context management:
 | 50% | Thin oldest third of context |
 | 60% | Thin oldest third of context |
 | 70% | Thin oldest third of context |
-| 80% | Auto-summarization (if `auto_compact = true`) |
+| 80% | Auto-compaction (if `auto_compact = true`) |
 | 90% | Aggressive thinning before tool calls |
 Manual commands give you finer control over when and how this happens.
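The `auto_compact` toggle in the table lives in the agent configuration; the docs elsewhere also mention `max_context_length` under `[agent]`. A plausible config snippet, with those two key names taken from the docs and the values purely illustrative:

```toml
# Sketch of the relevant [agent] settings; key names appear in the docs,
# but the values shown here are illustrative, not confirmed defaults.
[agent]
auto_compact = true          # trigger auto-compaction at the 80% threshold
max_context_length = 128000  # context window budget, in tokens
```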

View File

@@ -289,7 +289,7 @@ The `ContextWindow` struct manages conversation history with intelligent token t
 1. **Token Tracking**: Monitors usage as percentage of provider's context limit
 2. **Context Thinning**: At 50%, 60%, 70%, 80% thresholds, replaces large tool results with file references
-3. **Auto-Summarization**: At 80% capacity, triggers conversation summarization
+3. **Auto-Compaction**: At 80% capacity, triggers conversation compaction
 4. **Provider Adaptation**: Adjusts to different model context windows (4k to 200k+ tokens)
 ## Error Handling

View File

@@ -376,5 +376,5 @@ For Databricks OAuth:
 If you see context overflow errors:
 1. Check `max_context_length` in `[agent]`
-2. Use `/compact` command to manually summarize
+2. Use the `/compact` command to manually compact
 3. Use `/thinnify` to replace large tool results with file references

View File

@@ -386,7 +386,7 @@ To reduce rate limit issues:
 ### Context Window Errors
 If you see "context too long" errors:
-1. Use `/compact` to summarize conversation
+1. Use `/compact` to compact the conversation
 2. Use `/thinnify` to replace large tool results
 3. Increase `max_context_length` in config
 4. Switch to a provider with a larger context