some fixes

 DESIGN.md | 66
--- a/DESIGN.md
+++ b/DESIGN.md
@@ -1,4 +1,4 @@
-# G3 General Purpose AI Agent - Design Document
+# G3 - AI Coding Agent - Design Document
 
 ## Overview
 
@@ -8,7 +8,7 @@ The agent follows a **tool-first philosophy**: instead of just providing advice,
 
 ## Core Principles
 
-1. **Tool-First Philosophy**: Solve problems by actively using tools rather than just describing solutions
+1. **Tool-First Philosophy**: Solve problems by actively using tools rather than just providing advice
 2. **Modular Architecture**: Clear separation of concerns across multiple Rust crates
 3. **Provider Flexibility**: Support multiple LLM providers through a unified interface
 4. **Modularity**: Clear separation of concerns
@@ -23,11 +23,11 @@ G3 is organized as a Rust workspace with the following crates:
 
 ```
 g3/
-├── src/main.rs          # Main entry point
+├── src/main.rs          # Main entry point (delegates to g3-cli)
 ├── crates/
-│   ├── g3-cli/          # Command-line interface and TUI
-│   ├── g3-core/         # Core agent engine and logic
-│   ├── g3-providers/    # LLM provider abstractions
+│   ├── g3-cli/          # Command-line interface, TUI, and retro mode
+│   ├── g3-core/         # Core agent engine, tools, and streaming logic
+│   ├── g3-providers/    # LLM provider abstractions and implementations
 │   ├── g3-config/       # Configuration management
 │   └── g3-execution/    # Code execution engine
 ├── logs/                # Session logs (auto-created)
@@ -74,7 +74,7 @@ g3/
 - Error handling with automatic retry logic
 
 **Key Features:**
-- **Context Window Intelligence**: Automatic monitoring with percentage-based tracking (~80% capacity triggers auto-summarization)
+- **Context Window Intelligence**: Automatic monitoring with percentage-based tracking (80% capacity triggers auto-summarization)
 - **Tool System**: Built-in tools for file operations (read, write, edit), shell commands, and structured output
 - **Streaming Parser**: Real-time parsing of LLM responses with tool call detection and execution
 - **Session Management**: Automatic session logging with detailed conversation history and token usage
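The 80% auto-summarization trigger in the hunk above is a simple percentage check. A minimal sketch of such a threshold test (the function name and integer-based comparison are illustrative assumptions, not G3's actual API):

```rust
/// Returns true when the conversation should be auto-summarized.
/// The design document sets the trigger at 80% of the context window.
fn should_summarize(used_tokens: u64, context_window: u64) -> bool {
    // Integer arithmetic avoids floating-point comparison;
    // the guard protects against a zero-sized window.
    context_window > 0 && used_tokens * 100 >= context_window * 80
}

fn main() {
    // A 200k-token window would trigger at 160k used tokens.
    assert!(!should_summarize(159_999, 200_000));
    assert!(should_summarize(160_000, 200_000));
    println!("ok");
}
```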
@@ -86,6 +86,7 @@ g3/
 - `write_file`: Create or overwrite files with content
 - `str_replace`: Apply unified diffs to files with precise editing
 - `final_output`: Signal task completion with detailed summaries
+- **Project Management**: Workspace handling, requirements.md processing for autonomous mode
 
 ### 2. g3-providers: LLM Provider Abstraction
 
@@ -97,7 +98,7 @@ g3/
 
 **Supported Providers:**
 - **Anthropic**: Claude models via API with native tool calling support
-- **Databricks**: Foundation Model APIs with OAuth and token-based authentication
+- **Databricks**: Foundation Model APIs with OAuth and token-based authentication (default provider)
 - **Embedded**: Local models via llama.cpp with GPU acceleration (Metal/CUDA)
 - **Provider Registry**: Dynamic provider management and hot-swapping
 
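The provider registry named in the hunk above could be sketched as a name-keyed map of trait objects with a configurable default. All names here are illustrative assumptions; G3's real provider trait is richer (streaming, tool calls, authentication):

```rust
use std::collections::HashMap;

// Illustrative trait; stands in for G3's real LLM provider abstraction.
trait LlmProvider {
    fn name(&self) -> &str;
}

struct Anthropic;
impl LlmProvider for Anthropic {
    fn name(&self) -> &str { "anthropic" }
}

/// Registry keyed by provider name, with a default used when none is requested.
struct ProviderRegistry {
    providers: HashMap<String, Box<dyn LlmProvider>>,
    default: String,
}

impl ProviderRegistry {
    fn new(default: &str) -> Self {
        Self { providers: HashMap::new(), default: default.to_string() }
    }

    fn register(&mut self, p: Box<dyn LlmProvider>) {
        self.providers.insert(p.name().to_string(), p);
    }

    /// Lookup happens by name at call time, so re-registering a provider
    /// ("hot-swapping") takes effect on the next request.
    fn get(&self, name: Option<&str>) -> Option<&dyn LlmProvider> {
        self.providers.get(name.unwrap_or(&self.default)).map(|b| b.as_ref())
    }
}

fn main() {
    let mut registry = ProviderRegistry::new("anthropic");
    registry.register(Box::new(Anthropic));
    assert_eq!(registry.get(None).unwrap().name(), "anthropic");
    assert!(registry.get(Some("missing")).is_none());
    println!("ok");
}
```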
@@ -119,7 +120,7 @@ g3/
 
 **Execution Modes:**
 - **Single-shot**: Execute one task and exit
-- **Interactive**: REPL-style conversation with the agent
+- **Interactive**: REPL-style conversation with the agent (default mode)
 - **Autonomous**: Coach-player feedback loop for complex projects
 - **Retro TUI**: Full-screen terminal interface with real-time updates
 
@@ -139,11 +140,10 @@ g3/
 - Multi-language code execution support
 - Error handling and result formatting
 
-**Supported Languages:**
-- **Bash/Shell**: Direct command execution with streaming output
-- **Python**: Script execution via temporary files
-- **JavaScript**: Node.js-based execution
-- **Extensible**: Framework for adding additional language support
+**Supported Execution:**
+- **Bash/Shell**: Direct command execution with streaming output (primary use case)
+- **Python**: Script execution via temporary files (legacy support)
+- **JavaScript**: Node.js-based execution (legacy support)
 
 **Key Features:**
 - **Streaming Output**: Real-time command output display
@@ -161,7 +161,7 @@ g3/
 - CLI argument integration
 
 **Configuration Hierarchy:**
-1. Default configuration (embedded in code)
+1. Default configuration (Databricks provider with OAuth)
 2. Configuration files (`~/.config/g3/config.toml`, `./g3.toml`)
 3. Environment variables (`G3_*`)
 4. CLI arguments (highest priority)
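The four-level precedence in the hunk above (CLI beats environment beats config files beats built-in default) reduces to a chain of fallbacks. A minimal sketch, assuming a single string-valued setting; G3's real config types differ:

```rust
/// Resolve one setting across the configuration hierarchy:
/// later layers override earlier ones.
fn resolve<'a>(
    default: &'a str,
    file: Option<&'a str>,
    env: Option<&'a str>,
    cli: Option<&'a str>,
) -> &'a str {
    // CLI argument > environment variable > config file > built-in default.
    cli.or(env).or(file).unwrap_or(default)
}

fn main() {
    // Nothing set: the built-in default wins.
    assert_eq!(resolve("databricks", None, None, None), "databricks");
    // A config file overrides the default, but a CLI flag overrides everything.
    assert_eq!(resolve("databricks", Some("anthropic"), None, Some("embedded")), "embedded");
    println!("ok");
}
```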
@@ -216,7 +216,7 @@ Advanced autonomous operation with coach-player feedback:
 
 ## Provider Comparison
 
-| Feature | Anthropic | Databricks | Embedded |
+| Feature | Anthropic | Databricks (Default) | Embedded |
 |---------|-----------|------------|----------|
 | **Cost** | Pay per token | Pay per token | Free after download |
 | **Privacy** | Data sent to API | Data sent to API | Completely local |
@@ -242,7 +242,7 @@ max_tokens = 8192
 temperature = 0.1
 ```
 
-### Enterprise Setup (Databricks)
+### Enterprise Setup (Databricks - Default)
 ```toml
 [providers]
 default_provider = "databricks"
@@ -314,7 +314,7 @@ g3 --retro --theme dracula
 # Full-screen terminal interface
 ```
 
-## Future Enhancements
+## Implementation Details
 
 ### Planned Features
 - **Plugin System**: Custom tool and provider plugins
@@ -341,10 +341,38 @@ g3 --retro --theme dracula
 - **Testing**: Unit tests, integration tests, and property-based testing
 
 ### Performance Considerations
-- **Async-First**: All I/O operations are asynchronous
+- **Async-First**: All I/O operations are asynchronous (Tokio runtime)
 - **Streaming**: Real-time response processing where possible
 - **Memory Efficiency**: Careful memory management for large contexts
 - **Caching**: Strategic caching of expensive operations
 - **Profiling**: Regular performance profiling and optimization
 
 This design document reflects the current state of G3 as a mature, production-ready AI coding agent with sophisticated architecture and comprehensive feature set.
+
+## Current Implementation Status
+
+### Fully Implemented
+- ✅ **Core Agent Engine**: Complete with streaming, tool execution, and context management
+- ✅ **Provider System**: Anthropic, Databricks, and Embedded providers with OAuth support
+- ✅ **Tool System**: All 5 core tools (shell, read_file, write_file, str_replace, final_output)
+- ✅ **CLI Interface**: Interactive mode, single-shot mode, retro TUI
+- ✅ **Autonomous Mode**: Coach-player feedback loop with requirements.md processing
+- ✅ **Configuration**: TOML-based config with environment overrides
+- ✅ **Error Handling**: Comprehensive retry logic and error classification
+- ✅ **Session Logging**: Automatic session tracking and JSON logs
+- ✅ **Context Management**: Auto-summarization at 80% capacity
+
+### Architecture Highlights
+- **Workspace**: 5 crates with clear separation of concerns
+- **Dependencies**: Modern Rust ecosystem (Tokio, Clap, Serde, etc.)
+- **Streaming**: Real-time response processing with tool call detection
+- **Cross-Platform**: Works on macOS, Linux, and Windows
+- **GPU Support**: Metal acceleration for local models on macOS
+
+### Key Files
+- `src/main.rs`: 6-line entry point delegating to g3-cli
+- `crates/g3-core/src/lib.rs`: 2953 lines - main agent implementation
+- `crates/g3-cli/src/lib.rs`: 1354 lines - CLI and interaction modes
+- `crates/g3-providers/src/lib.rs`: 144 lines - provider trait and registry
+- `crates/g3-config/src/lib.rs`: 265 lines - configuration management
+- `crates/g3-execution/src/lib.rs`: 284 lines - code execution engine
@@ -149,7 +149,6 @@ impl UiWriter for ConsoleUiWriter {
     }
 
     fn print_agent_prompt(&self) {
-        print!("  ");
        let _ = io::stdout().flush();
     }
 
@@ -274,22 +274,37 @@ impl AnthropicProvider {
         let mut current_tool_calls: Vec<ToolCall> = Vec::new();
         let mut partial_tool_json = String::new(); // Accumulate partial JSON for tool calls
         let mut accumulated_usage: Option<Usage> = None;
+        let mut byte_buffer = Vec::new(); // Buffer for incomplete UTF-8 sequences
 
         while let Some(chunk_result) = stream.next().await {
             match chunk_result {
                 Ok(chunk) => {
-                    let chunk_str = match std::str::from_utf8(&chunk) {
-                        Ok(s) => s,
+                    // Append new bytes to our buffer
+                    byte_buffer.extend_from_slice(&chunk);
+
+                    // Try to convert the entire buffer to UTF-8
+                    let chunk_str = match std::str::from_utf8(&byte_buffer) {
+                        Ok(s) => {
+                            // Successfully converted entire buffer, clear it and use the string
+                            let result = s.to_string();
+                            byte_buffer.clear();
+                            result
+                        }
                         Err(e) => {
-                            error!("Invalid UTF-8 in stream chunk: {}", e);
-                            let _ = tx
-                                .send(Err(anyhow!("Invalid UTF-8 in stream chunk: {}", e)))
-                                .await;
-                            return accumulated_usage;
+                            // Check if this is an incomplete sequence at the end
+                            let valid_up_to = e.valid_up_to();
+                            if valid_up_to > 0 {
+                                // We have some valid UTF-8, extract it and keep the rest for next iteration
+                                let valid_bytes = byte_buffer.drain(..valid_up_to).collect::<Vec<_>>();
+                                std::str::from_utf8(&valid_bytes).unwrap().to_string()
+                            } else {
+                                // No valid UTF-8 at all, skip this chunk and continue
+                                continue;
+                            }
                         }
                     };
 
-                    buffer.push_str(chunk_str);
+                    buffer.push_str(&chunk_str);
 
                     // Process complete lines
                     while let Some(line_end) = buffer.find('\n') {
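The buffering strategy introduced in this hunk can be exercised in isolation: bytes arriving in arbitrary network chunks are accumulated, only the valid UTF-8 prefix is decoded, and the trailing bytes of a multi-byte character split across chunks wait in the buffer. A self-contained sketch (the helper name is illustrative, not G3's API; a production version would also inspect `Utf8Error::error_len` to discard genuinely invalid bytes rather than buffering them forever):

```rust
/// Decode the valid UTF-8 prefix of `byte_buffer`, leaving any
/// incomplete trailing multi-byte sequence buffered for the next chunk.
fn drain_valid_utf8(byte_buffer: &mut Vec<u8>) -> String {
    match std::str::from_utf8(byte_buffer) {
        Ok(s) => {
            // Entire buffer is valid: take it all.
            let out = s.to_string();
            byte_buffer.clear();
            out
        }
        Err(e) => {
            // Take only the bytes known to be valid UTF-8.
            let valid_up_to = e.valid_up_to();
            let valid: Vec<u8> = byte_buffer.drain(..valid_up_to).collect();
            // Safe: from_utf8 just reported these bytes as valid.
            String::from_utf8(valid).unwrap()
        }
    }
}

fn main() {
    let mut buf = Vec::new();
    // "é" is 0xC3 0xA9; simulate it being split across two chunks.
    buf.extend_from_slice(&[b'c', b'a', b'f', 0xC3]);
    assert_eq!(drain_valid_utf8(&mut buf), "caf"); // 0xC3 stays buffered
    buf.extend_from_slice(&[0xA9]);
    assert_eq!(drain_valid_utf8(&mut buf), "é");
    println!("ok");
}
```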
@@ -299,6 +299,7 @@ impl DatabricksProvider {
             std::collections::HashMap::new(); // index -> (id, name, args)
         let mut incomplete_data_line = String::new(); // Buffer for incomplete data: lines
         let accumulated_usage: Option<Usage> = None;
+        let mut byte_buffer = Vec::new(); // Buffer for incomplete UTF-8 sequences
 
         while let Some(chunk_result) = stream.next().await {
             match chunk_result {
@@ -306,29 +307,42 @@ impl DatabricksProvider {
                     // Debug: Log raw bytes received
                     debug!("Raw SSE bytes received: {} bytes", chunk.len());
 
-                    let chunk_str = match std::str::from_utf8(&chunk) {
+                    // Append new bytes to our buffer
+                    byte_buffer.extend_from_slice(&chunk);
+
+                    // Try to convert the entire buffer to UTF-8
+                    let chunk_str = match std::str::from_utf8(&byte_buffer) {
                         Ok(s) => {
-                            // Debug: Log raw string content (truncated for large chunks)
-                            if s.len() > 1000 {
-                                debug!(
-                                    "Raw SSE string content (first 500 chars): {:?}...",
-                                    &s[..500]
-                                );
-                            } else {
-                                debug!("Raw SSE string content: {:?}", s);
-                            }
-                            s
+                            // Successfully converted entire buffer, clear it and use the string
+                            let result = s.to_string();
+                            byte_buffer.clear();
+                            result
                         }
                         Err(e) => {
-                            error!("Invalid UTF-8 in stream chunk: {}", e);
-                            let _ = tx
-                                .send(Err(anyhow!("Invalid UTF-8 in stream chunk: {}", e)))
-                                .await;
-                            return accumulated_usage;
+                            // Check if this is an incomplete sequence at the end
+                            let valid_up_to = e.valid_up_to();
+                            if valid_up_to > 0 {
+                                // We have some valid UTF-8, extract it and keep the rest for next iteration
+                                let valid_bytes = byte_buffer.drain(..valid_up_to).collect::<Vec<_>>();
+                                std::str::from_utf8(&valid_bytes).unwrap().to_string()
+                            } else {
+                                // No valid UTF-8 at all, skip this chunk and continue
+                                continue;
+                            }
                         }
                     };
 
+                    // Debug: Log raw string content (truncated for large chunks)
+                    if chunk_str.len() > 1000 {
+                        debug!(
+                            "Raw SSE string content (first 500 chars): {:?}...",
+                            &chunk_str[..500]
+                        );
+                    } else {
+                        debug!("Raw SSE string content: {:?}", chunk_str);
+                    }
+
-                    buffer.push_str(chunk_str);
+                    buffer.push_str(&chunk_str);
 
                     // Process complete lines, but handle incomplete data: lines specially
                     while let Some(line_end) = buffer.find('\n') {
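Downstream of the UTF-8 fix, the Databricks path still splits the decoded text into complete SSE lines, carrying an unterminated final line over to the next chunk. A minimal sketch of that line-buffering step (illustrative only; G3's real parser also handles tool-call deltas and the `incomplete_data_line` case):

```rust
/// Pull complete lines out of an SSE text buffer and return the payloads
/// of "data: " lines; an unterminated trailing line stays in the buffer.
fn drain_sse_lines(buffer: &mut String) -> Vec<String> {
    let mut payloads = Vec::new();
    while let Some(line_end) = buffer.find('\n') {
        // Remove the line (including its newline) from the front of the buffer.
        let line: String = buffer.drain(..=line_end).collect();
        let line = line.trim_end();
        if let Some(data) = line.strip_prefix("data: ") {
            payloads.push(data.to_string());
        }
    }
    payloads
}

fn main() {
    let mut buf = String::new();
    buf.push_str("data: {\"a\":1}\ndata: {\"b\"");
    assert_eq!(drain_sse_lines(&mut buf), vec!["{\"a\":1}"]);
    assert_eq!(buf, "data: {\"b\""); // incomplete line carried over
    buf.push_str(":2}\n");
    assert_eq!(drain_sse_lines(&mut buf), vec!["{\"b\":2}"]);
    println!("ok");
}
```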