# G3 General Purpose AI Agent - Design Document
## Overview
G3 is a **code-first AI agent** that helps you complete tasks by writing and executing code or scripts. Instead of just giving advice, G3 solves problems by generating executable code in the appropriate language.
## Core Principles
1. **Code-First Philosophy**: Always try to solve problems with executable code
2. **Multi-Language Support**: Generate scripts in Python, Bash, JavaScript, Rust, etc.
3. **Unix Philosophy**: Small, focused tools that do one thing well
4. **Modularity**: Clear separation of concerns
5. **Composability**: Components can be combined in different ways
6. **Performance**: Blazing fast execution
## Architecture
### High-Level Components
```
┌─────────────────┐    ┌─────────────────┐    ┌─────────────────┐
│   CLI Module    │    │   Core Engine   │    │  LLM Providers  │
│                 │    │                 │    │                 │
│ - Task commands │◄──►│ - Task          │◄──►│ - OpenAI        │
│ - Interactive   │    │   interpretation│    │ - Anthropic     │
│   mode          │    │ - Code          │    │ - Embedded      │
│ - Code exec     │    │   generation    │    │   (llama.cpp)   │
│   approval      │    │ - Script        │    │ - Custom APIs   │
│                 │    │   execution     │    │                 │
└─────────────────┘    └─────────────────┘    └─────────────────┘
         │                      │                      │
         └──────────────────────┼──────────────────────┘
                                │
                       ┌─────────────────┐
                       │    Execution    │
                       │     Engine      │
                       │                 │
                       │ - Python        │
                       │ - Bash/Shell    │
                       │ - JavaScript    │
                       │ - Rust          │
                       │ - Sandboxing    │
                       └─────────────────┘
```
### Module Breakdown
#### 1. CLI Module (`g3-cli`)
- **Responsibility**: User interface and command handling
- **New Features**:
  - Progress indicators for script execution
#### 2. Core Engine (`g3-core`)
- **Responsibility**: Task interpretation and code generation
- **New Features**:
  - Task analysis and decomposition
  - Language selection based on task type
  - Code generation with execution context
  - Script template system
  - Autonomous execution of generated code
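The script template system can be sketched as follows. This is a minimal illustration, not the real `g3-core` API: `ScriptLang` and `render_script` are hypothetical names, and a real implementation would fill the body from LLM output rather than a literal string.

```rust
// Hypothetical sketch: wrap LLM-generated code in a runnable script
// template for the chosen language, recording the task as a comment.

#[derive(Debug, Clone, Copy, PartialEq)]
enum ScriptLang {
    Python,
    Bash,
}

/// Wrap a generated code body in a minimal executable template.
fn render_script(lang: ScriptLang, task: &str, body: &str) -> String {
    match lang {
        ScriptLang::Python => {
            format!("#!/usr/bin/env python3\n# task: {task}\n{body}\n")
        }
        ScriptLang::Bash => {
            format!("#!/usr/bin/env bash\nset -euo pipefail\n# task: {task}\n{body}\n")
        }
    }
}

fn main() {
    let script = render_script(ScriptLang::Python, "count lines in a file", "print(42)");
    print!("{script}");
}
```

Keeping templates as plain string assembly keeps the core engine independent of any one interpreter; the execution engine decides how the rendered script is actually run.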
#### 3. LLM Providers (`g3-providers`)
- **Responsibility**: LLM communication and model abstraction
- **Supported Providers**:
  - **OpenAI**: GPT-4, GPT-3.5-turbo via API
  - **Anthropic**: Claude models via API
  - **Embedded**: Local open-weights models via llama.cpp
- **Enhanced Prompts**:
  - Code-first system prompts
  - Language-specific generation instructions
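The model abstraction could look like a single trait that all backends implement, so OpenAI, Anthropic, and the embedded backend are interchangeable at the call site. The trait and type names below are illustrative, not the actual `g3-providers` interface, and the embedded implementation is a stub that echoes instead of running inference.

```rust
// Sketch of a provider abstraction: one trait, many backends.

trait LlmProvider {
    fn name(&self) -> &str;
    fn complete(&self, system_prompt: &str, user_prompt: &str) -> Result<String, String>;
}

/// Stand-in for the llama.cpp-backed embedded provider; a real
/// implementation would run local inference over a GGUF model.
struct EmbeddedProvider {
    model_path: String,
}

impl LlmProvider for EmbeddedProvider {
    fn name(&self) -> &str {
        "embedded"
    }
    fn complete(&self, system_prompt: &str, user_prompt: &str) -> Result<String, String> {
        Ok(format!("[{}] {} :: {}", self.model_path, system_prompt, user_prompt))
    }
}

fn main() {
    // Callers hold a trait object and never know which backend is active.
    let provider: Box<dyn LlmProvider> = Box::new(EmbeddedProvider {
        model_path: "model.gguf".into(),
    });
    println!("{}", provider.complete("code-first", "list files").unwrap());
}
```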
#### 4. Embedded Provider (`g3-core/providers/embedded`) - NEW
- **Responsibility**: Local model inference using llama.cpp
- **Features**:
  - GGUF model support (Llama, CodeLlama, Mistral, etc.)
  - GPU acceleration via CUDA/Metal
  - Configurable context length and generation parameters
  - Async-compatible inference without blocking
  - Thread-safe model access
  - Stop sequence detection
#### 5. Execution Engine (`g3-execution`) - NEW
- **Responsibility**: Safe code execution
- **Features**:
  - Multi-language script execution
  - Sandboxing and security
  - Resource limits
  - Output capture and formatting
  - Error handling and recovery
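The core of the execution engine reduces to dispatching a script to the right interpreter and capturing its output, which can be sketched with the standard library alone. Sandboxing and resource limits are omitted here; the function name and the `ExecResult` type are illustrative, not the real `g3-execution` API.

```rust
use std::process::Command;

/// Captured result of one script run.
struct ExecResult {
    status: i32,
    stdout: String,
    stderr: String,
}

/// Dispatch a script body to an interpreter and capture its output.
/// A real engine would add sandboxing, timeouts, and resource limits.
fn run_script(language: &str, script: &str) -> std::io::Result<ExecResult> {
    let (cmd, flag) = match language {
        "python" => ("python3", "-c"),
        _ => ("sh", "-c"), // default to a POSIX shell
    };
    let out = Command::new(cmd).arg(flag).arg(script).output()?;
    Ok(ExecResult {
        status: out.status.code().unwrap_or(-1),
        stdout: String::from_utf8_lossy(&out.stdout).into_owned(),
        stderr: String::from_utf8_lossy(&out.stderr).into_owned(),
    })
}

fn main() {
    let r = run_script("shell", "echo hello").unwrap();
    print!("{}", r.stdout);
}
```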
### Task Types and Language Selection
| Task Type | Preferred Language | Use Cases |
|-----------|-------------------|-----------|
| Data Processing | Python | CSV/JSON analysis, data transformation |
| File Operations | Bash/Shell | File manipulation, backups, organization |
| System Admin | Bash/Shell | Process management, system monitoring |
| Text Processing | Python/Bash | Log analysis, text transformation |
| Database | Python/SQL | Data migration, queries, reporting |
| Image/Media | Python | Image processing, format conversion |
| Development | Rust | Code generation, project setup |
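The table above can be encoded as a simple lookup. The function below is an illustrative sketch; the default branch for unrecognized task types is an assumption, not something the table specifies.

```rust
/// Map a task type (as named in the design table) to its preferred language.
fn preferred_language(task_type: &str) -> &'static str {
    match task_type {
        "data processing" | "image/media" => "python",
        "file operations" | "system admin" => "bash",
        "text processing" => "python or bash",
        "database" => "python + sql",
        "development" => "rust",
        _ => "python", // assumed default; not specified by the table
    }
}

fn main() {
    println!("{}", preferred_language("file operations"));
}
```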
## Implementation Plan
### Phase 1: Core Refactoring ✅
1. ✅ Update CLI commands for task-oriented interface
2. ✅ Enhance system prompts for code-first approach
3. ✅ Add basic code execution capabilities
4. ✅ Update interactive mode messaging
### Phase 2: Enhanced Provider Support ✅
1. ✅ Implement embedded model provider using llama.cpp
2. ✅ Add GGUF model support for local inference
3. ✅ Configure GPU acceleration and performance optimization
4. ✅ Add comprehensive logging and debugging support
### Phase 3: Advanced Features (Future)
1. Model quantization and optimization
2. Multi-model ensemble support
3. Advanced code execution sandboxing
4. Plugin system for custom providers
5. Web interface for remote access
## Provider Comparison
| Feature | OpenAI | Anthropic | Embedded |
|---------|--------|-----------|----------|
| **Cost** | Pay per token | Pay per token | Free after download |
| **Privacy** | Data sent to API | Data sent to API | Completely local |
| **Performance** | Very fast | Very fast | Depends on hardware |
| **Model Quality** | Excellent | Excellent | Good (varies by model) |
| **Offline Support** | No | No | Yes |
| **Setup Complexity** | API key only | API key only | Model download required |
| **Hardware Requirements** | None | None | 4-16GB RAM, optional GPU |
## Configuration Examples
### Cloud-First Setup
```toml
[providers]
default_provider = "openai"
[providers.openai]
api_key = "sk-..."
model = "gpt-4"
```
### Privacy-First Setup
```toml
[providers]
default_provider = "embedded"
[providers.embedded]
model_path = "~/.cache/g3/models/codellama-7b-instruct.Q4_K_M.gguf"
model_type = "codellama"
gpu_layers = 32
```
### Hybrid Setup
```toml
[providers]
default_provider = "embedded"
# Use embedded for most tasks
[providers.embedded]
model_path = "~/.cache/g3/models/codellama-7b-instruct.Q4_K_M.gguf"
model_type = "codellama"
gpu_layers = 32
# Fallback to cloud for complex tasks
[providers.openai]
api_key = "sk-..."
model = "gpt-4"
```
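The hybrid setup's fallback rule could be implemented as a small selection step before dispatch. The sketch below assumes a 0–10 task-complexity score and a cutoff of 7; both the score and the threshold are illustrative, not part of the configuration format above.

```rust
/// Prefer the embedded provider, falling back to the cloud provider
/// when it is unavailable or the task looks too complex for the local model.
fn pick_provider(task_complexity: u8, embedded_available: bool) -> &'static str {
    const LOCAL_LIMIT: u8 = 7; // assumed cutoff on a 0-10 scale
    if embedded_available && task_complexity <= LOCAL_LIMIT {
        "embedded"
    } else {
        "openai"
    }
}

fn main() {
    println!("{}", pick_provider(3, true));
}
```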