# G3 General Purpose AI Agent - Design Document

## Overview
G3 is a code-first AI agent that completes tasks by writing and executing code. Instead of just giving advice, G3 solves problems by generating executable scripts in the language best suited to each task.
## Core Principles
- Code-First Philosophy: Always try to solve problems with executable code
- Multi-Language Support: Generate scripts in Python, Bash, JavaScript, Rust, etc.
- Unix Philosophy: Small, focused tools that do one thing well
- Modularity: Clear separation of concerns
- Composability: Components can be combined in different ways
- Performance: Fast startup and low-overhead execution
## Architecture

### High-Level Components
```
┌─────────────────┐    ┌─────────────────┐    ┌─────────────────┐
│   CLI Module    │    │   Core Engine   │    │  LLM Providers  │
│                 │    │                 │    │                 │
│ - Task commands │◄──►│ - Task          │◄──►│ - OpenAI        │
│ - Interactive   │    │   interpretation│    │ - Anthropic     │
│   mode          │    │ - Code          │    │ - Embedded      │
│ - Code exec     │    │   generation    │    │   (llama.cpp)   │
│   approval      │    │ - Script        │    │ - Custom APIs   │
│                 │    │   execution     │    │                 │
└─────────────────┘    └─────────────────┘    └─────────────────┘
         │                      │                      │
         └──────────────────────┼──────────────────────┘
                                │
                       ┌─────────────────┐
                       │   Execution     │
                       │   Engine        │
                       │                 │
                       │ - Python        │
                       │ - Bash/Shell    │
                       │ - JavaScript    │
                       │ - Rust          │
                       │ - Sandboxing    │
                       └─────────────────┘
```
### Module Breakdown

#### 1. CLI Module (g3-cli)
- Responsibility: User interface and command handling
- New Features:
- Progress indicators for script execution
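Where a script run can take a while, the CLI can surround it with a live spinner. A minimal sketch, assuming the `indicatif` crate (a plausible choice, not confirmed by this design):

```rust
// Hypothetical progress indicator for script execution.
// Assumes indicatif 0.17; the real g3-cli may use something else entirely.
use std::time::Duration;
use indicatif::ProgressBar;

fn run_with_spinner<F>(script_name: &str, run: F) -> std::io::Result<String>
where
    F: FnOnce() -> std::io::Result<String>,
{
    let spinner = ProgressBar::new_spinner();
    spinner.set_message(format!("Executing {script_name}..."));
    spinner.enable_steady_tick(Duration::from_millis(100)); // keep animating while we block
    let result = run();
    spinner.finish_with_message("Done.");
    result
}
```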
#### 2. Core Engine (g3-core)
- Responsibility: Task interpretation and code generation
- New Features:
- Task analysis and decomposition
- Language selection based on task type
- Code generation with execution context
- Script template system
- Autonomous execution of generated code
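To make the flow concrete, here is a minimal sketch of how these pieces could compose. Every name (`Language`, `GeneratedScript`, `build_prompt`, `generate_script`) is illustrative rather than the actual g3-core API; the `LlmProvider` trait is sketched under the providers module below.

```rust
// Hypothetical g3-core pipeline sketch; names are illustrative only.
#[derive(Debug, Clone, Copy)]
pub enum Language { Python, Bash, JavaScript, Rust }

pub struct GeneratedScript {
    pub language: Language,
    pub source: String,
}

/// Stand-in for the script template system: a code-first, language-specific prompt.
fn build_prompt(task: &str, language: Language) -> String {
    format!(
        "You are a code-first agent. Reply with one runnable {language:?} script \
         and nothing else.\n\nTask: {task}"
    )
}

/// Interpret a task and ask the provider for executable code.
pub async fn generate_script(
    provider: &dyn LlmProvider,
    task: &str,
    language: Language, // chosen by the selection logic shown later
) -> anyhow::Result<GeneratedScript> {
    let prompt = build_prompt(task, language);
    let source = provider.complete(&prompt).await?;
    Ok(GeneratedScript { language, source })
}
```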
#### 3. LLM Providers (g3-providers)
- Responsibility: LLM communication and model abstraction
- Supported Providers:
- OpenAI: GPT-4, GPT-3.5-turbo via API
- Anthropic: Claude models via API
- Embedded: Local open-weights models via llama.cpp
- Enhanced Prompts:
- Code-first system prompts
- Language-specific generation instructions
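One plausible shape for this abstraction is a single async trait that every backend implements. This is a hypothetical sketch (using the `async_trait` crate so the trait stays object-safe), not the actual g3-providers interface:

```rust
// Hypothetical provider abstraction; not the actual g3-providers API.
use async_trait::async_trait;

#[async_trait]
pub trait LlmProvider: Send + Sync {
    /// Stable provider name: "openai", "anthropic", "embedded", ...
    fn name(&self) -> &str;

    /// Send a prompt (already wrapped in the code-first system prompt)
    /// and return the model's completion.
    async fn complete(&self, prompt: &str) -> anyhow::Result<String>;
}
```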
#### 4. Embedded Provider (g3-core/providers/embedded) - NEW
- Responsibility: Local model inference using llama.cpp
- Features:
- GGUF model support (Llama, CodeLlama, Mistral, etc.)
- GPU acceleration via CUDA/Metal
- Configurable context length and generation parameters
- Async-compatible inference without blocking
- Thread-safe model access
- Stop sequence detection
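"Async-compatible inference without blocking" typically means pushing the synchronous, compute-bound llama.cpp call onto a blocking thread. A sketch under that assumption; `LlamaModel` is a stub standing in for the real binding:

```rust
// Hypothetical embedded-provider sketch. `LlamaModel` is a stub standing in
// for a real llama.cpp binding; only the threading pattern is the point.
use std::sync::{Arc, Mutex};

pub struct LlamaModel; // stub
impl LlamaModel {
    pub fn generate(&self, prompt: &str, stop: &[String]) -> anyhow::Result<String> {
        let _ = (prompt, stop); // a real binding would run llama.cpp inference here
        Ok(String::from("<completion>"))
    }
}

pub struct EmbeddedProvider {
    model: Arc<Mutex<LlamaModel>>, // thread-safe access to the single loaded model
}

impl EmbeddedProvider {
    pub async fn complete(&self, prompt: String, stop: Vec<String>) -> anyhow::Result<String> {
        let model = Arc::clone(&self.model);
        // Run the synchronous, compute-bound inference on a blocking thread so
        // the async runtime keeps serving other tasks.
        tokio::task::spawn_blocking(move || {
            let model = model.lock().expect("model mutex poisoned");
            model.generate(&prompt, &stop)
        })
        .await?
    }
}
```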
#### 5. Execution Engine (g3-execution) - NEW
- Responsibility: Safe code execution
- Features:
- Multi-language script execution
- Sandboxing and security
- Resource limits
- Output capture and formatting
- Error handling and recovery
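As an illustration of the execution path, the sketch below writes a generated script to a temporary file, runs the matching interpreter under a wall-clock timeout, and captures stdout/stderr. The `tokio` and `tempfile` usage and the 30-second limit are assumptions; real sandboxing needs far stronger isolation:

```rust
// Hypothetical execution sketch; real sandboxing (namespaces, seccomp,
// containers) requires much stronger isolation than a timeout.
use std::time::Duration;
use tokio::{process::Command, time::timeout};

pub async fn run_script(language: &str, source: &str) -> anyhow::Result<(String, String)> {
    let interpreter = match language {
        "python" => "python3",
        "bash" => "bash",
        "javascript" => "node",
        other => anyhow::bail!("unsupported language: {other}"),
    };
    // Write the generated code somewhere the interpreter can read it.
    let file = tempfile::NamedTempFile::new()?;
    std::fs::write(file.path(), source)?;

    // Wall-clock limit as a crude resource bound.
    let output = timeout(
        Duration::from_secs(30),
        Command::new(interpreter).arg(file.path()).output(),
    )
    .await
    .map_err(|_| anyhow::anyhow!("script timed out after 30s"))??;

    Ok((
        String::from_utf8_lossy(&output.stdout).into_owned(),
        String::from_utf8_lossy(&output.stderr).into_owned(),
    ))
}
```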
## Task Types and Language Selection
| Task Type | Preferred Language | Use Cases |
|---|---|---|
| Data Processing | Python | CSV/JSON analysis, data transformation |
| File Operations | Bash/Shell | File manipulation, backups, organization |
| System Admin | Bash/Shell | Process management, system monitoring |
| Text Processing | Python/Bash | Log analysis, text transformation |
| Database | Python/SQL | Data migration, queries, reporting |
| Image/Media | Python | Image processing, format conversion |
| Development | Rust | Code generation, project setup |
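In code, this table could collapse to a simple mapping from a classified task type to a preferred language. A hypothetical sketch (the classifier producing `TaskType` is elided, and these names are not the actual g3 API):

```rust
// Hypothetical mapping of the table above.
#[derive(Debug, Clone, Copy)]
pub enum TaskType {
    DataProcessing,
    FileOperations,
    SystemAdmin,
    TextProcessing,
    Database,
    ImageMedia,
    Development,
}

pub fn preferred_language(task: TaskType) -> &'static str {
    match task {
        TaskType::DataProcessing | TaskType::Database | TaskType::ImageMedia => "python",
        TaskType::FileOperations | TaskType::SystemAdmin => "bash",
        TaskType::TextProcessing => "python", // or "bash" for simple stream edits
        TaskType::Development => "rust",
    }
}
```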
## Implementation Plan

### Phase 1: Core Refactoring ✅
- ✅ Update CLI commands for task-oriented interface
- ✅ Enhance system prompts for code-first approach
- ✅ Add basic code execution capabilities
- ✅ Update interactive mode messaging
### Phase 2: Enhanced Provider Support ✅
- ✅ Implement embedded model provider using llama.cpp
- ✅ Add GGUF model support for local inference
- ✅ Configure GPU acceleration and performance optimization
- ✅ Add comprehensive logging and debugging support
### Phase 3: Advanced Features (Future)
- Model quantization and optimization
- Multi-model ensemble support
- Advanced code execution sandboxing
- Plugin system for custom providers
- Web interface for remote access
## Provider Comparison
| Feature | OpenAI | Anthropic | Embedded |
|---|---|---|---|
| Cost | Pay per token | Pay per token | Free after download |
| Privacy | Data sent to API | Data sent to API | Completely local |
| Performance | Very fast | Very fast | Depends on hardware |
| Model Quality | Excellent | Excellent | Good (varies by model) |
| Offline Support | No | No | Yes |
| Setup Complexity | API key only | API key only | Model download required |
| Hardware Requirements | None | None | 4-16 GB RAM, optional GPU |
## Configuration Examples

### Cloud-First Setup

```toml
[providers]
default_provider = "openai"

[providers.openai]
api_key = "sk-..."
model = "gpt-4"
```
### Privacy-First Setup

```toml
[providers]
default_provider = "embedded"

[providers.embedded]
model_path = "~/.cache/g3/models/codellama-7b-instruct.Q4_K_M.gguf"
model_type = "codellama"
gpu_layers = 32
```
### Hybrid Setup

```toml
[providers]
default_provider = "embedded"

# Use embedded for most tasks
[providers.embedded]
model_path = "~/.cache/g3/models/codellama-7b-instruct.Q4_K_M.gguf"
model_type = "codellama"
gpu_layers = 32

# Fall back to the cloud for complex tasks
[providers.openai]
api_key = "sk-..."
model = "gpt-4"
```
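This hybrid configuration implies routing logic that prefers the local model and falls back to the cloud on failure. A hedged sketch reusing the hypothetical `LlmProvider` trait from earlier (not the actual g3 routing code):

```rust
// Hypothetical fallback routing for the hybrid setup.
pub async fn complete_with_fallback(
    embedded: &dyn LlmProvider,
    cloud: &dyn LlmProvider,
    prompt: &str,
) -> anyhow::Result<String> {
    match embedded.complete(prompt).await {
        Ok(text) => Ok(text),
        Err(err) => {
            // Local inference failed (model missing, OOM, ...): retry in the cloud.
            eprintln!("embedded provider failed ({err}); retrying via {}", cloud.name());
            cloud.complete(prompt).await
        }
    }
}
```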