
G3 General Purpose AI Agent - Design Document

Overview

G3 is a code-first AI agent that completes tasks by writing and executing code. Rather than only giving advice, G3 solves problems by generating executable scripts in whichever language best fits the task.

Core Principles

  1. Code-First Philosophy: Always try to solve problems with executable code
  2. Multi-Language Support: Generate scripts in Python, Bash, JavaScript, Rust, etc.
  3. Unix Philosophy: Small, focused tools that do one thing well
  4. Modularity: Clear separation of concerns
  5. Composability: Components can be combined in different ways
  6. Performance: Minimal overhead from task request to script execution

Architecture

High-Level Components

```
┌─────────────────┐    ┌─────────────────┐    ┌─────────────────┐
│   CLI Module    │    │  Core Engine    │    │ LLM Providers   │
│                 │    │                 │    │                 │
│ - Task commands │◄──►│ - Task          │◄──►│ - OpenAI        │
│ - Interactive   │    │   interpretation│    │ - Anthropic     │
│   mode          │    │ - Code          │    │ - Embedded      │
│ - Code exec     │    │   generation    │    │   (llama.cpp)   │
│   approval      │    │ - Script        │    │ - Custom APIs   │
│                 │    │   execution     │    │                 │
└─────────────────┘    └─────────────────┘    └─────────────────┘
         │                       │                       │
         └───────────────────────┼───────────────────────┘
                                 │
                    ┌─────────────────┐
                    │   Execution     │
                    │   Engine        │
                    │                 │
                    │ - Python        │
                    │ - Bash/Shell    │
                    │ - JavaScript    │
                    │ - Rust          │
                    │ - Sandboxing    │
                    └─────────────────┘
```

Module Breakdown

1. CLI Module (g3-cli)

  • Responsibility: User interface and task interpretation
  • New Features:
    • Progress indicators for script execution
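
A progress indicator for script execution could be as simple as a textual bar. A minimal sketch (the function name and format are illustrative, not the actual g3-cli API):

```python
def format_progress(done: int, total: int, width: int = 20) -> str:
    """Render a textual progress bar like '[==========          ]  50%'."""
    if total <= 0:
        raise ValueError("total must be positive")
    # Clamp the fraction so out-of-range inputs still render sensibly.
    frac = min(max(done / total, 0.0), 1.0)
    filled = int(frac * width)
    bar = "=" * filled + " " * (width - filled)
    return f"[{bar}] {int(frac * 100):3d}%"
```

The CLI would redraw such a line (e.g. with a carriage return) while a generated script runs.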

2. Core Engine (g3-core)

  • Responsibility: Task interpretation and code generation
  • New Features:
    • Task analysis and decomposition
    • Language selection based on task type
    • Code generation with execution context
    • Script template system
    • Autonomous execution of generated code
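
The script template system can be sketched as placeholder substitution over a per-language skeleton. The template and function names below are assumptions for illustration, not the real g3-core API:

```python
from string import Template

# Hypothetical skeleton for a generated Python script; the real template
# system would carry language-specific headers and execution context.
PY_TEMPLATE = Template(
    "#!/usr/bin/env python3\n"
    '"""Task: $task"""\n'
    "$body\n"
)

def render_script(task: str, body: str) -> str:
    """Fill a script template with the task description and generated code."""
    return PY_TEMPLATE.substitute(task=task, body=body)
```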

3. LLM Providers (g3-providers)

  • Responsibility: LLM communication and model abstraction
  • Supported Providers:
    • OpenAI: GPT-4, GPT-3.5-turbo via API
    • Anthropic: Claude models via API
    • Embedded: Local open-weights models via llama.cpp
  • Enhanced Prompts:
    • Code-first system prompts
    • Language-specific generation instructions
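
The model abstraction implies a single interface that the OpenAI, Anthropic, and embedded backends all implement. A sketch of that shape (interface and class names are assumptions, not the real g3-providers API):

```python
from abc import ABC, abstractmethod

class LlmProvider(ABC):
    """Common interface every backend (OpenAI, Anthropic, embedded) implements."""

    @abstractmethod
    def complete(self, system_prompt: str, user_prompt: str) -> str:
        """Return the model's completion for the given prompts."""

class EchoProvider(LlmProvider):
    """Stand-in provider for tests: echoes the user prompt back."""

    def complete(self, system_prompt: str, user_prompt: str) -> str:
        return f"echo: {user_prompt}"
```

Real providers would wrap an HTTP API or llama.cpp bindings behind the same method, which is what lets the core engine stay provider-agnostic.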

4. Embedded Provider (g3-core/providers/embedded) - NEW

  • Responsibility: Local model inference using llama.cpp
  • Features:
    • GGUF model support (Llama, CodeLlama, Mistral, etc.)
    • GPU acceleration via CUDA/Metal
    • Configurable context length and generation parameters
    • Async-compatible inference without blocking
    • Thread-safe model access
    • Stop sequence detection

5. Execution Engine (g3-execution) - NEW

  • Responsibility: Safe code execution
  • Features:
    • Multi-language script execution
    • Sandboxing and security
    • Resource limits
    • Output capture and formatting
    • Error handling and recovery
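
Multi-language execution can be sketched as a dispatch from language to interpreter plus a subprocess call with a time limit. The interpreter table and function name are illustrative assumptions:

```python
import subprocess

# Hypothetical mapping from language name to interpreter invocation.
# Compiled languages like Rust would need a build step before running.
INTERPRETERS = {
    "python": ["python3"],
    "bash": ["bash"],
    "javascript": ["node"],
}

def run_script(language: str, path: str, timeout: float = 30.0) -> tuple[int, str, str]:
    """Execute a script with its interpreter, capturing output with a time limit."""
    cmd = INTERPRETERS.get(language)
    if cmd is None:
        raise ValueError(f"unsupported language: {language}")
    proc = subprocess.run(cmd + [path], capture_output=True, text=True, timeout=timeout)
    return proc.returncode, proc.stdout, proc.stderr
```

A timeout is only the simplest resource limit; real sandboxing would add memory/CPU caps and filesystem isolation (e.g. containers).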

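The stop sequence detection listed under the embedded provider can be sketched as a scan over the generated text for the earliest occurrence of any stop string. A minimal version (function names are illustrative):

```python
def find_stop(text: str, stop_sequences: list[str]) -> int:
    """Return the index of the earliest stop sequence, or -1 if none occurs."""
    best = -1
    for stop in stop_sequences:
        idx = text.find(stop)
        if idx != -1 and (best == -1 or idx < best):
            best = idx
    return best

def truncate_at_stop(text: str, stop_sequences: list[str]) -> str:
    """Cut generated text at the first stop sequence, if any."""
    idx = find_stop(text, stop_sequences)
    return text if idx == -1 else text[:idx]
```

In streaming inference the same check would run against the tail of the accumulated output after each generated token.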
Task Types and Language Selection

| Task Type       | Preferred Language | Use Cases                               |
|-----------------|--------------------|-----------------------------------------|
| Data Processing | Python             | CSV/JSON analysis, data transformation  |
| File Operations | Bash/Shell         | File manipulation, backups, organization |
| System Admin    | Bash/Shell         | Process management, system monitoring   |
| Text Processing | Python/Bash        | Log analysis, text transformation       |
| Database        | Python/SQL         | Data migration, queries, reporting      |
| Image/Media     | Python             | Image processing, format conversion     |
| Development     | Rust               | Code generation, project setup          |
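
The table above can be encoded as a simple lookup with a sensible default. The mapping mirrors the table; the key names and function are assumptions for illustration:

```python
# Preferred language per task type, mirroring the selection table.
TASK_LANGUAGE = {
    "data_processing": "python",
    "file_operations": "bash",
    "system_admin": "bash",
    "text_processing": "python",
    "database": "python",
    "image_media": "python",
    "development": "rust",
}

def select_language(task_type: str) -> str:
    """Pick the preferred language for a task type, defaulting to Python."""
    return TASK_LANGUAGE.get(task_type, "python")
```

In practice the core engine would also weigh the task description and available interpreters, not just a static table.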

Implementation Plan

Phase 1: Core Refactoring

  1. Update CLI commands for task-oriented interface
  2. Enhance system prompts for code-first approach
  3. Add basic code execution capabilities
  4. Update interactive mode messaging

Phase 2: Enhanced Provider Support

  1. Implement embedded model provider using llama.cpp
  2. Add GGUF model support for local inference
  3. Configure GPU acceleration and performance optimization
  4. Add comprehensive logging and debugging support

Phase 3: Advanced Features (Future)

  1. Model quantization and optimization
  2. Multi-model ensemble support
  3. Advanced code execution sandboxing
  4. Plugin system for custom providers
  5. Web interface for remote access

Provider Comparison

| Feature               | OpenAI           | Anthropic        | Embedded                |
|-----------------------|------------------|------------------|-------------------------|
| Cost                  | Pay per token    | Pay per token    | Free after download     |
| Privacy               | Data sent to API | Data sent to API | Completely local        |
| Performance           | Very fast        | Very fast        | Depends on hardware     |
| Model Quality         | Excellent        | Excellent        | Good (varies by model)  |
| Offline Support       | No               | No               | Yes                     |
| Setup Complexity      | API key only     | API key only     | Model download required |
| Hardware Requirements | None             | None             | 4-16GB RAM, optional GPU |

Configuration Examples

Cloud-First Setup

```toml
[providers]
default_provider = "openai"

[providers.openai]
api_key = "sk-..."
model = "gpt-4"
```

Privacy-First Setup

```toml
[providers]
default_provider = "embedded"

[providers.embedded]
model_path = "~/.cache/g3/models/codellama-7b-instruct.Q4_K_M.gguf"
model_type = "codellama"
gpu_layers = 32
```

Hybrid Setup

```toml
[providers]
default_provider = "embedded"

# Use embedded for most tasks
[providers.embedded]
model_path = "~/.cache/g3/models/codellama-7b-instruct.Q4_K_M.gguf"
model_type = "codellama"
gpu_layers = 32

# Fallback to cloud for complex tasks
[providers.openai]
api_key = "sk-..."
model = "gpt-4"
```
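
The hybrid setup implies fallback logic: use the configured default provider for ordinary tasks, and fall back to the cloud provider when a task is judged too complex for the local model. A sketch of that selection (the function and complexity flag are hypothetical):

```python
def pick_provider(config: dict, task_complexity: str) -> str:
    """Choose a provider name from a parsed config: the configured default
    for ordinary tasks, falling back to OpenAI for complex ones if present."""
    providers = config["providers"]
    if task_complexity == "complex" and "openai" in providers:
        return "openai"
    return providers["default_provider"]
```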