Make research skill self-contained without external scripts

- Rewrite SKILL.md with inline instructions to spawn g3 --agent scout directly
- Extend read_file to handle embedded skill paths (<embedded:name>/SKILL.md)
- Remove scripts field from EmbeddedSkill struct (no longer needed)
- Delete extraction.rs module (was only for script extraction)
- Delete g3-research bash script
- Remove obsolete Async Research Tool section from workspace memory

Skills are now fully portable - they work when g3 is installed as a binary without access to source files. Agents can read embedded skill content via read_file with the special <embedded:...> path syntax.
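
As a rough illustration of the `<embedded:name>/SKILL.md` syntax described above, the sketch below splits such a path into a skill name and a file name. It is written in bash purely for illustration: the function name `parse_embedded_path` is hypothetical, and the actual handling lives in read_file, whose implementation is not shown in this message.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of g3): split "<embedded:NAME>/FILE" into
# its two parts. Prints "NAME FILE" on success, returns 1 otherwise.
parse_embedded_path() {
    local path="$1"
    local prefix="<embedded:"
    local sep=">/"

    # Reject anything that does not start with the marker.
    [[ "$path" == "$prefix"* ]] || return 1

    local rest="${path#"$prefix"}"      # NAME>/FILE
    # Require the ">/" separator between the name and the file.
    [[ "$rest" == *"$sep"* ]] || return 1

    local name="${rest%%"$sep"*}"       # text before the first ">/"
    local file="${rest#*"$sep"}"        # text after the first ">/"

    # Both parts must be non-empty.
    [[ -n "$name" && -n "$file" ]] || return 1
    printf '%s %s\n' "$name" "$file"
}
```

Under these assumptions, `parse_embedded_path "<embedded:research>/SKILL.md"` prints `research SKILL.md`, while an ordinary filesystem path such as `skills/research/SKILL.md` is rejected, so a caller could fall through to a normal file read.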
2026-02-05 14:22:17 +11:00
parent 788debb93a
commit cff32bf0ba
7 changed files with 130 additions and 710 deletions
--- a/skills/research/SKILL.md
+++ b/skills/research/SKILL.md
@@ -5,115 +5,114 @@ license: Apache-2.0
 compatibility: Requires g3 binary in PATH. WebDriver (Safari or Chrome) recommended for best results.
 metadata:
  author: g3
-  version: "1.0"
+  version: "2.0"
 ---

 # Research Skill

-Perform asynchronous web research without blocking your current work. Research runs in the background and saves results to disk for you to read when ready.
+Perform asynchronous web research without blocking your current work. Research runs in the background and results are saved to disk.

 ## Quick Start

 ```bash
-# Start research (ALWAYS use background_process, never blocking shell)
-background_process("research-<topic>", ".g3/bin/g3-research 'Your research question here'")
+# 1. Create research directory and status file
+RESEARCH_ID="research_$(date +%s)_$(head -c 3 /dev/urandom | xxd -p)"
+mkdir -p ".g3/research/$RESEARCH_ID"
+echo '{"id":"'$RESEARCH_ID'","status":"running","query":"YOUR QUERY"}' > ".g3/research/$RESEARCH_ID/status.json"

-# Check status
-shell(".g3/bin/g3-research --status <research-id>")
-# Or list all:
-shell(".g3/bin/g3-research --list")
+# 2. Start research in background
+background_process("research-topic", "g3 --agent scout --new-session --quiet 'Your research question' > .g3/research/$RESEARCH_ID/report.md 2>&1 && sed -i '' 's/running/complete/' .g3/research/$RESEARCH_ID/status.json || sed -i '' 's/running/failed/' .g3/research/$RESEARCH_ID/status.json")

-# Read the report when complete
-read_file(".g3/research/<research-id>/report.md")
+# 3. Check status
+cat .g3/research/$RESEARCH_ID/status.json
+
+# 4. Read report when complete
+read_file(".g3/research/$RESEARCH_ID/report.md")
 ```

-## How It Works
+## Step-by-Step Instructions

-1. **Start research** - The `g3-research` script spawns a scout agent that performs web research
-2. **Background execution** - Research runs asynchronously; you can continue other work
-3. **Filesystem handoff** - Results are written to `.g3/research/<id>/` with machine-readable status
-4. **Read when ready** - Use `read_file` to load the report into context only when needed
+### 1. Generate a Unique Research ID
+
+Use shell to create a unique ID and directory:
+
+```bash
+shell("RESEARCH_ID=\"research_$(date +%s)_$(head -c 3 /dev/urandom | xxd -p)\" && mkdir -p \".g3/research/$RESEARCH_ID\" && echo $RESEARCH_ID")
+```
+
+Save the returned ID for later use.
+
+### 2. Write Initial Status File
+
+```bash
+shell("echo '{\"id\":\"<RESEARCH_ID>\",\"status\":\"running\",\"query\":\"<YOUR_QUERY>\",\"started_at\":\"'$(date -u +%Y-%m-%dT%H:%M:%SZ)'\"}' > .g3/research/<RESEARCH_ID>/status.json")
+```
+
+### 3. Start the Scout Agent
+
+Use `background_process` to run the scout agent (NEVER use blocking `shell`):
+
+```bash
+background_process("research-<topic>", "g3 --agent scout --new-session --quiet '<Your detailed research question>' > .g3/research/<RESEARCH_ID>/report.md 2>&1; if [ $? -eq 0 ]; then sed -i '' 's/running/complete/' .g3/research/<RESEARCH_ID>/status.json; else sed -i '' 's/running/failed/' .g3/research/<RESEARCH_ID>/status.json; fi")
+```
+
+**Important flags:**
+- `--agent scout` - Uses the scout agent optimized for web research
+- `--new-session` - Starts a fresh session
+- `--quiet` - Reduces UI noise in output
+
+### 4. Check Research Status
+
+```bash
+shell("cat .g3/research/<RESEARCH_ID>/status.json")
+```
+
+Status values:
+- `running` - Research in progress
+- `complete` - Report ready to read  
+- `failed` - Error occurred
+
+### 5. Read the Report
+
+Once status is `complete`:
+
+```bash
+read_file(".g3/research/<RESEARCH_ID>/report.md")
+```

 ## Directory Structure

 ```
 .g3/research/
-├── research_1738700000_a1b2c3/
-│   ├── status.json      # Machine-readable status
-│   └── report.md        # The research brief (when complete)
-└── research_1738700100_d4e5f6/
-    ├── status.json
-    └── report.md
+└── research_1738700000_a1b2c3/
+    ├── status.json      # Machine-readable status
+    └── report.md        # The research brief (when complete)
 ```

-## status.json Schema
-
-```json
-{
-  "id": "research_1738700000_a1b2c3",
-  "query": "What are the best Rust async runtimes?",
-  "status": "complete",
-  "started_at": "2026-02-04T12:00:00Z",
-  "completed_at": "2026-02-04T12:01:30Z",
-  "report_path": ".g3/research/research_1738700000_a1b2c3/report.md",
-  "error": null
-}
-```
-
-**Status values:**
- `running` - Research in progress
- `complete` - Report ready to read
- `failed` - Error occurred (check `error` field)
-
-## Commands
-
-### Start Research
+## Example: Complete Workflow

 ```bash
-.g3/bin/g3-research "<query>"
-```
+# Step 1: Create research task
+shell("RESEARCH_ID=\"research_$(date +%s)_$(head -c 3 /dev/urandom | xxd -p)\" && mkdir -p \".g3/research/$RESEARCH_ID\" && echo '{\"id\":\"'$RESEARCH_ID'\",\"status\":\"running\",\"query\":\"Rust async runtimes comparison\"}' > \".g3/research/$RESEARCH_ID/status.json\" && echo $RESEARCH_ID")
+# Returns: research_1738700000_a1b2c3

-Outputs the research ID and path on success. **Always run via `background_process`**, not `shell`.
+# Step 2: Start scout in background  
+background_process("research-rust-async", "g3 --agent scout --new-session --quiet 'Compare Tokio vs async-std vs smol for Rust async runtimes. Include performance, ecosystem, and ease of use.' > .g3/research/research_1738700000_a1b2c3/report.md 2>&1; [ $? -eq 0 ] && sed -i '' 's/running/complete/' .g3/research/research_1738700000_a1b2c3/status.json || sed -i '' 's/running/failed/' .g3/research/research_1738700000_a1b2c3/status.json")

-### Check Status
-
-```bash
-# Check specific research
-.g3/bin/g3-research --status <research-id>
-
-# List all research tasks
-.g3/bin/g3-research --list
-```
-
-Outputs JSON for machine parsing.
-
-### Read Report
-
-Once status is `complete`, read the report:
-
-```bash
-read_file(".g3/research/<research-id>/report.md")
-```
-
-**Tip:** If the report is large, use partial reads:
-```bash
-read_file(".g3/research/<id>/report.md", start=0, end=2000)
-```
-
-## Example Workflow
-
-```
-# 1. Start research on async runtimes
-background_process("research-async", ".g3/bin/g3-research 'Compare Tokio vs async-std vs smol for Rust async runtimes'")
-
-# 2. Continue with other work while research runs...
+# Step 3: Continue other work...
 shell("cargo check")

-# 3. Check if research is done
-shell(".g3/bin/g3-research --list")
+# Step 4: Check if done
+shell("cat .g3/research/research_1738700000_a1b2c3/status.json")

-# 4. Read the report
-read_file(".g3/research/research_1738700000_abc123/report.md")
+# Step 5: Read report
+read_file(".g3/research/research_1738700000_a1b2c3/report.md")
+```
+
+## Listing All Research Tasks
+
+```bash
+shell("for f in .g3/research/*/status.json; do cat \"$f\" 2>/dev/null; echo; done")
 ```

 ## Best Practices
@@ -121,7 +120,7 @@ read_file(".g3/research/research_1738700000_abc123/report.md")
 1. **Always use `background_process`** - Never run research with blocking `shell`
 2. **Be specific** - Narrow queries get better results faster
 3. **Read selectively** - Only load reports into context when you need them
-4. **Check status first** - Don't try to read reports that aren't complete
+4. **Check status first** - Don't try to read reports that aren't complete yet

 ## Troubleshooting

@@ -129,16 +128,11 @@ read_file(".g3/research/research_1738700000_abc123/report.md")
 - Try a more specific query
 - Complex topics may take 1-2 minutes

-### WebDriver not available
+### WebDriver not available  
 - Research will still work but may have limited web access
- Install Safari WebDriver or Chrome for best results
+- The scout agent will fall back to shell-based methods

 ### Report is empty or failed
- Check `status.json` for error details
+- Check status.json for the status
+- Look at the report.md file for any error output
 - The query may be too broad or the topic too obscure
-
-## Notes
-
- Research results accumulate in `.g3/research/` - they are not auto-cleaned
- Each research task gets a unique ID based on timestamp
- Multiple concurrent research tasks are supported
--- a/skills/research/g3-research
+++ b/skills/research/g3-research
@@ -1,338 +0,0 @@
-#!/bin/bash
-#
-# g3-research - Perform web research via scout agent with filesystem handoff
-#
-# Usage:
-#   g3-research "<query>"           Start new research
-#   g3-research --status <id>       Check status of specific research
-#   g3-research --list              List all research tasks
-#   g3-research --help              Show this help
-#
-# Research results are stored in .g3/research/<id>/
-#   - status.json: Machine-readable status
-#   - report.md: The research brief (when complete)
-
-set -euo pipefail
-
-# Configuration
-RESEARCH_DIR=".g3/research"
-SCOUT_AGENT="scout"
-
-# Report markers (must match scout agent output)
-REPORT_START_MARKER="---SCOUT_REPORT_START---"
-REPORT_END_MARKER="---SCOUT_REPORT_END---"
-
-#######################################
-# Generate a unique research ID
-#######################################
-generate_id() {
-    local timestamp
-    local random_suffix
-    timestamp=$(date +%s)
-    random_suffix=$(head -c 6 /dev/urandom | xxd -p | head -c 6)
-    echo "research_${timestamp}_${random_suffix}"
-}
-
-#######################################
-# Get current ISO 8601 timestamp
-#######################################
-get_timestamp() {
-    date -u +"%Y-%m-%dT%H:%M:%SZ"
-}
-
-#######################################
-# Write status.json file
-# Arguments:
-#   $1 - research directory
-#   $2 - id
-#   $3 - query
-#   $4 - status (running|complete|failed)
-#   $5 - started_at
-#   $6 - completed_at (optional, use "null" for running)
-#   $7 - error (optional, use "null" for success)
-#######################################
-write_status() {
-    local dir="$1"
-    local id="$2"
-    local query="$3"
-    local status="$4"
-    local started_at="$5"
-    local completed_at="$6"
-    local error="$7"
-    
-    # Escape query for JSON (handle quotes and newlines)
-    local escaped_query
-    escaped_query=$(echo -n "$query" | sed 's/\\/\\\\/g; s/"/\\"/g; s/\n/\\n/g')
-    
-    # Format completed_at and error as JSON values
-    local completed_json
-    local error_json
-    if [[ "$completed_at" == "null" ]]; then
-        completed_json="null"
-    else
-        completed_json="\"$completed_at\""
-    fi
-    if [[ "$error" == "null" ]]; then
-        error_json="null"
-    else
-        # Escape error message for JSON
-        local escaped_error
-        escaped_error=$(echo -n "$error" | sed 's/\\/\\\\/g; s/"/\\"/g; s/\n/\\n/g' | head -c 1000)
-        error_json="\"$escaped_error\""
-    fi
-    
-    cat > "${dir}/status.json" << EOF
-{
-  "id": "${id}",
-  "query": "${escaped_query}",
-  "status": "${status}",
-  "started_at": "${started_at}",
-  "completed_at": ${completed_json},
-  "report_path": "${dir}/report.md",
-  "error": ${error_json}
-}
-EOF
-}
-
-#######################################
-# Extract report from scout output
-# Arguments:
-#   $1 - scout output file
-# Returns:
-#   Report content between markers, or empty if not found
-#######################################
-strip_ansi() {
-    # Comprehensive ANSI escape sequence stripping
-    perl -pe 's/\e\[[0-9;]*[a-zA-Z]//g; s/\e\][^\a]*\a//g; s/\e[()][AB012]//g'
-}
-
-extract_report() {
-    local output_file="$1"
-    local report
-    
-    # Use sed to extract content between markers
-    report=$(sed -n "/${REPORT_START_MARKER}/,/${REPORT_END_MARKER}/p" "$output_file" | \
-        sed "1d;\$d" | \
-        strip_ansi)  # Remove markers and strip ANSI codes
-    
-    if [[ -n "$report" ]]; then
-        echo "$report"
-        return 0
-    fi
-    
-    # Fallback: if no markers found, try to extract useful content from raw output
-    # Strip ANSI escape codes and g3 UI elements
-    report=$(cat "$output_file" | \
-        strip_ansi | \
-        grep -v '^🆕 Starting new session' | \
-        grep -v '^>> agent mode' | \
-        grep -v '^\[38;' | \
-        grep -v '^-> ~' | \
-        grep -v '^ *✓' | \
-        grep -v '^📝 Auto-memory:' | \
-        grep -v 'Auto-memory:' | \
-        grep -v '^$' | \
-        sed '/^[[:space:]]*$/d' | \
-        head -500)
-    
-    if [[ -n "$report" ]]; then
-        echo "$report"
-        return 0
-    fi
-}
-
-#######################################
-# Run research
-# Arguments:
-#   $1 - query
-#######################################
-run_research() {
-    local query="$1"
-    local id
-    local research_dir
-    local started_at
-    local output_file
-    local exit_code
-    
-    # Generate unique ID and create directory
-    id=$(generate_id)
-    research_dir="${RESEARCH_DIR}/${id}"
-    mkdir -p "$research_dir"
-    
-    started_at=$(get_timestamp)
-    output_file="${research_dir}/scout_output.txt"
-    
-    # Write initial status
-    write_status "$research_dir" "$id" "$query" "running" "$started_at" "null" "null"
-    
-    # Output the research ID immediately so caller knows where to look
-    echo "{\"id\": \"${id}\", \"status\": \"running\", \"path\": \"${research_dir}\"}"
-    
-    # Find g3 binary
-    local g3_bin
-    if command -v g3 &> /dev/null; then
-        g3_bin="g3"
-    elif [[ -x "./target/release/g3" ]]; then
-        g3_bin="./target/release/g3"
-    elif [[ -x "./target/debug/g3" ]]; then
-        g3_bin="./target/debug/g3"
-    else
-        write_status "$research_dir" "$id" "$query" "failed" "$started_at" "$(get_timestamp)" "g3 binary not found in PATH or target/"
-        echo "{\"id\": \"${id}\", \"status\": \"failed\", \"error\": \"g3 binary not found\"}" >&2
-        exit 1
-    fi
-    
-    # Run scout agent and capture output
-    set +e
-    "$g3_bin" --agent "$SCOUT_AGENT" --new-session --quiet "$query" > "$output_file" 2>&1
-    exit_code=$?
-    set -e
-    
-    local completed_at
-    completed_at=$(get_timestamp)
-    
-    if [[ $exit_code -ne 0 ]]; then
-        # Scout failed
-        local error_msg
-        error_msg=$(tail -20 "$output_file" 2>/dev/null || echo "Unknown error")
-        write_status "$research_dir" "$id" "$query" "failed" "$started_at" "$completed_at" "$error_msg"
-        echo "{\"id\": \"${id}\", \"status\": \"failed\", \"error\": \"Scout agent exited with code ${exit_code}\"}" >&2
-        exit 1
-    fi
-    
-    # Extract report from output
-    local report
-    report=$(extract_report "$output_file")
-    
-    if [[ -z "$report" ]]; then
-        write_status "$research_dir" "$id" "$query" "failed" "$started_at" "$completed_at" "Scout did not produce a valid report (missing markers)"
-        echo "{\"id\": \"${id}\", \"status\": \"failed\", \"error\": \"No report markers found in output\"}" >&2
-        exit 1
-    fi
-    
-    # Write report to file
-    echo "$report" > "${research_dir}/report.md"
-    
-    # Update status to complete
-    write_status "$research_dir" "$id" "$query" "complete" "$started_at" "$completed_at" "null"
-    
-    # Clean up scout output (optional - keep for debugging)
-    # rm -f "$output_file"
-    
-    echo "{\"id\": \"${id}\", \"status\": \"complete\", \"report_path\": \"${research_dir}/report.md\"}"
-}
-
-#######################################
-# Check status of a specific research task
-# Arguments:
-#   $1 - research ID
-#######################################
-check_status() {
-    local id="$1"
-    local status_file="${RESEARCH_DIR}/${id}/status.json"
-    
-    if [[ ! -f "$status_file" ]]; then
-        echo "{\"error\": \"Research task not found: ${id}\"}" >&2
-        exit 1
-    fi
-    
-    cat "$status_file"
-}
-
-#######################################
-# List all research tasks
-#######################################
-list_research() {
-    if [[ ! -d "$RESEARCH_DIR" ]]; then
-        echo "[]"
-        return
-    fi
-    
-    local first=true
-    echo "["
-    
-    for status_file in "${RESEARCH_DIR}"/*/status.json; do
-        if [[ ! -f "$status_file" ]]; then
-            continue
-        fi
-        
-        if [[ "$first" == true ]]; then
-            first=false
-        else
-            echo ","
-        fi
-        
-        cat "$status_file"
-    done
-    
-    echo "]"
-}
-
-#######################################
-# Show help
-#######################################
-show_help() {
-    cat << 'EOF'
-g3-research - Perform web research via scout agent
-
-USAGE:
-    g3-research "<query>"           Start new research
-    g3-research --status <id>       Check status of specific research
-    g3-research --list              List all research tasks
-    g3-research --help              Show this help
-
-EXAMPLES:
-    # Start research (run via background_process)
-    g3-research "What are the best Rust HTTP client libraries?"
-
-    # Check status
-    g3-research --status research_1738700000_a1b2c3
-
-    # List all research
-    g3-research --list
-
-OUTPUT:
-    All commands output JSON for machine parsing.
-    Research results are stored in .g3/research/<id>/
-
-FILES:
-    .g3/research/<id>/status.json   Machine-readable status
-    .g3/research/<id>/report.md     Research brief (when complete)
-EOF
-}
-
-#######################################
-# Main
-#######################################
-main() {
-    if [[ $# -eq 0 ]]; then
-        show_help
-        exit 1
-    fi
-    
-    case "$1" in
-        --help|-h)
-            show_help
-            ;;
-        --status)
-            if [[ $# -lt 2 ]]; then
-                echo "{\"error\": \"Missing research ID\"}" >&2
-                exit 1
-            fi
-            check_status "$2"
-            ;;
-        --list)
-            list_research
-            ;;
-        -*)
-            echo "{\"error\": \"Unknown option: $1\"}" >&2
-            exit 1
-            ;;
-        *)
-            # Treat as query
-            run_research "$1"
-            ;;
-    esac
-}
-
-main "$@"
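
The inline instructions that replace this script flip the status with `sed -i ''`, which is the BSD/macOS form of in-place editing; GNU sed treats the separate empty string as the script rather than as a backup suffix. A minimal sketch of a portable alternative follows, assuming a hypothetical helper named `set_status` and the single-line `status.json` layout written by the new SKILL.md. It writes to a temporary file and renames it, and it matches the `"status"` key specifically so the word "running" inside a query is left alone.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of g3): replace only the "status" value in
# a status.json file, without relying on sed's non-portable -i option.
set_status() {
    local file="$1" new_status="$2"
    local tmp="${file}.tmp.$$"

    # Rewrite the first "status":"..." pair on each line; optional
    # whitespace after the colon is tolerated and normalized away.
    if sed 's/"status":[[:space:]]*"[^"]*"/"status":"'"$new_status"'"/' \
        "$file" > "$tmp"; then
        # Rename over the original so readers never see a partial file.
        mv "$tmp" "$file"
    else
        rm -f "$tmp"
        return 1
    fi
}
```

A background command could then end with something like `set_status .g3/research/<RESEARCH_ID>/status.json complete` in place of the `sed -i ''` call, provided the helper is defined in that shell.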