Make research skill self-contained without external scripts

- Rewrite SKILL.md with inline instructions to spawn g3 --agent scout directly
- Extend read_file to handle embedded skill paths (<embedded:name>/SKILL.md)
- Remove scripts field from EmbeddedSkill struct (no longer needed)
- Delete extraction.rs module (was only for script extraction)
- Delete g3-research bash script
- Remove obsolete Async Research Tool section from workspace memory

Skills are now fully portable - they work when g3 is installed as a
binary without access to source files. Agents can read embedded skill
content via read_file with the special <embedded:...> path syntax.

Author: Dhanji R. Prasanna
Date: 2026-02-05 14:22:17 +11:00
Parent: 788debb93a
Commit: cff32bf0ba
7 changed files with 130 additions and 710 deletions

@@ -5,115 +5,114 @@ license: Apache-2.0
compatibility: Requires g3 binary in PATH. WebDriver (Safari or Chrome) recommended for best results.
metadata:
author: g3
version: "1.0"
version: "2.0"
---
# Research Skill
Perform asynchronous web research without blocking your current work. Research runs in the background and saves results to disk for you to read when ready.
Perform asynchronous web research without blocking your current work. Research runs in the background and results are saved to disk.
## Quick Start
```bash
# Start research (ALWAYS use background_process, never blocking shell)
background_process("research-<topic>", ".g3/bin/g3-research 'Your research question here'")
# 1. Create research directory and status file
RESEARCH_ID="research_$(date +%s)_$(head -c 3 /dev/urandom | xxd -p)"
mkdir -p ".g3/research/$RESEARCH_ID"
echo '{"id":"'$RESEARCH_ID'","status":"running","query":"YOUR QUERY"}' > ".g3/research/$RESEARCH_ID/status.json"
# Check status
shell(".g3/bin/g3-research --status <research-id>")
# Or list all:
shell(".g3/bin/g3-research --list")
# 2. Start research in background
background_process("research-topic", "g3 --agent scout --new-session --quiet 'Your research question' > .g3/research/$RESEARCH_ID/report.md 2>&1 && sed -i '' 's/running/complete/' .g3/research/$RESEARCH_ID/status.json || sed -i '' 's/running/failed/' .g3/research/$RESEARCH_ID/status.json")
# Read the report when complete
read_file(".g3/research/<research-id>/report.md")
# 3. Check status
cat .g3/research/$RESEARCH_ID/status.json
# 4. Read report when complete
read_file(".g3/research/$RESEARCH_ID/report.md")
```
## How It Works
## Step-by-Step Instructions
1. **Start research** - The `g3-research` script spawns a scout agent that performs web research
2. **Background execution** - Research runs asynchronously; you can continue other work
3. **Filesystem handoff** - Results are written to `.g3/research/<id>/` with machine-readable status
4. **Read when ready** - Use `read_file` to load the report into context only when needed
### 1. Generate a Unique Research ID
Use shell to create a unique ID and directory:
```bash
shell("RESEARCH_ID=\"research_$(date +%s)_$(head -c 3 /dev/urandom | xxd -p)\" && mkdir -p \".g3/research/$RESEARCH_ID\" && echo $RESEARCH_ID")
```
Save the returned ID for later use.
### 2. Write Initial Status File
```bash
shell("echo '{\"id\":\"<RESEARCH_ID>\",\"status\":\"running\",\"query\":\"<YOUR_QUERY>\",\"started_at\":\"'$(date -u +%Y-%m-%dT%H:%M:%SZ)'\"}' > .g3/research/<RESEARCH_ID>/status.json")
```
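The status file above interpolates the query into JSON without escaping, so a query containing a double quote or backslash would produce invalid JSON. The deleted `g3-research` script escaped these characters before writing. A minimal sketch of the same idea, assuming hypothetical helpers named `json_escape` and `write_status` (neither is part of g3) and covering only backslashes and double quotes:

```bash
#!/usr/bin/env bash
# Hypothetical helpers for illustration; not part of g3.

# Escape backslashes and double quotes so a string can sit inside a JSON string.
# Newlines and other control characters are NOT handled here.
json_escape() {
  printf '%s' "$1" | sed 's/\\/\\\\/g; s/"/\\"/g'
}

# write_status <dir> <id> <query>
# Create the directory and write an initial "running" status.json.
write_status() {
  local dir="$1" id="$2" query="$3"
  mkdir -p "$dir"
  printf '{"id":"%s","status":"running","query":"%s","started_at":"%s"}\n' \
    "$id" "$(json_escape "$query")" "$(date -u +%Y-%m-%dT%H:%M:%SZ)" \
    > "$dir/status.json"
}
```

The key order and compact layout match the `echo` command above, so later steps that edit or read the status value keep working.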
### 3. Start the Scout Agent
Use `background_process` to run the scout agent (NEVER use blocking `shell`):
```bash
background_process("research-<topic>", "g3 --agent scout --new-session --quiet '<Your detailed research question>' > .g3/research/<RESEARCH_ID>/report.md 2>&1; if [ $? -eq 0 ]; then sed -i '' 's/running/complete/' .g3/research/<RESEARCH_ID>/status.json; else sed -i '' 's/running/failed/' .g3/research/<RESEARCH_ID>/status.json; fi")
```
**Important flags:**
- `--agent scout` - Uses the scout agent optimized for web research
- `--new-session` - Starts a fresh session
- `--quiet` - Reduces UI noise in output
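The command above uses `sed -i ''`, which is the BSD/macOS form of in-place editing; GNU sed on Linux treats the empty string as the script and fails. A sketch of a portable alternative, assuming a hypothetical `set_status` helper (not part of g3) that rewrites the file through a temporary copy:

```bash
#!/usr/bin/env bash
# Hypothetical helper for illustration; not part of g3.

# set_status <path to status.json> <new status>
# Replace the "running" status value without relying on sed -i.
set_status() {
  local file="$1" new_status="$2" tmp
  tmp="$(mktemp "${file}.XXXXXX")" || return 1
  # Match the full "status":"running" pair, not just the word "running".
  if sed "s/\"status\":\"running\"/\"status\":\"${new_status}\"/" "$file" > "$tmp"; then
    mv "$tmp" "$file"
  else
    rm -f "$tmp"
    return 1
  fi
}
```

Because `background_process` takes a single command string, the temp-file-and-`mv` steps would have to be written out inline in that string rather than called as a function.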
### 4. Check Research Status
```bash
shell("cat .g3/research/<RESEARCH_ID>/status.json")
```
Status values:
- `running` - Research in progress
- `complete` - Report ready to read
- `failed` - Error occurred
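When only the status value is needed, it can be extracted instead of printing the whole file. A minimal, non-blocking sketch, assuming a hypothetical `research_status` helper (not part of g3) and a `status.json` in which the `status` key and its value sit on one line:

```bash
#!/usr/bin/env bash
# Hypothetical helper for illustration; not part of g3.

# research_status <research dir>
# Print the status value from status.json, or "unknown" if it cannot be read.
research_status() {
  local status
  status="$(sed -n 's/.*"status":[[:space:]]*"\([a-z]*\)".*/\1/p' "$1/status.json" 2>/dev/null)"
  echo "${status:-unknown}"
}
```

A caller would branch on the printed value and read `report.md` only when it is `complete`.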
### 5. Read the Report
Once status is `complete`:
```bash
read_file(".g3/research/<RESEARCH_ID>/report.md")
```
## Directory Structure
```
.g3/research/
├── research_1738700000_a1b2c3/
│   ├── status.json # Machine-readable status
│   └── report.md # The research brief (when complete)
└── research_1738700100_d4e5f6/
    ├── status.json
    └── report.md
└── research_1738700000_a1b2c3/
    ├── status.json # Machine-readable status
    └── report.md # The research brief (when complete)
```
## status.json Schema
```json
{
"id": "research_1738700000_a1b2c3",
"query": "What are the best Rust async runtimes?",
"status": "complete",
"started_at": "2026-02-04T12:00:00Z",
"completed_at": "2026-02-04T12:01:30Z",
"report_path": ".g3/research/research_1738700000_a1b2c3/report.md",
"error": null
}
```
**Status values:**
- `running` - Research in progress
- `complete` - Report ready to read
- `failed` - Error occurred (check `error` field)
## Commands
### Start Research
## Example: Complete Workflow
```bash
.g3/bin/g3-research "<query>"
```
# Step 1: Create research task
shell("RESEARCH_ID=\"research_$(date +%s)_$(head -c 3 /dev/urandom | xxd -p)\" && mkdir -p \".g3/research/$RESEARCH_ID\" && echo '{\"id\":\"'$RESEARCH_ID'\",\"status\":\"running\",\"query\":\"Rust async runtimes comparison\"}' > \".g3/research/$RESEARCH_ID/status.json\" && echo $RESEARCH_ID")
# Returns: research_1738700000_a1b2c3
Outputs the research ID and path on success. **Always run via `background_process`**, not `shell`.
# Step 2: Start scout in background
background_process("research-rust-async", "g3 --agent scout --new-session --quiet 'Compare Tokio vs async-std vs smol for Rust async runtimes. Include performance, ecosystem, and ease of use.' > .g3/research/research_1738700000_a1b2c3/report.md 2>&1; [ $? -eq 0 ] && sed -i '' 's/running/complete/' .g3/research/research_1738700000_a1b2c3/status.json || sed -i '' 's/running/failed/' .g3/research/research_1738700000_a1b2c3/status.json")
### Check Status
```bash
# Check specific research
.g3/bin/g3-research --status <research-id>
# List all research tasks
.g3/bin/g3-research --list
```
Outputs JSON for machine parsing.
### Read Report
Once status is `complete`, read the report:
```bash
read_file(".g3/research/<research-id>/report.md")
```
**Tip:** If the report is large, use partial reads:
```bash
read_file(".g3/research/<id>/report.md", start=0, end=2000)
```
## Example Workflow
```
# 1. Start research on async runtimes
background_process("research-async", ".g3/bin/g3-research 'Compare Tokio vs async-std vs smol for Rust async runtimes'")
# 2. Continue with other work while research runs...
# Step 3: Continue other work...
shell("cargo check")
# 3. Check if research is done
shell(".g3/bin/g3-research --list")
# Step 4: Check if done
shell("cat .g3/research/research_1738700000_a1b2c3/status.json")
# 4. Read the report
read_file(".g3/research/research_1738700000_abc123/report.md")
# Step 5: Read report
read_file(".g3/research/research_1738700000_a1b2c3/report.md")
```
## Listing All Research Tasks
```bash
shell("for f in .g3/research/*/status.json; do cat \"$f\" 2>/dev/null; echo; done")
```
## Best Practices
@@ -121,7 +120,7 @@ read_file(".g3/research/research_1738700000_abc123/report.md")
1. **Always use `background_process`** - Never run research with blocking `shell`
2. **Be specific** - Narrow queries get better results faster
3. **Read selectively** - Only load reports into context when you need them
4. **Check status first** - Don't try to read reports that aren't complete
4. **Check status first** - Don't try to read reports that aren't complete yet
## Troubleshooting
@@ -129,16 +128,11 @@ read_file(".g3/research/research_1738700000_abc123/report.md")
- Try a more specific query
- Complex topics may take 1-2 minutes
### WebDriver not available
### WebDriver not available
- Research will still work but may have limited web access
- Install Safari WebDriver or Chrome for best results
- The scout agent will fall back to shell-based methods
### Report is empty or failed
- Check `status.json` for error details
- Check status.json for the status
- Look at the report.md file for any error output
- The query may be too broad or the topic too obscure
## Notes
- Research results accumulate in `.g3/research/` - they are not auto-cleaned
- Each research task gets a unique ID based on timestamp
- Multiple concurrent research tasks are supported

@@ -1,338 +0,0 @@
#!/bin/bash
#
# g3-research - Perform web research via scout agent with filesystem handoff
#
# Usage:
# g3-research "<query>" Start new research
# g3-research --status <id> Check status of specific research
# g3-research --list List all research tasks
# g3-research --help Show this help
#
# Research results are stored in .g3/research/<id>/
# - status.json: Machine-readable status
# - report.md: The research brief (when complete)
set -euo pipefail
# Configuration
RESEARCH_DIR=".g3/research"
SCOUT_AGENT="scout"
# Report markers (must match scout agent output)
REPORT_START_MARKER="---SCOUT_REPORT_START---"
REPORT_END_MARKER="---SCOUT_REPORT_END---"
#######################################
# Generate a unique research ID
#######################################
generate_id() {
local timestamp
local random_suffix
timestamp=$(date +%s)
random_suffix=$(head -c 6 /dev/urandom | xxd -p | head -c 6)
echo "research_${timestamp}_${random_suffix}"
}
#######################################
# Get current ISO 8601 timestamp
#######################################
get_timestamp() {
date -u +"%Y-%m-%dT%H:%M:%SZ"
}
#######################################
# Write status.json file
# Arguments:
# $1 - research directory
# $2 - id
# $3 - query
# $4 - status (running|complete|failed)
# $5 - started_at
# $6 - completed_at (optional, use "null" for running)
# $7 - error (optional, use "null" for success)
#######################################
write_status() {
local dir="$1"
local id="$2"
local query="$3"
local status="$4"
local started_at="$5"
local completed_at="$6"
local error="$7"
# Escape query for JSON (handle quotes and newlines)
local escaped_query
escaped_query=$(echo -n "$query" | sed 's/\\/\\\\/g; s/"/\\"/g; s/\n/\\n/g')
# Format completed_at and error as JSON values
local completed_json
local error_json
if [[ "$completed_at" == "null" ]]; then
completed_json="null"
else
completed_json="\"$completed_at\""
fi
if [[ "$error" == "null" ]]; then
error_json="null"
else
# Escape error message for JSON
local escaped_error
escaped_error=$(echo -n "$error" | sed 's/\\/\\\\/g; s/"/\\"/g; s/\n/\\n/g' | head -c 1000)
error_json="\"$escaped_error\""
fi
cat > "${dir}/status.json" << EOF
{
"id": "${id}",
"query": "${escaped_query}",
"status": "${status}",
"started_at": "${started_at}",
"completed_at": ${completed_json},
"report_path": "${dir}/report.md",
"error": ${error_json}
}
EOF
}
#######################################
# Extract report from scout output
# Arguments:
# $1 - scout output file
# Returns:
# Report content between markers, or empty if not found
#######################################
strip_ansi() {
# Comprehensive ANSI escape sequence stripping
perl -pe 's/\e\[[0-9;]*[a-zA-Z]//g; s/\e\][^\a]*\a//g; s/\e[()][AB012]//g'
}
extract_report() {
local output_file="$1"
local report
# Use sed to extract content between markers
report=$(sed -n "/${REPORT_START_MARKER}/,/${REPORT_END_MARKER}/p" "$output_file" | \
sed "1d;\$d" | \
strip_ansi) # Remove markers and strip ANSI codes
if [[ -n "$report" ]]; then
echo "$report"
return 0
fi
# Fallback: if no markers found, try to extract useful content from raw output
# Strip ANSI escape codes and g3 UI elements
report=$(cat "$output_file" | \
strip_ansi | \
grep -v '^🆕 Starting new session' | \
grep -v '^>> agent mode' | \
grep -v '^\[38;' | \
grep -v '^-> ~' | \
grep -v '^ *✓' | \
grep -v '^📝 Auto-memory:' | \
grep -v 'Auto-memory:' | \
grep -v '^$' | \
sed '/^[[:space:]]*$/d' | \
head -500)
if [[ -n "$report" ]]; then
echo "$report"
return 0
fi
}
#######################################
# Run research
# Arguments:
# $1 - query
#######################################
run_research() {
local query="$1"
local id
local research_dir
local started_at
local output_file
local exit_code
# Generate unique ID and create directory
id=$(generate_id)
research_dir="${RESEARCH_DIR}/${id}"
mkdir -p "$research_dir"
started_at=$(get_timestamp)
output_file="${research_dir}/scout_output.txt"
# Write initial status
write_status "$research_dir" "$id" "$query" "running" "$started_at" "null" "null"
# Output the research ID immediately so caller knows where to look
echo "{\"id\": \"${id}\", \"status\": \"running\", \"path\": \"${research_dir}\"}"
# Find g3 binary
local g3_bin
if command -v g3 &> /dev/null; then
g3_bin="g3"
elif [[ -x "./target/release/g3" ]]; then
g3_bin="./target/release/g3"
elif [[ -x "./target/debug/g3" ]]; then
g3_bin="./target/debug/g3"
else
write_status "$research_dir" "$id" "$query" "failed" "$started_at" "$(get_timestamp)" "g3 binary not found in PATH or target/"
echo "{\"id\": \"${id}\", \"status\": \"failed\", \"error\": \"g3 binary not found\"}" >&2
exit 1
fi
# Run scout agent and capture output
set +e
"$g3_bin" --agent "$SCOUT_AGENT" --new-session --quiet "$query" > "$output_file" 2>&1
exit_code=$?
set -e
local completed_at
completed_at=$(get_timestamp)
if [[ $exit_code -ne 0 ]]; then
# Scout failed
local error_msg
error_msg=$(tail -20 "$output_file" 2>/dev/null || echo "Unknown error")
write_status "$research_dir" "$id" "$query" "failed" "$started_at" "$completed_at" "$error_msg"
echo "{\"id\": \"${id}\", \"status\": \"failed\", \"error\": \"Scout agent exited with code ${exit_code}\"}" >&2
exit 1
fi
# Extract report from output
local report
report=$(extract_report "$output_file")
if [[ -z "$report" ]]; then
write_status "$research_dir" "$id" "$query" "failed" "$started_at" "$completed_at" "Scout did not produce a valid report (missing markers)"
echo "{\"id\": \"${id}\", \"status\": \"failed\", \"error\": \"No report markers found in output\"}" >&2
exit 1
fi
# Write report to file
echo "$report" > "${research_dir}/report.md"
# Update status to complete
write_status "$research_dir" "$id" "$query" "complete" "$started_at" "$completed_at" "null"
# Clean up scout output (optional - keep for debugging)
# rm -f "$output_file"
echo "{\"id\": \"${id}\", \"status\": \"complete\", \"report_path\": \"${research_dir}/report.md\"}"
}
#######################################
# Check status of a specific research task
# Arguments:
# $1 - research ID
#######################################
check_status() {
local id="$1"
local status_file="${RESEARCH_DIR}/${id}/status.json"
if [[ ! -f "$status_file" ]]; then
echo "{\"error\": \"Research task not found: ${id}\"}" >&2
exit 1
fi
cat "$status_file"
}
#######################################
# List all research tasks
#######################################
list_research() {
if [[ ! -d "$RESEARCH_DIR" ]]; then
echo "[]"
return
fi
local first=true
echo "["
for status_file in "${RESEARCH_DIR}"/*/status.json; do
if [[ ! -f "$status_file" ]]; then
continue
fi
if [[ "$first" == true ]]; then
first=false
else
echo ","
fi
cat "$status_file"
done
echo "]"
}
#######################################
# Show help
#######################################
show_help() {
cat << 'EOF'
g3-research - Perform web research via scout agent
USAGE:
g3-research "<query>" Start new research
g3-research --status <id> Check status of specific research
g3-research --list List all research tasks
g3-research --help Show this help
EXAMPLES:
# Start research (run via background_process)
g3-research "What are the best Rust HTTP client libraries?"
# Check status
g3-research --status research_1738700000_a1b2c3
# List all research
g3-research --list
OUTPUT:
All commands output JSON for machine parsing.
Research results are stored in .g3/research/<id>/
FILES:
.g3/research/<id>/status.json Machine-readable status
.g3/research/<id>/report.md Research brief (when complete)
EOF
}
#######################################
# Main
#######################################
main() {
if [[ $# -eq 0 ]]; then
show_help
exit 1
fi
case "$1" in
--help|-h)
show_help
;;
--status)
if [[ $# -lt 2 ]]; then
echo "{\"error\": \"Missing research ID\"}" >&2
exit 1
fi
check_status "$2"
;;
--list)
list_research
;;
-*)
echo "{\"error\": \"Unknown option: $1\"}" >&2
exit 1
;;
*)
# Treat as query
run_research "$1"
;;
esac
}
main "$@"