Add research tool for web-based research via scout agent

New tool that spawns a scout agent to perform web research and return a structured research brief. The scout agent uses webdriver to browse the web and returns a decision-ready report.

Changes:
- Added 'research' tool definition (12 core tools total)
- Added research tool dispatch in tool_dispatch.rs
- Created tools/research.rs implementation:
  - Spawns 'g3 --agent scout <query>' as subprocess
  - Captures stdout and extracts last line (report file path)
  - Reads and returns the report file contents
- Added exclude_research flag to ToolConfig
- Scout agent (agent_name == 'scout') does NOT have access to research tool to prevent infinite recursion
- Updated system prompts to describe when to use research tool
- Added scout.md agent prompt with research brief output contract

The research tool is preferred for complex research tasks (APIs, SDKs, libraries, approaches, bugs). WebDriver can still be used directly for simple lookups or fine-grained control.
2026-01-09 15:59:19 +11:00
parent de50726eeb
commit 33e5705fc3
7 changed files with 284 additions and 19 deletions
--- a/agents/scout.md
+++ b/agents/scout.md
@@ -0,0 +1,95 @@
+<!--
+tools: -research
+-->
+
+You are **Scout**. Your role is to perform **research** in support of a specific question, and return a **single, compact research brief** (1-page).
+
+You exist to compress external information into decision-ready form. You do **NOT** explore endlessly, brainstorm, or teach.
+
+---
+
+## Core Responsibilities
+
+- Research the given question using external sources (web, docs, repos, blogs, papers).
+- Identify **existing solutions, libraries, tools, patterns, or APIs** relevant to the question.
+- Surface **trade-offs, limitations, and sharp edges**.
+- Return a **bounded, human-readable brief** that can be acted on immediately.
+
+---
+
+## Output Contract (MANDATORY)
+
+You must return **one brief only**, no conversation. The brief must fit on one page and follow this structure:
+
+### Query
+One sentence describing what is being investigated.
+
+### Options
+3–8 concrete options maximum.  
+Each option includes:
+- What it is (1 line)
+- Why it exists / where it fits
+- Key pros
+- Key cons or limits
+
+### Trade-offs / Comparisons
+Short bullets comparing the options where it matters.
+
+### Recommendation (Optional)
+If one option is clearly dominant, state it.
+If not, say “No clear default.”
+
+### Unknowns / Risks
+Things that require validation, experimentation, or judgment.
+
+### Sources
+Links only (titles + URLs).  
+Brief quotes or snippets if relevant to decision making. No page dumps.
+
+Write this brief to a temporary file, then print the file's full path as the VERY LAST LINE of your output.
+
+---
+
+## Strict Constraints
+
+- **No raw webpage text** beyond short quoted fragments, and only where necessary.
+- **No code dumps** beyond tiny illustrative snippets.
+- **No repo writes.**
+- **No follow-up questions.**
+
+If the research report would exceed one page, **rank and discard** lower-value material.
+
+If nothing useful exists, say so explicitly and back this up with evidence.
+
+---
+
+## Research Style
+
+- Be pragmatic, not academic.
+- Prefer real-world usage, maturity, and sharp edges over novelty.
+- Treat hype skeptically.
+- Optimize for *your user* making a decision, not for completeness.
+
+You are allowed to say:
+> “This exists but is immature / fragile / not worth it.”
+
+---
+
+## Ephemerality
+
+Your output is **decision support**, not institutional knowledge.
+
+Do not assume it will be saved.
+Do not suggest documentation updates.
+Do not try to future-proof.
+
+---
+
+## Success Criteria
+
+You succeed if:
+- The reader can decide what to try or ignore in under 5 minutes.
+- The brief is calm, bounded, and opinionated where justified.
+- No context bloat is introduced.
+
+If nothing meets the bar, saying so is OK.
--- a/crates/g3-core/src/lib.rs
+++ b/crates/g3-core/src/lib.rs
@@ -782,12 +782,17 @@ impl<W: UiWriter> Agent<W> {
        let provider_name = provider.name().to_string();
        let _has_native_tool_calling = provider.has_native_tool_calling();
        let _supports_cache_control = provider.supports_cache_control();
+        // Exclude the research tool for the scout agent to prevent recursion
+        let exclude_research = self.agent_name.as_deref() == Some("scout");
        let tools = if provider.has_native_tool_calling() {
-            Some(tool_definitions::create_tool_definitions(
-                tool_definitions::ToolConfig::new(
+            let mut tool_config = tool_definitions::ToolConfig::new(
                    self.config.webdriver.enabled,
                    self.config.computer_control.enabled,
-                )))
+                );
+            if exclude_research {
+                tool_config = tool_config.with_research_excluded();
+            }
+            Some(tool_definitions::create_tool_definitions(tool_config))
        } else {
            None
        };
@@ -2200,11 +2205,15 @@ impl<W: UiWriter> Agent<W> {
                            // Ensure tools are included for native providers in subsequent iterations
                            let provider_for_tools = self.providers.get(None)?;
                            if provider_for_tools.has_native_tool_calling() {
-                                request.tools = Some(tool_definitions::create_tool_definitions(
-                                    tool_definitions::ToolConfig::new(
+                                let mut tool_config = tool_definitions::ToolConfig::new(
                                        self.config.webdriver.enabled,
                                        self.config.computer_control.enabled,
-                                    )));
+                                    );
+                                // Exclude research tool for scout agent to prevent recursion
+                                if self.agent_name.as_deref() == Some("scout") {
+                                    tool_config = tool_config.with_research_excluded();
+                                }
+                                request.tools = Some(tool_definitions::create_tool_definitions(tool_config));
                            }

                            // DO NOT add final_display_content to full_response here!
--- a/crates/g3-core/src/prompts.rs
+++ b/crates/g3-core/src/prompts.rs
@@ -114,6 +114,14 @@ If you create temporary files for verification or investigation, place these in

 When you need to look up documentation, search for resources, find data online, or simply search the web to complete your task, you have access to WebDriver browser automation tools.

+**Preferred: Use the `research` tool for complex research tasks:**
+- For researching APIs, SDKs, libraries, approaches, bugs, or any topic requiring web research
+- The `research` tool spawns a specialized research agent that browses the web and returns a concise, decision-ready report
+- Simply call `research` with a specific query describing what you need to know
+- The tool returns a structured brief with options, trade-offs, and recommendations
+
+**Alternative: Use WebDriver directly for simple lookups or when you need fine-grained control:**
+
 **How to use WebDriver for research:**
 1. Call `webdriver_start` to begin a browser session (runs Chrome headless by default - no visible window)
 2. Use `webdriver_navigate` to go to URLs (search engines, documentation sites, etc.)
@@ -220,6 +228,11 @@ Short description for providers without native calling specs:
       - \"context\": 3 (show surrounding lines),
       - \"json_style\": \"stream\" (for large results)

+- **research**: Perform web-based research and return a structured report
+  - Format: {\"tool\": \"research\", \"args\": {\"query\": \"your research question\"}}
+  - Example: {\"tool\": \"research\", \"args\": {\"query\": \"Best Rust HTTP client libraries for async/await\"}}
+  - Use for researching APIs, SDKs, libraries, approaches, bugs, or any topic requiring web research
+
 # Instructions

 1. Analyze the request and break down into smaller tasks if appropriate
--- a/crates/g3-core/src/tool_definitions.rs
+++ b/crates/g3-core/src/tool_definitions.rs
@@ -12,6 +12,7 @@ use serde_json::json;
 pub struct ToolConfig {
    pub webdriver: bool,
    pub computer_control: bool,
+    pub exclude_research: bool,
 }

 impl ToolConfig {
@@ -19,8 +20,16 @@ impl ToolConfig {
        Self {
            webdriver,
            computer_control,
+            exclude_research: false,
        }
    }
+
+    /// Create a config with the research tool excluded.
+    /// Used for scout agent to prevent recursion.
+    pub fn with_research_excluded(mut self) -> Self {
+        self.exclude_research = true;
+        self
+    }
 }

 /// Create tool definitions for native tool calling providers.
@@ -28,7 +37,7 @@ impl ToolConfig {
 /// Returns a vector of Tool definitions that describe the available tools
 /// and their input schemas.
 pub fn create_tool_definitions(config: ToolConfig) -> Vec<Tool> {
-    let mut tools = create_core_tools();
+    let mut tools = create_core_tools(config.exclude_research);

    if config.webdriver {
        tools.extend(create_webdriver_tools());
@@ -38,8 +47,8 @@ pub fn create_tool_definitions(config: ToolConfig) -> Vec<Tool> {
 }

 /// Create the core tools that are always available
-fn create_core_tools() -> Vec<Tool> {
-    vec![
+fn create_core_tools(exclude_research: bool) -> Vec<Tool> {
+    let mut tools = vec![
        Tool {
            name: "shell".to_string(),
            description: "Execute shell commands".to_string(),
@@ -243,7 +252,27 @@ fn create_core_tools() -> Vec<Tool> {
                "required": ["searches"]
            }),
        },
-    ]
+    ];
+
+    // Conditionally add the research tool (excluded for scout agent to prevent recursion)
+    if !exclude_research {
+        tools.push(Tool {
+            name: "research".to_string(),
+            description: "Perform web-based research on a topic and return a structured research brief. Use this tool when you need to research APIs, SDKs, libraries, approaches, bugs, documentation, or anything else that requires web-based research. The tool spawns a specialized research agent that browses the web and returns a concise, decision-ready report.".to_string(),
+            input_schema: json!({
+                "type": "object",
+                "properties": {
+                    "query": {
+                        "type": "string",
+                        "description": "The research question or topic to investigate. Be specific about what you need to know."
+                    }
+                },
+                "required": ["query"]
+            }),
+        });
+    }
+
+    tools
 }

 /// Create WebDriver browser automation tools
@@ -445,11 +474,11 @@ mod tests {

    #[test]
    fn test_core_tools_count() {
-        let tools = create_core_tools();
+        let tools = create_core_tools(false);
        // Should have the core tools: shell, background_process, read_file, read_image,
-        // write_file, str_replace, final_output, take_screenshot,
-        // todo_read, todo_write, code_coverage, code_search (11 total)
-        assert_eq!(tools.len(), 11);
+        // write_file, str_replace, take_screenshot,
+        // todo_read, todo_write, code_coverage, code_search, research (12 total)
+        assert_eq!(tools.len(), 12);
    }

    #[test]
@@ -463,24 +492,36 @@ mod tests {
    fn test_create_tool_definitions_core_only() {
        let config = ToolConfig::default();
        let tools = create_tool_definitions(config);
-        assert_eq!(tools.len(), 11);
+        assert_eq!(tools.len(), 12);
    }

    #[test]
    fn test_create_tool_definitions_all_enabled() {
        let config = ToolConfig::new(true, true);
        let tools = create_tool_definitions(config);
-        // 11 core + 15 webdriver = 26
-        assert_eq!(tools.len(), 26);
+        // 12 core + 15 webdriver = 27
+        assert_eq!(tools.len(), 27);
    }

    #[test]
    fn test_tool_has_required_fields() {
-        let tools = create_core_tools();
+        let tools = create_core_tools(false);
        for tool in tools {
            assert!(!tool.name.is_empty(), "Tool name should not be empty");
            assert!(!tool.description.is_empty(), "Tool description should not be empty");
            assert!(tool.input_schema.is_object(), "Tool input_schema should be an object");
        }
    }
+
+    #[test]
+    fn test_research_tool_excluded() {
+        let tools_with_research = create_core_tools(false);
+        let tools_without_research = create_core_tools(true);
+        
+        assert_eq!(tools_with_research.len(), 12);
+        assert_eq!(tools_without_research.len(), 11);
+        
+        assert!(tools_with_research.iter().any(|t| t.name == "research"));
+        assert!(!tools_without_research.iter().any(|t| t.name == "research"));
+    }
 }
--- a/crates/g3-core/src/tool_dispatch.rs
+++ b/crates/g3-core/src/tool_dispatch.rs
@@ -7,7 +7,7 @@ use anyhow::Result;
 use tracing::{debug, warn};

 use crate::tools::executor::ToolContext;
-use crate::tools::{file_ops, misc, shell, todo, webdriver};
+use crate::tools::{file_ops, misc, research, shell, todo, webdriver};
 use crate::ui_writer::UiWriter;
 use crate::ToolCall;

@@ -41,6 +41,9 @@ pub async fn dispatch_tool<W: UiWriter>(
        "code_coverage" => misc::execute_code_coverage(tool_call, ctx).await,
        "code_search" => misc::execute_code_search(tool_call, ctx).await,

+        // Research tool
+        "research" => research::execute_research(tool_call, ctx).await,
+
        // WebDriver tools
        "webdriver_start" => webdriver::execute_webdriver_start(tool_call, ctx).await,
        "webdriver_navigate" => webdriver::execute_webdriver_navigate(tool_call, ctx).await,
--- a/crates/g3-core/src/tools/mod.rs
+++ b/crates/g3-core/src/tools/mod.rs
@@ -7,10 +7,12 @@
 //! - `todo` - TODO list management
 //! - `webdriver` - Browser automation via WebDriver
 //! - `misc` - Other tools (screenshots, code search, etc.)
+//! - `research` - Web research via scout agent

 pub mod executor;
 pub mod file_ops;
 pub mod misc;
+pub mod research;
 pub mod shell;
 pub mod todo;
 pub mod webdriver;
--- a/crates/g3-core/src/tools/research.rs
+++ b/crates/g3-core/src/tools/research.rs
@@ -0,0 +1,102 @@
+//! Research tool: spawns a scout agent to perform web-based research.
+
+use anyhow::Result;
+use std::process::Stdio;
+use tokio::io::{AsyncBufReadExt, BufReader};
+use tokio::process::Command;
+use tracing::debug;
+
+use crate::ui_writer::UiWriter;
+use crate::ToolCall;
+
+use super::executor::ToolContext;
+
+/// Execute the research tool by spawning a scout agent.
+///
+/// This tool:
+/// 1. Spawns `g3 --agent scout` with the query
+/// 2. Captures stdout and extracts the last line (file path to report)
+/// 3. Reads the report file and returns its contents
+pub async fn execute_research<W: UiWriter>(
+    tool_call: &ToolCall,
+    ctx: &mut ToolContext<'_, W>,
+) -> Result<String> {
+    let query = tool_call
+        .args
+        .get("query")
+        .and_then(|v| v.as_str())
+        .ok_or_else(|| anyhow::anyhow!("Missing required 'query' parameter"))?;
+
+    debug!("Research tool called with query: {}", query);
+    ctx.ui_writer.print_tool_header("research", None);
+    ctx.ui_writer.print_tool_arg("query", query);
+    
+    // Find the g3 executable path
+    let g3_path = std::env::current_exe()
+        .unwrap_or_else(|_| std::path::PathBuf::from("g3"));
+
+    // Spawn the scout agent
+    let mut child = Command::new(&g3_path)
+        .arg("--agent")
+        .arg("scout")
+        .arg("--webdriver")  // Scout needs webdriver for web research
+        .arg("--new-session")  // Always start fresh for research
+        .arg("--quiet")  // Suppress log file creation
+        .arg(query)
+        .stdout(Stdio::piped())
+        .stderr(Stdio::null())  // Discard stderr: it is never read, and an undrained pipe could block the child
+        .spawn()
+        .map_err(|e| anyhow::anyhow!("Failed to spawn scout agent: {}", e))?;
+
+    // Capture stdout to find the report file path
+    let stdout = child.stdout.take()
+        .ok_or_else(|| anyhow::anyhow!("Failed to capture scout agent stdout"))?;
+    
+    let mut reader = BufReader::new(stdout).lines();
+    let mut last_line = String::new();
+    
+    // Read all lines, keeping track of the last one
+    while let Some(line) = reader.next_line().await? {
+        debug!("Scout output: {}", line);
+        last_line = line;
+    }
+
+    // Wait for the process to complete
+    let status = child.wait().await
+        .map_err(|e| anyhow::anyhow!("Failed to wait for scout agent: {}", e))?;
+
+    if !status.success() {
+        return Ok(format!("❌ Scout agent failed with exit code: {:?}", status.code()));
+    }
+
+    // The last line should be the path to the report file
+    let report_path = last_line.trim();
+    
+    if report_path.is_empty() {
+        return Ok("❌ Scout agent did not output a report file path".to_string());
+    }
+
+    debug!("Report file path: {}", report_path);
+
+    // Expand a leading "~/" if present (a bare "~" or "~user" is left as-is)
+    let expanded_path = if let Some(rest) = report_path.strip_prefix("~/") {
+        if let Ok(home) = std::env::var("HOME") {
+            std::path::PathBuf::from(home).join(rest)
+        } else {
+            std::path::PathBuf::from(report_path)
+        }
+    } else {
+        std::path::PathBuf::from(report_path)
+    };
+
+    // Read the report file
+    match std::fs::read_to_string(&expanded_path) {
+        Ok(content) => {
+            debug!("Report loaded: {} chars", content.len());
+            Ok(format!("📋 Research Report:\n\n{}", content))
+        }
+        Err(e) => {
+            Ok(format!("❌ Failed to read report file '{}': {}", report_path, e))
+        }
+    }
+}
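
The path handling at the end of `execute_research` can be exercised in isolation. The following is a minimal sketch, not part of this commit: the helper names `last_nonempty_line` and `expand_tilde` are hypothetical, and the home directory is passed in as a parameter instead of being read from `HOME` so the logic stays testable. Unlike the committed code, which keeps the literal last line of stdout, this sketch skips trailing blank lines; that difference is an assumption about what is desirable, not a description of current behavior.

```rust
use std::path::PathBuf;

/// Hypothetical helper: return the last non-blank line of the scout's stdout,
/// which the output contract says should be the report file path.
fn last_nonempty_line(stdout: &str) -> Option<&str> {
    stdout.lines().rev().map(str::trim).find(|l| !l.is_empty())
}

/// Hypothetical helper: expand a leading "~/" against `home`.
/// Anything else (absolute paths, a bare "~", "~user/...") is returned unchanged.
fn expand_tilde(path: &str, home: Option<&str>) -> PathBuf {
    match (path.strip_prefix("~/"), home) {
        (Some(rest), Some(home)) => PathBuf::from(home).join(rest),
        _ => PathBuf::from(path),
    }
}

fn main() {
    // Simulated scout output: progress lines, then the report path, then a blank line.
    let stdout = "Browsing docs...\nWriting brief...\n~/tmp/brief.md\n\n";

    match last_nonempty_line(stdout) {
        Some(line) => {
            let path = expand_tilde(line, Some("/home/alice"));
            println!("{}", path.display());
        }
        None => println!("scout did not print a report path"),
    }
}
```

Keeping these two steps as pure functions means the edge cases (empty output, a bare `~`, a missing home directory) can be covered by unit tests without spawning a subprocess.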