Tighten system prompt and tool definitions
Prompt changes (native.md):
- Remove duplicate 'Temporary files' section
- Consolidate 'remember' instructions into single authoritative location
- Remove motivational 'Benefits' list from Plan Mode
- Add 'Code Search Tool Selection' guidance (code_search vs rg)

Tool changes (tool_definitions.rs, tool_dispatch.rs):
- Remove screenshot tool (webdriver_screenshot remains)
- Remove coverage tool
- Reduce plan_write description from 22 lines to 1 line
- Update tool count tests (16 -> 14 core tools)

Net result: ~6 lines removed from prompt, ~56 lines removed from tool definitions, clearer tool selection guidance added.
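
The tool count changes in this message can be sanity-checked with a few lines of arithmetic. The sketch below is illustrative only: the constant and function names are hypothetical and do not appear in the repository. The numbers are taken from the test assertions updated in this commit.

```rust
// Hypothetical constants mirroring the counts asserted in the updated tests.
const CORE_TOOLS_BEFORE: usize = 16;
const REMOVED_TOOLS: [&str; 2] = ["screenshot", "coverage"];
const RESEARCH_TOOLS: [&str; 2] = ["research", "research_status"];
const WEBDRIVER_TOOLS: usize = 15;

/// Core tools left after dropping the two removed tools.
fn core_tools_after() -> usize {
    CORE_TOOLS_BEFORE - REMOVED_TOOLS.len()
}

/// Core tools plus the webdriver tools (the "all enabled" configuration).
fn all_enabled_after() -> usize {
    core_tools_after() + WEBDRIVER_TOOLS
}

/// Core tools with both research tools excluded.
fn without_research_after() -> usize {
    core_tools_after() - RESEARCH_TOOLS.len()
}

fn main() {
    assert_eq!(core_tools_after(), 14);
    assert_eq!(all_enabled_after(), 29);
    assert_eq!(without_research_after(), 12);
    println!("tool counts are consistent with the updated tests");
}
```

Each of the three values corresponds to one of the updated `assert_eq!` calls in the test module.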
@@ -166,42 +166,6 @@ fn create_core_tools(exclude_research: bool) -> Vec<Tool> {
                 "required": ["file_path", "diff"]
             }),
         },
-        Tool {
-            name: "screenshot".to_string(),
-            description: "Capture a screenshot of a specific application window. You MUST specify the window_id parameter with the application name (e.g., 'Safari', 'Terminal', 'Google Chrome'). The tool will automatically use the native screencapture command with the application's window ID for a clean capture. Use list_windows first to identify available windows.".to_string(),
-            input_schema: json!({
-                "type": "object",
-                "properties": {
-                    "path": {
-                        "type": "string",
-                        "description": "Filename for the screenshot (e.g., 'safari.png'). If a relative path is provided, the screenshot will be saved to ~/tmp or $TMPDIR. Use an absolute path to save elsewhere."
-                    },
-                    "window_id": {
-                        "type": "string",
-                        "description": "REQUIRED: Application name to capture (e.g., 'Safari', 'Terminal', 'Google Chrome'). The tool will capture the frontmost window of that application using its native window ID."
-                    },
-                    "region": {
-                        "type": "object",
-                        "properties": {
-                            "x": {"type": "integer"},
-                            "y": {"type": "integer"},
-                            "width": {"type": "integer"},
-                            "height": {"type": "integer"}
-                        }
-                    }
-                },
-                "required": ["path", "window_id"]
-            }),
-        },
-        Tool {
-            name: "coverage".to_string(),
-            description: "Generate a code coverage report for the entire workspace using cargo llvm-cov. This runs all tests with coverage instrumentation and returns a summary of coverage statistics. Requires llvm-tools-preview and cargo-llvm-cov to be installed (they will be auto-installed if missing).".to_string(),
-            input_schema: json!({
-                "type": "object",
-                "properties": {},
-                "required": []
-            }),
-        },
         Tool {
             name: "code_search".to_string(),
             description: "Syntax-aware code search that understands code structure, not just text. Finds actual functions, classes, methods, and other code constructs - ignores matches in comments and strings. Much more accurate than grep for code searches. Supports batch searches (up to 20 parallel) with structured results and context lines. Languages: Rust, Python, JavaScript, TypeScript, Go, Java, C, C++, Racket. Uses tree-sitter query syntax.".to_string(),
@@ -278,27 +242,7 @@ fn create_core_tools(exclude_research: bool) -> Vec<Tool> {
 
     tools.push(Tool {
         name: "plan_write".to_string(),
-        description: r#"Create or update the Plan for this session. The plan must be provided as YAML with the following structure:
-
-- plan_id: Unique identifier for the plan
-- revision: Will be auto-incremented
-- items: Array of plan items, each with:
-  - id: Stable identifier (e.g., "I1")
-  - description: What will be done
-  - state: todo | doing | done | blocked
-  - touches: Array of paths/modules affected
-  - checks:
-      happy: {desc, target} - Normal successful operation
-      negative: [{desc, target}, ...] - Error handling, invalid input (>=1 required)
-      boundary: [{desc, target}, ...] - Edge cases, limits (>=1 required)
-  - evidence: Array of file:line refs, test names (required when done)
-  - notes: Implementation explanation (required when done)
-
-Rules:
-- Keep items ≤ 7 by default
-- All checks required: 1 happy, 1+ negative, 1+ boundary
-- Cannot remove items from an approved plan (mark as blocked instead)
-- Evidence and notes required when marking item as done"#.to_string(),
+        description: "Create or update the Plan for this session. Provide plan as YAML with plan_id and items array. See system prompt for full schema (items need: id, description, state, touches, checks with happy/negative/boundary). Evidence and notes required when marking done.".to_string(),
         input_schema: json!({
             "type": "object",
             "properties": {
@@ -557,10 +501,10 @@ mod tests {
     fn test_core_tools_count() {
         let tools = create_core_tools(false);
         // Core tools: shell, background_process, read_file, read_image,
-        // write_file, str_replace, screenshot, coverage, code_search,
+        // write_file, str_replace, code_search,
         // research, research_status, remember, plan_read, plan_write, plan_approve
-        // (16 total - memory is auto-loaded, only remember tool needed)
-        assert_eq!(tools.len(), 16);
+        // (14 total - memory is auto-loaded, only remember tool needed)
+        assert_eq!(tools.len(), 14);
     }
 
     #[test]
@@ -574,15 +518,15 @@ mod tests {
     fn test_create_tool_definitions_core_only() {
         let config = ToolConfig::default();
         let tools = create_tool_definitions(config);
-        assert_eq!(tools.len(), 16);
+        assert_eq!(tools.len(), 14);
     }
 
     #[test]
     fn test_create_tool_definitions_all_enabled() {
         let config = ToolConfig::new(true, true);
         let tools = create_tool_definitions(config);
-        // 16 core + 15 webdriver = 31
-        assert_eq!(tools.len(), 31);
+        // 14 core + 15 webdriver = 29
+        assert_eq!(tools.len(), 29);
     }
 
     #[test]
@@ -600,8 +544,8 @@ mod tests {
         let tools_with_research = create_core_tools(false);
         let tools_without_research = create_core_tools(true);
 
-        assert_eq!(tools_with_research.len(), 16);
-        assert_eq!(tools_without_research.len(), 14); // research + research_status both excluded
+        assert_eq!(tools_with_research.len(), 14);
+        assert_eq!(tools_without_research.len(), 12); // research + research_status both excluded
 
         assert!(tools_with_research.iter().any(|t| t.name == "research"));
         assert!(!tools_without_research.iter().any(|t| t.name == "research"));
@@ -38,8 +38,6 @@ pub async fn dispatch_tool<W: UiWriter>(
         "plan_approve" => plan::execute_plan_approve(tool_call, ctx).await,
 
         // Miscellaneous tools
-        "screenshot" => misc::execute_take_screenshot(tool_call, ctx).await,
-        "coverage" => misc::execute_code_coverage(tool_call, ctx).await,
         "code_search" => misc::execute_code_search(tool_call, ctx).await,
 
         // Research tool
@@ -14,6 +14,13 @@ IMPORTANT: You must call tools to achieve goals. When you receive a request:
 For shell commands: Use the shell tool with the exact command needed. Always use `rg` (ripgrep) instead of `grep` - it's faster, has better defaults, and respects .gitignore. Avoid commands that produce a large amount of output, and consider piping those outputs to files. Example: If asked to list files, immediately call the shell tool with command parameter "ls".
 If you create temporary files for verification, place these in a subdir named 'tmp'. Do NOT pollute the current dir.
 
+# Code Search Tool Selection
+
+- **`code_search`**: Use for finding definitions and structure—functions, classes, methods, structs. Syntax-aware (ignores matches in comments/strings). Best for "where is X defined?" or "find all implementations of Y".
+- **`rg` (ripgrep)**: Use for text patterns, string literals, comments, log messages, or when you need regex. Best for "find all uses of this error message" or "grep for TODO".
+
+When in doubt: `code_search` for definitions, `rg` for text.
+
 # Task Management with Plan Mode
 
 **REQUIRED for all tasks.**
@@ -30,7 +37,6 @@ Plan Mode is a cognitive forcing system that prevents:
 2. **Approval**: Ask user to approve before starting work ("'approve', or edit plan?"). In non-interactive mode (autonomous/one-shot), plans auto-approve on write.
 3. **Execute**: Implement items, updating plan with `plan_write` to mark progress
 4. **Complete**: When all items are done/blocked, verification runs automatically
-5. **Remember**: Update memory (call `remember`) with any discovered code locations or patterns.
 
 ## Plan Schema
 
@@ -152,18 +158,6 @@ facts:
 3. After all work is complete, write `envelope.yaml` with facts about the completed work
 4. **THEN** call `plan_write` to mark the final item done - verification will check both files
 
-## Benefits
-
-✓ Prevents missed steps
-✓ Makes progress visible
-✓ Helps recover from interruptions
-✓ Forces consideration of edge cases
-✓ Provides audit trail with evidence
-
-# Temporary files
-
-If you create temporary files for verification or investigation, place these in a subdir named 'tmp'. Do NOT pollute the current dir.
-
 # Web Research
 
 When you need to look up documentation, search for resources, find data online, or research a topic to complete your task, use the `research` tool. **Research is asynchronous** - it runs in the background while you continue working.
@@ -221,5 +215,5 @@ This applies whenever you use search tools like `code_search`, `rg`, `grep`, `fi
 
 - Use Markdown formatting for all responses except tool calls.
 - Whenever taking actions, use the pronoun 'I'
-- When you discover features, patterns and code locations, call `remember` to save them.
+- Call `remember` at end of turn if you discovered code locations (see Workspace Memory section).
 - When showing example tool call JSON in prose or code blocks, use the fullwidth left curly bracket `｛` (U+FF5B) instead of `{` to prevent parser confusion.
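
The 'Code Search Tool Selection' guidance added to the prompt amounts to a two-way decision. A minimal sketch of that rule follows, assuming a hypothetical `SearchTarget` classification. Neither the enum names nor `pick_search_tool` exist in the codebase; in practice the choice is made by the model reading the prompt, not by code.

```rust
/// Hypothetical classification of what a search is looking for.
#[derive(Debug, Clone, Copy, PartialEq)]
enum SearchTarget {
    Definition,      // "where is X defined?"
    Implementations, // "find all implementations of Y"
    StringLiteral,
    Comment,
    LogMessage,
    Regex,
}

/// The two options named in the prompt guidance.
#[derive(Debug, PartialEq)]
enum SearchTool {
    CodeSearch, // syntax-aware, ignores matches in comments and strings
    Ripgrep,    // plain text and regex search via `rg`
}

/// Illustrative rule of thumb: structure goes to code_search, text goes to rg.
fn pick_search_tool(target: SearchTarget) -> SearchTool {
    match target {
        SearchTarget::Definition | SearchTarget::Implementations => SearchTool::CodeSearch,
        SearchTarget::StringLiteral
        | SearchTarget::Comment
        | SearchTarget::LogMessage
        | SearchTarget::Regex => SearchTool::Ripgrep,
    }
}

fn main() {
    let targets = [
        SearchTarget::Definition,
        SearchTarget::Implementations,
        SearchTarget::StringLiteral,
        SearchTarget::Comment,
        SearchTarget::LogMessage,
        SearchTarget::Regex,
    ];
    for target in targets {
        println!("{:?} -> {:?}", target, pick_search_tool(target));
    }
}
```

The exhaustive `match` is a deliberate choice in this sketch: if a new target kind were added, the compiler would force a decision about which tool handles it.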