refactor: decouple rulespec from plan_write, read from analysis/rulespec.yaml

- Remove rulespec parameter from plan_write tool definition and execution
- Remove rulespec compilation from plan_approve (no longer pre-compiles)
- Remove write_rulespec, get_rulespec_path, format_rulespec_yaml/markdown from invariants.rs; read_rulespec() now takes &Path working dir
- Remove save/load_compiled_rulespec, get_compiled_rulespec_path from datalog.rs
- Update shadow_datalog_verify() to compile on-the-fly from analysis/rulespec.yaml, writing rulespec.compiled.dl and datalog_evaluation.txt to session dir
- Remove rulespec display from plan_read output
- Remove Invariants/Rulespec section from native.md system prompt
- Remove rulespec from prompts.rs plan_write format and examples
- Update existing tests to remove rulespec from plan_write calls
- Add 3 integration tests for on-the-fly rulespec verification
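
For orientation, the lookup this commit moves to can be sketched with the standard library alone. This is a simplified illustration, not the code in `invariants.rs`: the helper names `rulespec_path` and `read_rulespec_source` are hypothetical, and the sketch returns the raw YAML text instead of a parsed `Rulespec`, so parsing and validation are left out. What it does take from the diff is the path (`analysis/rulespec.yaml` under the working directory) and the `Ok(None)` result when the file is absent.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Location of the checked-in rulespec relative to a working directory.
/// (Hypothetical helper; the real code builds this path inline.)
fn rulespec_path(working_dir: &Path) -> PathBuf {
    working_dir.join("analysis").join("rulespec.yaml")
}

/// Return the raw YAML text of the rulespec, or `Ok(None)` when the file
/// does not exist. Parsing into a typed rulespec is omitted in this sketch.
fn read_rulespec_source(working_dir: &Path) -> io::Result<Option<String>> {
    let path = rulespec_path(working_dir);
    if !path.exists() {
        // A missing file is not an error: verification is simply skipped.
        return Ok(None);
    }
    fs::read_to_string(&path).map(Some)
}
```

Treating a missing file as `Ok(None)` rather than an error matches the informational "skipping datalog verification" message in the updated `shadow_datalog_verify()`.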
@@ -1,5 +1,5 @@
|
|||||||
# Workspace Memory
|
# Workspace Memory
|
||||||
> Updated: 2026-02-06T00:59:11Z | Size: 20.2k chars
|
> Updated: 2026-02-06T04:29:34Z | Size: 21.0k chars
|
||||||
|
|
||||||
### Remember Tool Wiring
|
### Remember Tool Wiring
|
||||||
- `crates/g3-core/src/tools/memory.rs` [0..5000] - `execute_remember()`, `get_memory_path()`, `merge_memory()`
|
- `crates/g3-core/src/tools/memory.rs` [0..5000] - `execute_remember()`, `get_memory_path()`, `merge_memory()`
|
||||||
@@ -363,4 +363,14 @@ Makes tool output responsive to terminal width - no line wrapping, with 4-char r
|
|||||||
|
|
||||||
**Datalog Flow**:
|
**Datalog Flow**:
|
||||||
1. `plan_approve` → `compile_rulespec()` → saves `rulespec.compiled.json`
|
1. `plan_approve` → `compile_rulespec()` → saves `rulespec.compiled.json`
|
||||||
2. `plan_verify` → `shadow_datalog_verify()` → loads compiled + envelope → `extract_facts()` → `execute_rules()` → `eprint!()` (shadow mode)
|
2. `plan_verify` → `shadow_datalog_verify()` → loads compiled + envelope → `extract_facts()` → `execute_rules()` → `eprint!()` (shadow mode)
|
||||||
|
|
||||||
|
### Rulespec Changes (2026-02-06)
|
||||||
|
- Rulespec is no longer generated on-the-fly during `plan_write` — it's now read from `analysis/rulespec.yaml` (checked-in, hand-crafted)
|
||||||
|
- `read_rulespec()` in `invariants.rs` now takes `&Path` (working_dir) instead of `&str` (session_id)
|
||||||
|
- `write_rulespec()`, `get_rulespec_path()`, `format_rulespec_yaml()`, `format_rulespec_markdown()` removed from `invariants.rs`
|
||||||
|
- `save_compiled_rulespec()`, `load_compiled_rulespec()`, `get_compiled_rulespec_path()` removed from `datalog.rs`
|
||||||
|
- `shadow_datalog_verify()` now compiles rulespec on-the-fly at verify time, writes `rulespec.compiled.dl` and `datalog_evaluation.txt` to session dir
|
||||||
|
- `plan_write` tool no longer accepts `rulespec` parameter
|
||||||
|
- `plan_approve` no longer compiles rulespec
|
||||||
|
- `format_verification_results()` now takes `working_dir: Option<&Path>` as third parameter
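
A minimal sketch of how the optional working directory and the two session artifacts fit together, assuming only what the notes above state. The function names `effective_working_dir` and `verification_artifacts` are hypothetical, and `session_dir` stands in for whatever `get_session_logs_dir()` returns; only the fallback to the current directory and the two file names come from the change itself.

```rust
use std::path::{Path, PathBuf};

/// Resolve the directory used to look up `analysis/rulespec.yaml`.
/// Falls back to the process's current directory when none is given,
/// and to an empty path if even that cannot be determined.
fn effective_working_dir(working_dir: Option<&Path>) -> PathBuf {
    working_dir
        .map(|p| p.to_path_buf())
        .unwrap_or_else(|| std::env::current_dir().unwrap_or_default())
}

/// Paths of the two files written at verify time:
/// the compiled rules and the evaluation report.
fn verification_artifacts(session_dir: &Path) -> (PathBuf, PathBuf) {
    (
        session_dir.join("rulespec.compiled.dl"),
        session_dir.join("datalog_evaluation.txt"),
    )
}
```

Note that the fallback means a caller passing `None` gets whatever directory the process happens to be running in, so callers that know the workspace root should pass it explicitly.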
|
||||||
@@ -58,9 +58,8 @@ Short description for providers without native calling specs:
|
|||||||
- Example: {\"tool\": \"plan_read\", \"args\": {}}
|
- Example: {\"tool\": \"plan_read\", \"args\": {}}
|
||||||
|
|
||||||
- **plan_write**: Create or update the Plan with YAML content
|
- **plan_write**: Create or update the Plan with YAML content
|
||||||
- Format: {\"tool\": \"plan_write\", \"args\": {\"plan\": \"plan_id: my-plan\\nitems: [...]\", \"rulespec\": \"claims: [...]\\npredicates: [...]\"}}
|
- Format: {\"tool\": \"plan_write\", \"args\": {\"plan\": \"plan_id: my-plan\\nitems: [...]\"}}
|
||||||
- For NEW plans, rulespec is REQUIRED. For updates, it's optional.
|
- Example (new plan): {\"tool\": \"plan_write\", \"args\": {\"plan\": \"plan_id: feature-x\\nitems:\\n - id: I1\\n description: Add feature\\n state: todo\\n touches: [src/lib.rs]\\n checks:\\n happy: {desc: Works, target: lib}\\n negative:\\n - {desc: Errors, target: lib}\\n boundary:\\n - {desc: Edge, target: lib}\"}}
|
||||||
- Example (new plan): {\"tool\": \"plan_write\", \"args\": {\"plan\": \"plan_id: feature-x\\nitems:\\n - id: I1\\n description: Add feature\\n state: todo\\n touches: [src/lib.rs]\\n checks:\\n happy: {desc: Works, target: lib}\\n negative:\\n - {desc: Errors, target: lib}\\n boundary:\\n - {desc: Edge, target: lib}\", \"rulespec\": \"claims:\\n - name: feature\\n selector: feature.done\\npredicates:\\n - claim: feature\\n rule: exists\\n source: task_prompt\"}}
|
|
||||||
- Example (update): {\"tool\": \"plan_write\", \"args\": {\"plan\": \"plan_id: feature-x\\nitems:\\n - id: I1\\n state: done\\n evidence: [src/lib.rs:42]\\n notes: Implemented\"}}
|
- Example (update): {\"tool\": \"plan_write\", \"args\": {\"plan\": \"plan_id: feature-x\\nitems:\\n - id: I1\\n state: done\\n evidence: [src/lib.rs:42]\\n notes: Implemented\"}}
|
||||||
|
|
||||||
- **plan_approve**: Approve the current plan revision (called by user)
|
- **plan_approve**: Approve the current plan revision (called by user)
|
||||||
|
|||||||
@@ -192,17 +192,13 @@ fn create_core_tools() -> Vec<Tool> {
|
|||||||
|
|
||||||
tools.push(Tool {
|
tools.push(Tool {
|
||||||
name: "plan_write".to_string(),
|
name: "plan_write".to_string(),
|
||||||
description: "Create or update the Plan for this session. For NEW plans, you MUST provide both 'plan' and 'rulespec' arguments. The rulespec defines invariants (constraints that must/must not hold) extracted from the task and memory. For plan UPDATES, rulespec is optional.".to_string(),
|
description: "Create or update the Plan for this session. Provide the plan as YAML with plan_id, revision, and items array.".to_string(),
|
||||||
input_schema: json!({
|
input_schema: json!({
|
||||||
"type": "object",
|
"type": "object",
|
||||||
"properties": {
|
"properties": {
|
||||||
"plan": {
|
"plan": {
|
||||||
"type": "string",
|
"type": "string",
|
||||||
"description": "The plan as YAML. Must include plan_id and items array."
|
"description": "The plan as YAML. Must include plan_id and items array."
|
||||||
},
|
|
||||||
"rulespec": {
|
|
||||||
"type": "string",
|
|
||||||
"description": "The rulespec as YAML with claims and predicates. REQUIRED for new plans, optional for updates. Defines invariants from task_prompt and memory."
|
|
||||||
}
|
}
|
||||||
},
|
},
|
||||||
"required": ["plan"]
|
"required": ["plan"]
|
||||||
|
|||||||
@@ -6,10 +6,10 @@
|
|||||||
//!
|
//!
|
||||||
//! ## Architecture
|
//! ## Architecture
|
||||||
//!
|
//!
|
||||||
//! 1. **Compilation Phase** (on plan_approve):
|
//! 1. **Compilation Phase** (on-the-fly at plan_verify):
|
||||||
//! - Parse rulespec claims and predicates
|
//! - Parse rulespec claims and predicates
|
||||||
//! - Generate datafrog relations and rules
|
//! - Generate datafrog relations and rules
|
||||||
//! - Store compiled representation for later execution
|
//! - Rulespec is read from `analysis/rulespec.yaml`
|
||||||
//!
|
//!
|
||||||
//! 2. **Execution Phase** (on plan_verify):
|
//! 2. **Execution Phase** (on plan_verify):
|
||||||
//! - Extract facts from action envelope using selectors
|
//! - Extract facts from action envelope using selectors
|
||||||
@@ -34,7 +34,6 @@ use datafrog::{Iteration, Relation};
|
|||||||
use serde::{Deserialize, Serialize};
|
use serde::{Deserialize, Serialize};
|
||||||
use serde_yaml::Value as YamlValue;
|
use serde_yaml::Value as YamlValue;
|
||||||
use std::collections::{HashMap, HashSet};
|
use std::collections::{HashMap, HashSet};
|
||||||
use std::path::PathBuf;
|
|
||||||
|
|
||||||
use super::invariants::{
|
use super::invariants::{
|
||||||
ActionEnvelope, InvariantSource, PredicateRule, Rulespec, Selector,
|
ActionEnvelope, InvariantSource, PredicateRule, Rulespec, Selector,
|
||||||
@@ -42,7 +41,6 @@ use super::invariants::{
|
|||||||
#[cfg(test)]
|
#[cfg(test)]
|
||||||
use super::invariants::{Claim, Predicate};
|
use super::invariants::{Claim, Predicate};
|
||||||
|
|
||||||
use crate::paths::get_session_logs_dir;
|
|
||||||
|
|
||||||
// ============================================================================
|
// ============================================================================
|
||||||
// Compiled Datalog Representation
|
// Compiled Datalog Representation
|
||||||
@@ -537,33 +535,6 @@ fn evaluate_predicate_datalog(
|
|||||||
}
|
}
|
||||||
|
|
||||||
// ============================================================================
|
// ============================================================================
|
||||||
// Storage
|
|
||||||
// ============================================================================
|
|
||||||
|
|
||||||
/// Get the path to the compiled rulespec file for a session.
|
|
||||||
pub fn get_compiled_rulespec_path(session_id: &str) -> PathBuf {
|
|
||||||
get_session_logs_dir(session_id).join("rulespec.compiled.json")
|
|
||||||
}
|
|
||||||
|
|
||||||
/// Save a compiled rulespec to disk.
|
|
||||||
pub fn save_compiled_rulespec(session_id: &str, compiled: &CompiledRulespec) -> Result<()> {
|
|
||||||
let path = get_compiled_rulespec_path(session_id);
|
|
||||||
let json = serde_json::to_string_pretty(compiled)?;
|
|
||||||
std::fs::write(&path, json)?;
|
|
||||||
Ok(())
|
|
||||||
}
|
|
||||||
|
|
||||||
/// Load a compiled rulespec from disk.
|
|
||||||
pub fn load_compiled_rulespec(session_id: &str) -> Result<Option<CompiledRulespec>> {
|
|
||||||
let path = get_compiled_rulespec_path(session_id);
|
|
||||||
if !path.exists() {
|
|
||||||
return Ok(None);
|
|
||||||
}
|
|
||||||
let json = std::fs::read_to_string(&path)?;
|
|
||||||
let compiled: CompiledRulespec = serde_json::from_str(&json)?;
|
|
||||||
Ok(Some(compiled))
|
|
||||||
}
|
|
||||||
|
|
||||||
// ============================================================================
|
// ============================================================================
|
||||||
// Formatting
|
// Formatting
|
||||||
// ============================================================================
|
// ============================================================================
|
||||||
|
|||||||
@@ -4,15 +4,15 @@
|
|||||||
//! - **Rulespec**: Machine-readable invariants with claims and predicates
|
//! - **Rulespec**: Machine-readable invariants with claims and predicates
|
||||||
//! - **ActionEnvelope**: Evidence of work done (facts about completed work)
|
//! - **ActionEnvelope**: Evidence of work done (facts about completed work)
|
||||||
//!
|
//!
|
||||||
//! The rulespec is written as the penultimate step in a plan, and the
|
//! The rulespec is checked into `analysis/rulespec.yaml` and read at
|
||||||
//! action envelope is written as the final step. Together they enable
|
//! plan verification time. The action envelope is written per-session
|
||||||
//! verification that invariants extracted from the task prompt and
|
//! and verified against the rulespec.
|
||||||
//! workspace memory are satisfied by the completed work.
|
|
||||||
|
|
||||||
use anyhow::{anyhow, Result};
|
use anyhow::{anyhow, Result};
|
||||||
use serde::{Deserialize, Serialize};
|
use serde::{Deserialize, Serialize};
|
||||||
use serde_yaml::Value as YamlValue;
|
use serde_yaml::Value as YamlValue;
|
||||||
use std::collections::HashMap;
|
use std::collections::HashMap;
|
||||||
|
use std::path::Path;
|
||||||
use std::path::PathBuf;
|
use std::path::PathBuf;
|
||||||
|
|
||||||
use crate::paths::get_session_logs_dir;
|
use crate::paths::get_session_logs_dir;
|
||||||
@@ -685,19 +685,14 @@ fn yaml_to_display(value: &YamlValue) -> String {
|
|||||||
// File Storage
|
// File Storage
|
||||||
// ============================================================================
|
// ============================================================================
|
||||||
|
|
||||||
/// Get the path to the rulespec.yaml file for a session.
|
|
||||||
pub fn get_rulespec_path(session_id: &str) -> PathBuf {
|
|
||||||
get_session_logs_dir(session_id).join("rulespec.yaml")
|
|
||||||
}
|
|
||||||
|
|
||||||
/// Get the path to the envelope.yaml file for a session.
|
/// Get the path to the envelope.yaml file for a session.
|
||||||
pub fn get_envelope_path(session_id: &str) -> PathBuf {
|
pub fn get_envelope_path(session_id: &str) -> PathBuf {
|
||||||
get_session_logs_dir(session_id).join("envelope.yaml")
|
get_session_logs_dir(session_id).join("envelope.yaml")
|
||||||
}
|
}
|
||||||
|
|
||||||
/// Read a rulespec from the session's rulespec.yaml file.
|
/// Read a rulespec from `analysis/rulespec.yaml` relative to the working directory.
|
||||||
pub fn read_rulespec(session_id: &str) -> Result<Option<Rulespec>> {
|
pub fn read_rulespec(working_dir: &Path) -> Result<Option<Rulespec>> {
|
||||||
let path = get_rulespec_path(session_id);
|
let path = working_dir.join("analysis").join("rulespec.yaml");
|
||||||
if !path.exists() {
|
if !path.exists() {
|
||||||
return Ok(None);
|
return Ok(None);
|
||||||
}
|
}
|
||||||
@@ -707,16 +702,6 @@ pub fn read_rulespec(session_id: &str) -> Result<Option<Rulespec>> {
|
|||||||
Ok(Some(rulespec))
|
Ok(Some(rulespec))
|
||||||
}
|
}
|
||||||
|
|
||||||
/// Write a rulespec to the session's rulespec.yaml file.
|
|
||||||
pub fn write_rulespec(session_id: &str, rulespec: &Rulespec) -> Result<()> {
|
|
||||||
rulespec.validate()?;
|
|
||||||
|
|
||||||
let path = get_rulespec_path(session_id);
|
|
||||||
let content = format_rulespec_yaml(rulespec);
|
|
||||||
std::fs::write(&path, content)?;
|
|
||||||
Ok(())
|
|
||||||
}
|
|
||||||
|
|
||||||
/// Read an action envelope from the session's envelope.yaml file.
|
/// Read an action envelope from the session's envelope.yaml file.
|
||||||
pub fn read_envelope(session_id: &str) -> Result<Option<ActionEnvelope>> {
|
pub fn read_envelope(session_id: &str) -> Result<Option<ActionEnvelope>> {
|
||||||
let path = get_envelope_path(session_id);
|
let path = get_envelope_path(session_id);
|
||||||
@@ -737,19 +722,6 @@ pub fn write_envelope(session_id: &str, envelope: &ActionEnvelope) -> Result<()>
|
|||||||
Ok(())
|
Ok(())
|
||||||
}
|
}
|
||||||
|
|
||||||
/// Format a rulespec as pretty YAML with comments.
|
|
||||||
fn format_rulespec_yaml(rulespec: &Rulespec) -> String {
|
|
||||||
let mut output = String::new();
|
|
||||||
output.push_str("# Rulespec - Machine-readable invariants\n");
|
|
||||||
output.push_str("# Generated by g3 Plan Mode\n\n");
|
|
||||||
|
|
||||||
let yaml = serde_yaml::to_string(rulespec)
|
|
||||||
.unwrap_or_else(|_| "# Error serializing rulespec".to_string());
|
|
||||||
output.push_str(&yaml);
|
|
||||||
|
|
||||||
output
|
|
||||||
}
|
|
||||||
|
|
||||||
/// Format an action envelope as pretty YAML with comments.
|
/// Format an action envelope as pretty YAML with comments.
|
||||||
fn format_envelope_yaml(envelope: &ActionEnvelope) -> String {
|
fn format_envelope_yaml(envelope: &ActionEnvelope) -> String {
|
||||||
let mut output = String::new();
|
let mut output = String::new();
|
||||||
@@ -903,77 +875,6 @@ pub fn format_evaluation_results(eval: &RulespecEvaluation) -> String {
|
|||||||
output
|
output
|
||||||
}
|
}
|
||||||
|
|
||||||
/// Format a rulespec as human-readable markdown.
|
|
||||||
///
|
|
||||||
/// This produces a rich, readable format suitable for tool output,
|
|
||||||
/// not raw YAML.
|
|
||||||
pub fn format_rulespec_markdown(rulespec: &Rulespec) -> String {
|
|
||||||
let mut output = String::new();
|
|
||||||
|
|
||||||
output.push_str("\n");
|
|
||||||
output.push_str("### Invariants (Rulespec)\n\n");
|
|
||||||
|
|
||||||
if rulespec.claims.is_empty() && rulespec.predicates.is_empty() {
|
|
||||||
output.push_str("_No invariants defined._\n");
|
|
||||||
return output;
|
|
||||||
}
|
|
||||||
|
|
||||||
// Group predicates by source
|
|
||||||
let task_predicates: Vec<_> = rulespec.predicates.iter()
|
|
||||||
.filter(|p| p.source == InvariantSource::TaskPrompt)
|
|
||||||
.collect();
|
|
||||||
let memory_predicates: Vec<_> = rulespec.predicates.iter()
|
|
||||||
.filter(|p| p.source == InvariantSource::Memory)
|
|
||||||
.collect();
|
|
||||||
|
|
||||||
// Build claim lookup for selector display
|
|
||||||
let claims: std::collections::HashMap<&str, &Claim> = rulespec.claims.iter()
|
|
||||||
.map(|c| (c.name.as_str(), c))
|
|
||||||
.collect();
|
|
||||||
|
|
||||||
// Format predicates from task prompt
|
|
||||||
if !task_predicates.is_empty() {
|
|
||||||
output.push_str("**From Task:**\n");
|
|
||||||
for pred in &task_predicates {
|
|
||||||
format_predicate_markdown(&mut output, pred, &claims);
|
|
||||||
}
|
|
||||||
output.push_str("\n");
|
|
||||||
}
|
|
||||||
|
|
||||||
// Format predicates from memory
|
|
||||||
if !memory_predicates.is_empty() {
|
|
||||||
output.push_str("**From Memory:**\n");
|
|
||||||
for pred in &memory_predicates {
|
|
||||||
format_predicate_markdown(&mut output, pred, &claims);
|
|
||||||
}
|
|
||||||
output.push_str("\n");
|
|
||||||
}
|
|
||||||
|
|
||||||
output
|
|
||||||
}
|
|
||||||
|
|
||||||
/// Format a single predicate as a markdown list item.
|
|
||||||
fn format_predicate_markdown(
|
|
||||||
output: &mut String,
|
|
||||||
pred: &Predicate,
|
|
||||||
claims: &std::collections::HashMap<&str, &Claim>,
|
|
||||||
) {
|
|
||||||
let selector = claims.get(pred.claim.as_str())
|
|
||||||
.map(|c| c.selector.as_str())
|
|
||||||
.unwrap_or(&pred.claim);
|
|
||||||
|
|
||||||
let value_str = match &pred.value {
|
|
||||||
Some(v) => format!(" `{}`", yaml_to_display(v)),
|
|
||||||
None => String::new(),
|
|
||||||
};
|
|
||||||
|
|
||||||
output.push_str(&format!("- `{}` **{}**{}\n", selector, pred.rule, value_str));
|
|
||||||
|
|
||||||
if let Some(notes) = &pred.notes {
|
|
||||||
output.push_str(&format!(" - _{}_\n", notes));
|
|
||||||
}
|
|
||||||
}
|
|
||||||
|
|
||||||
/// Format an action envelope as human-readable markdown.
|
/// Format an action envelope as human-readable markdown.
|
||||||
///
|
///
|
||||||
/// This produces a rich, readable format suitable for tool output,
|
/// This produces a rich, readable format suitable for tool output,
|
||||||
@@ -1416,57 +1317,7 @@ mod tests {
|
|||||||
|
|
||||||
// ========================================================================
|
// ========================================================================
|
||||||
// Format Rulespec Markdown Tests
|
// Format Rulespec Markdown Tests
|
||||||
// ========================================================================
|
// ========================================================================
|
||||||
|
|
||||||
#[test]
|
|
||||||
fn test_format_rulespec_markdown_empty() {
|
|
||||||
let rulespec = Rulespec::new();
|
|
||||||
let output = format_rulespec_markdown(&rulespec);
|
|
||||||
|
|
||||||
assert!(output.contains("### Invariants (Rulespec)"));
|
|
||||||
assert!(output.contains("_No invariants defined._"));
|
|
||||||
}
|
|
||||||
|
|
||||||
#[test]
|
|
||||||
fn test_format_rulespec_markdown_with_predicates() {
|
|
||||||
let mut rulespec = Rulespec::new();
|
|
||||||
rulespec.add_claim(Claim::new("caps", "csv_importer.capabilities"));
|
|
||||||
rulespec.add_predicate(
|
|
||||||
Predicate::new("caps", PredicateRule::Contains, InvariantSource::TaskPrompt)
|
|
||||||
.with_value(YamlValue::String("handle_tsv".to_string()))
|
|
||||||
.with_notes("User requested TSV support")
|
|
||||||
);
|
|
||||||
rulespec.add_predicate(
|
|
||||||
Predicate::new("caps", PredicateRule::Exists, InvariantSource::Memory)
|
|
||||||
);
|
|
||||||
|
|
||||||
let output = format_rulespec_markdown(&rulespec);
|
|
||||||
|
|
||||||
assert!(output.contains("### Invariants (Rulespec)"));
|
|
||||||
assert!(output.contains("**From Task:**"));
|
|
||||||
assert!(output.contains("**From Memory:**"));
|
|
||||||
assert!(output.contains("`csv_importer.capabilities`"));
|
|
||||||
assert!(output.contains("**contains**"));
|
|
||||||
assert!(output.contains("`handle_tsv`"));
|
|
||||||
assert!(output.contains("_User requested TSV support_"));
|
|
||||||
assert!(output.contains("**exists**"));
|
|
||||||
}
|
|
||||||
|
|
||||||
#[test]
|
|
||||||
fn test_format_rulespec_markdown_task_only() {
|
|
||||||
let mut rulespec = Rulespec::new();
|
|
||||||
rulespec.add_claim(Claim::new("test", "foo.bar"));
|
|
||||||
rulespec.add_predicate(
|
|
||||||
Predicate::new("test", PredicateRule::Exists, InvariantSource::TaskPrompt)
|
|
||||||
);
|
|
||||||
|
|
||||||
let output = format_rulespec_markdown(&rulespec);
|
|
||||||
|
|
||||||
assert!(output.contains("**From Task:**"));
|
|
||||||
assert!(!output.contains("**From Memory:**"));
|
|
||||||
}
|
|
||||||
|
|
||||||
// ========================================================================
|
|
||||||
// Format Envelope Markdown Tests
|
// Format Envelope Markdown Tests
|
||||||
// ========================================================================
|
// ========================================================================
|
||||||
|
|
||||||
|
|||||||
@@ -20,9 +20,10 @@ use crate::ToolCall;
|
|||||||
|
|
||||||
use super::executor::ToolContext;
|
use super::executor::ToolContext;
|
||||||
|
|
||||||
use super::invariants::{format_envelope_markdown, format_rulespec_markdown, get_envelope_path, get_rulespec_path, read_envelope, read_rulespec, write_rulespec, Rulespec};
|
use std::path::Path;
|
||||||
use super::datalog::{compile_rulespec, save_compiled_rulespec, format_datalog_results};
|
use super::invariants::{format_envelope_markdown, get_envelope_path, read_envelope, read_rulespec};
|
||||||
use super::datalog::{load_compiled_rulespec, extract_facts, execute_rules};
|
use super::datalog::{compile_rulespec, format_datalog_results};
|
||||||
|
use super::datalog::{extract_facts, execute_rules};
|
||||||
|
|
||||||
// ============================================================================
|
// ============================================================================
|
||||||
// Plan Schema
|
// Plan Schema
|
||||||
@@ -713,22 +714,31 @@ pub fn plan_verify(plan: &Plan, working_dir: Option<&str>) -> PlanVerification {
|
|||||||
/// Shadow datalog verification - runs datalog rules and writes to evaluation file.
|
/// Shadow datalog verification - runs datalog rules and writes to evaluation file.
|
||||||
/// This is for dry-run/shadow testing - results are written to
|
/// This is for dry-run/shadow testing - results are written to
|
||||||
/// `.g3/sessions/<id>/datalog_evaluation.txt`, NOT injected into context window.
|
/// `.g3/sessions/<id>/datalog_evaluation.txt`, NOT injected into context window.
|
||||||
fn shadow_datalog_verify(session_id: &str) {
|
fn shadow_datalog_verify(session_id: &str, working_dir: &Path) {
|
||||||
// Load compiled rulespec
|
// Read rulespec from analysis/rulespec.yaml
|
||||||
let compiled = match load_compiled_rulespec(session_id) {
|
let rulespec = match read_rulespec(working_dir) {
|
||||||
Ok(Some(c)) => c,
|
Ok(Some(rs)) => rs,
|
||||||
Ok(None) => {
|
Ok(None) => {
|
||||||
eprintln!("\n⚠️ [SHADOW] No compiled rulespec found - skipping datalog verification");
|
eprintln!("\nℹ️ No analysis/rulespec.yaml found - skipping datalog verification");
|
||||||
return;
|
return;
|
||||||
}
|
}
|
||||||
Err(e) => {
|
Err(e) => {
|
||||||
eprintln!("\n⚠️ [SHADOW] Failed to load compiled rulespec: {}", e);
|
eprintln!("\n⚠️ Failed to read analysis/rulespec.yaml: {}", e);
|
||||||
|
return;
|
||||||
|
}
|
||||||
|
};
|
||||||
|
|
||||||
|
// Compile rulespec on-the-fly
|
||||||
|
let compiled = match compile_rulespec(&rulespec, "plan-verify", 0) {
|
||||||
|
Ok(c) => c,
|
||||||
|
Err(e) => {
|
||||||
|
eprintln!("\n⚠️ Failed to compile rulespec: {}", e);
|
||||||
return;
|
return;
|
||||||
}
|
}
|
||||||
};
|
};
|
||||||
|
|
||||||
if compiled.is_empty() {
|
if compiled.is_empty() {
|
||||||
eprintln!("\n⚠️ [SHADOW] Compiled rulespec has no predicates - skipping datalog verification");
|
eprintln!("\nℹ️ Rulespec has no predicates - skipping datalog verification");
|
||||||
return;
|
return;
|
||||||
}
|
}
|
||||||
|
|
||||||
@@ -736,11 +746,11 @@ fn shadow_datalog_verify(session_id: &str) {
|
|||||||
let envelope = match read_envelope(session_id) {
|
let envelope = match read_envelope(session_id) {
|
||||||
Ok(Some(e)) => e,
|
Ok(Some(e)) => e,
|
||||||
Ok(None) => {
|
Ok(None) => {
|
||||||
eprintln!("\n⚠️ [SHADOW] No envelope found - skipping datalog verification");
|
eprintln!("\n⚠️ No envelope found - skipping datalog verification");
|
||||||
return;
|
return;
|
||||||
}
|
}
|
||||||
Err(e) => {
|
Err(e) => {
|
||||||
eprintln!("\n⚠️ [SHADOW] Failed to load envelope: {}", e);
|
eprintln!("\n⚠️ Failed to load envelope: {}", e);
|
||||||
return;
|
return;
|
||||||
}
|
}
|
||||||
};
|
};
|
||||||
@@ -754,11 +764,21 @@ fn shadow_datalog_verify(session_id: &str) {
|
|||||||
// Format results
|
// Format results
|
||||||
let output = format_datalog_results(&result);
|
let output = format_datalog_results(&result);
|
||||||
|
|
||||||
// Write to evaluation file (shadow mode - not in context window)
|
let session_dir = get_session_logs_dir(session_id);
|
||||||
let eval_path = get_session_logs_dir(session_id).join("datalog_evaluation.txt");
|
|
||||||
|
// Write compiled rules to .dl file
|
||||||
|
let dl_path = session_dir.join("rulespec.compiled.dl");
|
||||||
|
let compiled_yaml = serde_yaml::to_string(&compiled).unwrap_or_default();
|
||||||
|
if let Err(e) = std::fs::write(&dl_path, &compiled_yaml) {
|
||||||
|
eprintln!("⚠️ Failed to write compiled rules: {}", e);
|
||||||
|
}
|
||||||
|
|
||||||
|
// Write evaluation report
|
||||||
|
let eval_path = session_dir.join("datalog_evaluation.txt");
|
||||||
match std::fs::write(&eval_path, &output) {
|
match std::fs::write(&eval_path, &output) {
|
||||||
Ok(_) => {
|
Ok(_) => {
|
||||||
eprintln!("📊 Datalog evaluation written to: {}", eval_path.display());
|
eprintln!("📊 Compiled rules: {}", dl_path.display());
|
||||||
|
eprintln!("📊 Evaluation report: {}", eval_path.display());
|
||||||
}
|
}
|
||||||
Err(e) => {
|
Err(e) => {
|
||||||
eprintln!("⚠️ Failed to write datalog evaluation: {}", e);
|
eprintln!("⚠️ Failed to write datalog evaluation: {}", e);
|
||||||
@@ -768,8 +788,8 @@ fn shadow_datalog_verify(session_id: &str) {
|
|||||||
|
|
||||||
/// Format verification results as a string for display.
|
/// Format verification results as a string for display.
|
||||||
/// Uses loud formatting for warnings and errors.
|
/// Uses loud formatting for warnings and errors.
|
||||||
/// If session_id is provided, also prints rulespec and envelope file locations.
|
/// If session_id is provided, also prints envelope file location and runs datalog verification.
|
||||||
pub fn format_verification_results(verification: &PlanVerification, session_id: Option<&str>) -> String {
|
pub fn format_verification_results(verification: &PlanVerification, session_id: Option<&str>, working_dir: Option<&Path>) -> String {
|
||||||
let mut output = String::new();
|
let mut output = String::new();
|
||||||
let (warnings, errors) = verification.count_issues();
|
let (warnings, errors) = verification.count_issues();
|
||||||
|
|
||||||
@@ -810,24 +830,22 @@ pub fn format_verification_results(verification: &PlanVerification, session_id:
|
|||||||
output.push_str("✅ VERIFICATION COMPLETE: All evidence validated\n");
|
output.push_str("✅ VERIFICATION COMPLETE: All evidence validated\n");
|
||||||
}
|
}
|
||||||
|
|
||||||
// Print rulespec and envelope locations if session_id provided
|
// Print envelope location and run datalog verification if session_id provided
|
||||||
if let Some(sid) = session_id {
|
if let Some(sid) = session_id {
|
||||||
output.push_str("\n");
|
output.push_str("\n");
|
||||||
output.push_str("📜 INVARIANTS\n");
|
output.push_str("📜 ARTIFACTS\n");
|
||||||
|
|
||||||
let rulespec_path = get_rulespec_path(sid);
|
|
||||||
let envelope_path = get_envelope_path(sid);
|
let envelope_path = get_envelope_path(sid);
|
||||||
|
|
||||||
let rulespec_status = if rulespec_path.exists() { "✅" } else { "⚠️ (not found)" };
|
|
||||||
let envelope_status = if envelope_path.exists() { "✅" } else { "⚠️ (not found)" };
|
let envelope_status = if envelope_path.exists() { "✅" } else { "⚠️ (not found)" };
|
||||||
|
|
||||||
output.push_str(&format!(" {} Rulespec: {}\n", rulespec_status, rulespec_path.display()));
|
|
||||||
output.push_str(&format!(" {} Envelope: {}\n", envelope_status, envelope_path.display()));
|
output.push_str(&format!(" {} Envelope: {}\n", envelope_status, envelope_path.display()));
|
||||||
|
|
||||||
output.push_str("\n");
|
output.push_str("\n");
|
||||||
|
|
||||||
// Shadow datalog verification - print to stderr, NOT included in tool output
|
// Shadow datalog verification - print to stderr, NOT included in tool output
|
||||||
shadow_datalog_verify(sid);
|
let effective_wd = working_dir
|
||||||
|
.map(|p| p.to_path_buf())
|
||||||
|
.unwrap_or_else(|| std::env::current_dir().unwrap_or_default());
|
||||||
|
shadow_datalog_verify(sid, &effective_wd);
|
||||||
}
|
}
|
||||||
|
|
||||||
output.push_str(&"═".repeat(60));
|
output.push_str(&"═".repeat(60));
|
||||||
@@ -867,12 +885,6 @@ pub async fn execute_plan_read<W: UiWriter>(
|
|||||||
yaml
|
yaml
|
||||||
);
|
);
|
||||||
|
|
||||||
// Append rulespec if present
|
|
||||||
match read_rulespec(session_id) {
|
|
||||||
Ok(Some(rulespec)) => output.push_str(&format_rulespec_markdown(&rulespec)),
|
|
||||||
_ => output.push_str("\n\n_No rulespec generated._\n"),
|
|
||||||
}
|
|
||||||
|
|
||||||
// Append envelope if present
|
// Append envelope if present
|
||||||
match read_envelope(session_id) {
|
match read_envelope(session_id) {
|
||||||
Ok(Some(envelope)) => output.push_str(&format_envelope_markdown(&envelope)),
|
Ok(Some(envelope)) => output.push_str(&format_envelope_markdown(&envelope)),
|
||||||
@@ -906,9 +918,6 @@ pub async fn execute_plan_write<W: UiWriter>(
|
|||||||
None => return Ok("❌ Missing 'plan' argument. Provide the plan as YAML.".to_string()),
|
None => return Ok("❌ Missing 'plan' argument. Provide the plan as YAML.".to_string()),
|
||||||
};
|
};
|
||||||
|
|
||||||
// Get optional rulespec content from args
|
|
||||||
let rulespec_yaml = tool_call.args.get("rulespec").and_then(|v| v.as_str());
|
|
||||||
|
|
||||||
// Parse the YAML
|
// Parse the YAML
|
||||||
let mut plan: Plan = match serde_yaml::from_str(plan_yaml) {
|
let mut plan: Plan = match serde_yaml::from_str(plan_yaml) {
|
||||||
Ok(p) => p,
|
Ok(p) => p,
|
||||||
@@ -917,44 +926,6 @@ pub async fn execute_plan_write<W: UiWriter>(
|
|||||||
|
|
||||||
// Load existing plan to check if this is a new plan or an update
|
// Load existing plan to check if this is a new plan or an update
|
||||||
let existing_plan = read_plan(session_id)?;
|
let existing_plan = read_plan(session_id)?;
|
||||||
let is_new_plan = existing_plan.is_none();
|
|
||||||
|
|
||||||
// For NEW plans, rulespec is REQUIRED
|
|
||||||
// This prevents the tautology problem where invariants are written after implementation
|
|
||||||
if is_new_plan && rulespec_yaml.is_none() {
|
|
||||||
return Ok("❌ Missing 'rulespec' argument. New plans MUST include a rulespec with invariants.\n\n\
|
|
||||||
The rulespec defines constraints that MUST or MUST NOT hold, extracted from:\n\
|
|
||||||
- task_prompt: What the user explicitly requires\n\
|
|
||||||
- memory: Persistent rules from workspace memory\n\n\
|
|
||||||
Example rulespec:\n\
|
|
||||||
```yaml\n\
|
|
||||||
claims:\n\
|
|
||||||
- name: feature_capabilities\n\
|
|
||||||
selector: \"feature.capabilities\"\n\
|
|
||||||
predicates:\n\
|
|
||||||
- claim: feature_capabilities\n\
|
|
||||||
rule: contains\n\
|
|
||||||
value: \"required_feature\"\n\
|
|
||||||
source: task_prompt\n\
|
|
||||||
notes: \"User explicitly requested this\"\n\
|
|
||||||
```".to_string());
|
|
||||||
}
|
|
||||||
|
|
||||||
// Parse and validate rulespec if provided
|
|
||||||
let rulespec: Option<Rulespec> = if let Some(yaml) = rulespec_yaml {
|
|
||||||
match serde_yaml::from_str(yaml) {
|
|
||||||
Ok(r) => {
|
|
||||||
let rs: Rulespec = r;
|
|
||||||
if let Err(e) = rs.validate() {
|
|
||||||
return Ok(format!("❌ Invalid rulespec: {}", e));
|
|
||||||
}
|
|
||||||
Some(rs)
|
|
||||||
}
|
|
||||||
Err(e) => return Ok(format!("❌ Invalid rulespec YAML: {}", e)),
|
|
||||||
}
|
|
||||||
} else {
|
|
||||||
None
|
|
||||||
};
|
|
||||||
|
|
||||||
if let Some(existing) = existing_plan {
|
if let Some(existing) = existing_plan {
|
||||||
// Preserve approved_revision from existing plan
|
// Preserve approved_revision from existing plan
|
||||||
@@ -992,25 +963,12 @@ pub async fn execute_plan_write<W: UiWriter>(
|
|||||||
return Ok(format!("❌ Failed to write plan: {}", e));
|
return Ok(format!("❌ Failed to write plan: {}", e));
|
||||||
}
|
}
|
||||||
|
|
||||||
// Write the rulespec if provided (atomically with plan)
|
|
||||||
if let Some(ref rs) = rulespec {
|
|
||||||
if let Err(e) = write_rulespec(session_id, rs) {
|
|
||||||
return Ok(format!("❌ Failed to write rulespec: {}", e));
|
|
||||||
}
|
|
||||||
}
|
|
||||||
|
|
||||||
// Display the plan in compact format
|
// Display the plan in compact format
|
||||||
let plan_path = get_plan_path(session_id);
|
let plan_path = get_plan_path(session_id);
|
||||||
let plan_path_str = plan_path.to_string_lossy().to_string();
|
let plan_path_str = plan_path.to_string_lossy().to_string();
|
||||||
let yaml = serde_yaml::to_string(&plan)?;
|
let yaml = serde_yaml::to_string(&plan)?;
|
||||||
ctx.ui_writer.print_plan_compact(Some(&yaml), Some(&plan_path_str), true);
|
ctx.ui_writer.print_plan_compact(Some(&yaml), Some(&plan_path_str), true);
|
||||||
|
|
||||||
// Format rulespec section - use provided rulespec or read from disk
|
|
||||||
let rulespec_section = match rulespec.as_ref().or(read_rulespec(session_id).ok().flatten().as_ref()) {
|
|
||||||
Some(rs) => format_rulespec_markdown(rs),
|
|
||||||
None => "\n_No rulespec defined._\n".to_string(),
|
|
||||||
};
|
|
||||||
|
|
||||||
// Read and format envelope if it exists
|
// Read and format envelope if it exists
|
||||||
let envelope_section = match read_envelope(session_id) {
|
let envelope_section = match read_envelope(session_id) {
|
||||||
Ok(Some(envelope)) => format_envelope_markdown(&envelope),
|
Ok(Some(envelope)) => format_envelope_markdown(&envelope),
|
||||||
@@ -1021,20 +979,18 @@ pub async fn execute_plan_write<W: UiWriter>(
|
|||||||
// Check if plan is now complete and trigger verification
|
// Check if plan is now complete and trigger verification
|
||||||
if plan.is_complete() && plan.is_approved() {
|
if plan.is_complete() && plan.is_approved() {
|
||||||
let verification = plan_verify(&plan, ctx.working_dir);
|
let verification = plan_verify(&plan, ctx.working_dir);
|
||||||
let verification_output = format_verification_results(&verification, ctx.session_id);
|
let verification_output = format_verification_results(&verification, ctx.session_id, ctx.working_dir.map(std::path::Path::new));
|
||||||
return Ok(format!(
|
return Ok(format!(
|
||||||
"✅ Plan updated: {}\n{}\n{}\n{}",
|
"✅ Plan updated: {}\n{}\n{}",
|
||||||
plan.status_summary(),
|
plan.status_summary(),
|
||||||
verification_output,
|
verification_output,
|
||||||
rulespec_section,
|
|
||||||
envelope_section
|
envelope_section
|
||||||
));
|
));
|
||||||
}
|
}
|
||||||
|
|
||||||
Ok(format!(
|
Ok(format!(
|
||||||
"✅ Plan updated: {}\n{}\n{}",
|
"✅ Plan updated: {}\n{}",
|
||||||
plan.status_summary(),
|
plan.status_summary(),
|
||||||
rulespec_section,
|
|
||||||
envelope_section
|
envelope_section
|
||||||
))
|
))
|
||||||
}
|
}
|
||||||
@@ -1068,43 +1024,14 @@ pub async fn execute_plan_approve<W: UiWriter>(
|
|||||||
// Approve the plan
|
// Approve the plan
|
||||||
plan.approve();
|
plan.approve();
|
||||||
|
|
||||||
// Compile rulespec to datalog on approval
|
|
||||||
let compile_message;
|
|
||||||
match read_rulespec(session_id) {
|
|
||||||
Ok(Some(rulespec)) => {
|
|
||||||
match compile_rulespec(&rulespec, &plan.plan_id, plan.revision) {
|
|
||||||
Ok(compiled) => {
|
|
||||||
if let Err(e) = save_compiled_rulespec(session_id, &compiled) {
|
|
||||||
compile_message = format!("\n⚠️ Failed to save compiled rulespec: {}", e);
|
|
||||||
} else {
|
|
||||||
compile_message = format!(
|
|
||||||
"\n📜 Compiled {} invariant(s) to datalog rules.",
|
|
||||||
compiled.predicates.len()
|
|
||||||
);
|
|
||||||
}
|
|
||||||
}
|
|
||||||
Err(e) => {
|
|
||||||
compile_message = format!("\n⚠️ Failed to compile rulespec: {}", e);
|
|
||||||
}
|
|
||||||
}
|
|
||||||
}
|
|
||||||
Ok(None) => {
|
|
||||||
compile_message = "\n⚠️ No rulespec found - datalog verification will be skipped.".to_string();
|
|
||||||
}
|
|
||||||
Err(e) => {
|
|
||||||
compile_message = format!("\n⚠️ Failed to read rulespec: {}", e);
|
|
||||||
}
|
|
||||||
}
|
|
||||||
|
|
||||||
// Write back
|
// Write back
|
||||||
if let Err(e) = write_plan(session_id, &plan) {
|
if let Err(e) = write_plan(session_id, &plan) {
|
||||||
return Ok(format!("❌ Failed to save approved plan: {}", e));
|
return Ok(format!("❌ Failed to save approved plan: {}", e));
|
||||||
}
|
}
|
||||||
|
|
||||||
Ok(format!(
|
Ok(format!(
|
||||||
"✅ Plan approved at revision {}. You may now begin implementation.{}",
|
"✅ Plan approved at revision {}. You may now begin implementation.",
|
||||||
plan.revision,
|
plan.revision
|
||||||
compile_message
|
|
||||||
))
|
))
|
||||||
}
|
}
|
||||||
|
|
||||||
|
|||||||
@@ -622,14 +622,6 @@ items:
|
|||||||
- desc: Edge cases
|
- desc: Edge cases
|
||||||
target: test::module"#
|
target: test::module"#
|
||||||
,
|
,
|
||||||
"rulespec": r#"claims:
|
|
||||||
- name: test_feature
|
|
||||||
selector: test.done
|
|
||||||
predicates:
|
|
||||||
- claim: test_feature
|
|
||||||
rule: exists
|
|
||||||
source: task_prompt
|
|
||||||
notes: Test invariant"#
|
|
||||||
}),
|
}),
|
||||||
};
|
};
|
||||||
let write_result = agent.execute_tool(&write_call).await.unwrap();
|
let write_result = agent.execute_tool(&write_call).await.unwrap();
|
||||||
|
|||||||
@@ -425,14 +425,6 @@ items:
|
|||||||
boundary:
|
boundary:
|
||||||
- desc: Edge
|
- desc: Edge
|
||||||
target: test"#
|
target: test"#
|
||||||
,
|
|
||||||
"rulespec": r#"claims:
|
|
||||||
- name: test_feature
|
|
||||||
selector: test.done
|
|
||||||
predicates:
|
|
||||||
- claim: test_feature
|
|
||||||
rule: exists
|
|
||||||
source: task_prompt"#
|
|
||||||
}),
|
}),
|
||||||
);
|
);
|
||||||
|
|
||||||
@@ -487,14 +479,6 @@ items:
|
|||||||
happy: {desc: Works, target: test}
|
happy: {desc: Works, target: test}
|
||||||
negative: [{desc: Errors, target: test}]
|
negative: [{desc: Errors, target: test}]
|
||||||
boundary: [{desc: Edge, target: test}]"#
|
boundary: [{desc: Edge, target: test}]"#
|
||||||
,
|
|
||||||
"rulespec": r#"claims:
|
|
||||||
- name: approval_test
|
|
||||||
selector: test.approved
|
|
||||||
predicates:
|
|
||||||
- claim: approval_test
|
|
||||||
rule: exists
|
|
||||||
source: task_prompt"#
|
|
||||||
}),
|
}),
|
||||||
);
|
);
|
||||||
agent.execute_tool(&write_call).await.unwrap();
|
agent.execute_tool(&write_call).await.unwrap();
|
||||||
@@ -507,3 +491,214 @@ predicates:
|
|||||||
"Should approve plan: {}", result);
|
"Should approve plan: {}", result);
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
|
|
||||||
|
|
||||||
|
// =============================================================================
|
||||||
|
// Test: plan_verify with analysis/rulespec.yaml datalog integration
|
||||||
|
// =============================================================================
|
||||||
|
|
||||||
|
mod plan_verify_datalog_integration {
|
||||||
|
use super::*;
|
||||||
|
|
||||||
|
/// Helper: write a complete plan, approve it, and set up envelope.
|
||||||
|
/// Returns the actual session ID (which has a unique suffix).
|
||||||
|
async fn setup_complete_plan_with_envelope(
|
||||||
|
agent: &mut Agent<NullUiWriter>,
|
||||||
|
temp_dir: &TempDir,
|
||||||
|
description: &str,
|
||||||
|
) -> String {
|
||||||
|
agent.init_session_id_for_test(description);
|
||||||
|
let actual_session_id = agent.get_session_id().unwrap().to_string();
|
||||||
|
|
||||||
|
// Write a plan
|
||||||
|
let write_call = make_tool_call(
|
||||||
|
"plan_write",
|
||||||
|
serde_json::json!({
|
||||||
|
"plan": r#"plan_id: datalog-test
|
||||||
|
revision: 1
|
||||||
|
items:
|
||||||
|
- id: I1
|
||||||
|
description: Implement feature
|
||||||
|
state: todo
|
||||||
|
touches: ["src/lib.rs"]
|
||||||
|
checks:
|
||||||
|
happy: {desc: Works, target: lib}
|
||||||
|
negative: [{desc: Errors, target: lib}]
|
||||||
|
boundary: [{desc: Edge, target: lib}]"#
|
||||||
|
}),
|
||||||
|
);
|
||||||
|
agent.execute_tool(&write_call).await.unwrap();
|
||||||
|
|
||||||
|
// Approve
|
||||||
|
let approve_call = make_tool_call("plan_approve", serde_json::json!({}));
|
||||||
|
agent.execute_tool(&approve_call).await.unwrap();
|
||||||
|
|
||||||
|
// Write envelope.yaml to session dir (using actual session ID)
|
||||||
|
let session_dir = temp_dir
|
||||||
|
.path()
|
||||||
|
.join(".g3")
|
||||||
|
.join("sessions")
|
||||||
|
.join(&actual_session_id);
|
||||||
|
fs::create_dir_all(&session_dir).unwrap();
|
||||||
|
fs::write(
|
||||||
|
session_dir.join("envelope.yaml"),
|
||||||
|
"facts:
|
||||||
|
feature:
|
||||||
|
done: true
|
||||||
|
capabilities: [handle_csv, handle_tsv]
|
||||||
|
file: src/lib.rs
|
||||||
|
",
|
||||||
|
)
|
||||||
|
.unwrap();
|
||||||
|
|
||||||
|
// Create a dummy evidence file
|
||||||
|
let src_dir = temp_dir.path().join("src");
|
||||||
|
fs::create_dir_all(&src_dir).unwrap();
|
||||||
|
fs::write(src_dir.join("lib.rs"), "// test file").unwrap();
|
||||||
|
|
||||||
|
actual_session_id
|
||||||
|
}
|
||||||
|
|
||||||
|
/// Test: plan_verify compiles datalog rules on-the-fly from analysis/rulespec.yaml
|
||||||
|
/// and writes .dl + evaluation files to session dir
|
||||||
|
#[tokio::test]
|
||||||
|
#[serial]
|
||||||
|
async fn test_plan_verify_with_analysis_rulespec() {
|
||||||
|
let temp_dir = TempDir::new().unwrap();
|
||||||
|
let mut agent = create_test_agent(&temp_dir).await;
|
||||||
|
|
||||||
|
let session_id = setup_complete_plan_with_envelope(
|
||||||
|
&mut agent, &temp_dir, "datalog-rulespec-test"
|
||||||
|
).await;
|
||||||
|
|
||||||
|
// Write analysis/rulespec.yaml
|
||||||
|
let analysis_dir = temp_dir.path().join("analysis");
|
||||||
|
fs::create_dir_all(&analysis_dir).unwrap();
|
||||||
|
fs::write(
|
||||||
|
analysis_dir.join("rulespec.yaml"),
|
||||||
|
"claims:
|
||||||
|
- name: feature_done
|
||||||
|
selector: feature.done
|
||||||
|
predicates:
|
||||||
|
- claim: feature_done
|
||||||
|
rule: exists
|
||||||
|
source: task_prompt
|
||||||
|
notes: Feature must be marked done
|
||||||
|
",
|
||||||
|
)
|
||||||
|
.unwrap();
|
||||||
|
|
||||||
|
// Mark item done - this triggers plan_verify + shadow_datalog_verify
|
||||||
|
let done_call = make_tool_call(
|
||||||
|
"plan_write",
|
||||||
|
serde_json::json!({
|
||||||
|
"plan": "plan_id: datalog-test\nrevision: 2\nitems:\n - id: I1\n description: Implement feature\n state: done\n touches: [src/lib.rs]\n checks:\n happy: {desc: Works, target: lib}\n negative: [{desc: Errors, target: lib}]\n boundary: [{desc: Edge, target: lib}]\n evidence: [src/lib.rs:1]\n notes: Implemented the feature"
|
||||||
|
}),
|
||||||
|
);
|
||||||
|
let result = agent.execute_tool(&done_call).await.unwrap();
|
||||||
|
assert!(result.contains("VERIFICATION"), "Should trigger verification: {}", result);
|
||||||
|
|
||||||
|
// Check that .dl and evaluation files were written to session dir
|
||||||
|
let session_dir = temp_dir
|
||||||
|
.path()
|
||||||
|
.join(".g3")
|
||||||
|
.join("sessions")
|
||||||
|
.join(&session_id);
|
||||||
|
let dl_path = session_dir.join("rulespec.compiled.dl");
|
||||||
|
let eval_path = session_dir.join("datalog_evaluation.txt");
|
||||||
|
|
||||||
|
assert!(dl_path.exists(), "Compiled .dl file should exist at {}", dl_path.display());
|
||||||
|
assert!(eval_path.exists(), "Evaluation report should exist at {}", eval_path.display());
|
||||||
|
|
||||||
|
// Verify evaluation content shows pass
|
||||||
|
let eval_content = fs::read_to_string(&eval_path).unwrap();
|
||||||
|
assert!(eval_content.contains("satisfied") || eval_content.contains("PASS"),
|
||||||
|
"Evaluation should show passing results: {}", eval_content);
|
||||||
|
}
|
||||||
|
|
||||||
|
/// Test: plan_verify works gracefully when analysis/rulespec.yaml is absent
|
||||||
|
#[tokio::test]
|
||||||
|
#[serial]
|
||||||
|
async fn test_plan_verify_without_rulespec() {
|
||||||
|
let temp_dir = TempDir::new().unwrap();
|
||||||
|
let mut agent = create_test_agent(&temp_dir).await;
|
||||||
|
|
||||||
|
let session_id = setup_complete_plan_with_envelope(
|
||||||
|
&mut agent, &temp_dir, "datalog-no-rulespec-test"
|
||||||
|
).await;
|
||||||
|
|
||||||
|
// Do NOT create analysis/rulespec.yaml
|
||||||
|
|
||||||
|
// Mark item done
|
||||||
|
let done_call = make_tool_call(
|
||||||
|
"plan_write",
|
||||||
|
serde_json::json!({
|
||||||
|
"plan": "plan_id: datalog-test\nrevision: 2\nitems:\n - id: I1\n description: Implement feature\n state: done\n touches: [src/lib.rs]\n checks:\n happy: {desc: Works, target: lib}\n negative: [{desc: Errors, target: lib}]\n boundary: [{desc: Edge, target: lib}]\n evidence: [src/lib.rs:1]\n notes: Implemented the feature"
|
||||||
|
}),
|
||||||
|
);
|
||||||
|
let result = agent.execute_tool(&done_call).await.unwrap();
|
||||||
|
assert!(result.contains("VERIFICATION"), "Should still verify: {}", result);
|
||||||
|
|
||||||
|
// No .dl or evaluation files should exist
|
||||||
|
let session_dir = temp_dir
|
||||||
|
.path()
|
||||||
|
.join(".g3")
|
||||||
|
.join("sessions")
|
||||||
|
.join(&session_id);
|
||||||
|
assert!(!session_dir.join("rulespec.compiled.dl").exists(),
|
||||||
|
"No .dl file should exist without rulespec");
|
||||||
|
assert!(!session_dir.join("datalog_evaluation.txt").exists(),
|
||||||
|
"No evaluation file should exist without rulespec");
|
||||||
|
}
|
||||||
|
|
||||||
|
/// Test: rulespec predicate that fails against envelope shows failure
|
||||||
|
#[tokio::test]
|
||||||
|
#[serial]
|
||||||
|
async fn test_plan_verify_rulespec_failure() {
|
||||||
|
let temp_dir = TempDir::new().unwrap();
|
||||||
|
let mut agent = create_test_agent(&temp_dir).await;
|
||||||
|
|
||||||
|
let session_id = setup_complete_plan_with_envelope(
|
||||||
|
&mut agent, &temp_dir, "datalog-fail-test"
|
||||||
|
).await;
|
||||||
|
|
||||||
|
// Write a rulespec that will FAIL (expects a fact that doesn't exist)
|
||||||
|
let analysis_dir = temp_dir.path().join("analysis");
|
||||||
|
fs::create_dir_all(&analysis_dir).unwrap();
|
||||||
|
fs::write(
|
||||||
|
analysis_dir.join("rulespec.yaml"),
|
||||||
|
"claims:
|
||||||
|
- name: missing_feature
|
||||||
|
selector: nonexistent.field
|
||||||
|
predicates:
|
||||||
|
- claim: missing_feature
|
||||||
|
rule: exists
|
||||||
|
source: task_prompt
|
||||||
|
notes: This field does not exist in the envelope
|
||||||
|
",
|
||||||
|
)
|
||||||
|
.unwrap();
|
||||||
|
|
||||||
|
// Mark item done
|
||||||
|
let done_call = make_tool_call(
|
||||||
|
"plan_write",
|
||||||
|
serde_json::json!({
|
||||||
|
"plan": "plan_id: datalog-test\nrevision: 2\nitems:\n - id: I1\n description: Implement feature\n state: done\n touches: [src/lib.rs]\n checks:\n happy: {desc: Works, target: lib}\n negative: [{desc: Errors, target: lib}]\n boundary: [{desc: Edge, target: lib}]\n evidence: [src/lib.rs:1]\n notes: Implemented the feature"
|
||||||
|
}),
|
||||||
|
);
|
||||||
|
agent.execute_tool(&done_call).await.unwrap();
|
||||||
|
|
||||||
|
// Check evaluation file shows failure
|
||||||
|
let session_dir = temp_dir
|
||||||
|
.path()
|
||||||
|
.join(".g3")
|
||||||
|
.join("sessions")
|
||||||
|
.join(&session_id);
|
||||||
|
let eval_path = session_dir.join("datalog_evaluation.txt");
|
||||||
|
assert!(eval_path.exists(), "Evaluation report should exist");
|
||||||
|
|
||||||
|
let eval_content = fs::read_to_string(&eval_path).unwrap();
|
||||||
|
assert!(eval_content.contains("FAIL") || eval_content.contains("fail"),
|
||||||
|
"Evaluation should show failing results: {}", eval_content);
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|||||||
@@ -19,7 +19,7 @@ Plan Mode is a cognitive forcing system that prevents:
|
|||||||
|
|
||||||
## Workflow
|
## Workflow
|
||||||
|
|
||||||
1. **Draft**: Call `plan_read` to check for existing plan, then `plan_write` with BOTH plan AND rulespec
|
1. **Draft**: Call `plan_read` to check for existing plan, then `plan_write` with the plan YAML
|
||||||
2. **Approval**: Ask user to approve before starting work ("'approve', or edit plan?"). In non-interactive mode (autonomous/one-shot), plans auto-approve on write.
|
2. **Approval**: Ask user to approve before starting work ("'approve', or edit plan?"). In non-interactive mode (autonomous/one-shot), plans auto-approve on write.
|
||||||
3. **Execute**: Implement items, updating plan with `plan_write` to mark progress
|
3. **Execute**: Implement items, updating plan with `plan_write` to mark progress
|
||||||
4. **Complete**: When all items are done/blocked, verification runs automatically
|
4. **Complete**: When all items are done/blocked, verification runs automatically
|
||||||
@@ -44,47 +44,10 @@ When drafting a plan, you MUST:
|
|||||||
- Keep items ~7 by default
|
- Keep items ~7 by default
|
||||||
- Commit to where the work will live (touches)
|
- Commit to where the work will live (touches)
|
||||||
- Provide all three checks (happy, negative, boundary)
|
- Provide all three checks (happy, negative, boundary)
|
||||||
- **Include rulespec with invariants** (required for new plans)
|
|
||||||
|
|
||||||
When updating a plan:
|
When updating a plan:
|
||||||
- Cannot remove items from an approved plan (mark as blocked instead)
|
- Cannot remove items from an approved plan (mark as blocked instead)
|
||||||
- Must provide evidence and notes when marking item as done
|
- Must provide evidence and notes when marking item as done
|
||||||
- Rulespec is optional for updates (already saved from initial creation)
|
|
||||||
|
|
||||||
## Invariants (Rulespec)
|
|
||||||
|
|
||||||
For all NEW plans, you MUST extract invariants and provide them as the `rulespec` argument to `plan_write`.
|
|
||||||
|
|
||||||
### What are Invariants?
|
|
||||||
|
|
||||||
Invariants are constraints that MUST or MUST NOT hold. Extract them from:
|
|
||||||
- **task_prompt**: What the user explicitly requires ("must support TSV", "must not break existing API")
|
|
||||||
- **memory**: Persistent rules from workspace memory ("must be Send + Sync", "must not block async runtime")
|
|
||||||
|
|
||||||
### Rulespec Structure
|
|
||||||
|
|
||||||
```yaml
|
|
||||||
claims:
|
|
||||||
- name: csv_capabilities
|
|
||||||
selector: "csv_importer.capabilities"
|
|
||||||
|
|
||||||
predicates:
|
|
||||||
- claim: csv_capabilities
|
|
||||||
rule: contains
|
|
||||||
value: "handle_tsv"
|
|
||||||
source: task_prompt
|
|
||||||
notes: "User explicitly requested TSV support"
|
|
||||||
```
|
|
||||||
|
|
||||||
### Predicate Rules
|
|
||||||
|
|
||||||
- `contains`: Array contains value, or string contains substring
|
|
||||||
- `equals`: Exact match
|
|
||||||
- `exists`: Value is present
|
|
||||||
- `not_exists`: Value is absent
|
|
||||||
- `min_length` / `max_length`: Array size constraints
|
|
||||||
- `greater_than` / `less_than`: Numeric comparisons
|
|
||||||
- `matches`: Regex pattern match
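
As a rough illustration of the predicate rules listed above, here is a minimal sketch that evaluates a subset of them (`contains`, `equals`, `exists`, `not_exists`, `min_length`, `max_length`) against an already-resolved fact. The `Fact` and `Rule` types and the `holds` function are hypothetical names invented for this example; they are not the project's actual types. Treating an explicit `null` the same as a missing value is also an assumption, based on the envelope guidance for `not_exists`. Numeric comparisons and `matches` are left out to keep the sketch dependency-free.

```rust
/// Hypothetical stand-in for a fact value read from the envelope.
#[derive(Debug, Clone, PartialEq)]
pub enum Fact {
    Null,
    Bool(bool),
    Str(String),
    List(Vec<Fact>),
}

/// Hypothetical subset of the predicate rules named in the list above.
#[derive(Debug, Clone, PartialEq)]
pub enum Rule {
    Contains(Fact),
    Equals(Fact),
    Exists,
    NotExists,
    MinLength(usize),
    MaxLength(usize),
}

/// Returns true when `rule` holds for the (possibly missing) resolved value.
pub fn holds(rule: &Rule, value: Option<&Fact>) -> bool {
    // Assumption: an explicit null counts as absent, same as a missing value.
    let present = value.filter(|v| **v != Fact::Null);
    match (rule, present) {
        (Rule::Exists, v) => v.is_some(),
        (Rule::NotExists, v) => v.is_none(),
        (Rule::Equals(expected), Some(v)) => v == expected,
        // Array contains value.
        (Rule::Contains(needle), Some(Fact::List(items))) => items.contains(needle),
        // String contains substring.
        (Rule::Contains(Fact::Str(needle)), Some(Fact::Str(s))) => s.contains(needle.as_str()),
        (Rule::MinLength(n), Some(Fact::List(items))) => items.len() >= *n,
        (Rule::MaxLength(n), Some(Fact::List(items))) => items.len() <= *n,
        // Any other combination (type mismatch, absent value) fails the predicate.
        _ => false,
    }
}
```

In this sketch a type mismatch fails the predicate rather than raising an error, which keeps the evaluator total; a real implementation might prefer to report such mismatches separately.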
|
|
||||||
|
|
||||||
## Example Plan
|
## Example Plan
|
||||||
|
|
||||||
@@ -108,17 +71,6 @@ plan_write(
|
|||||||
- desc: Empty file yields empty import without error
|
- desc: Empty file yields empty import without error
|
||||||
target: import::csv
|
target: import::csv
|
||||||
",
|
",
|
||||||
rulespec: "
|
|
||||||
claims:
|
|
||||||
- name: csv_capabilities
|
|
||||||
selector: csv_importer.capabilities
|
|
||||||
predicates:
|
|
||||||
- claim: csv_capabilities
|
|
||||||
rule: contains
|
|
||||||
value: handle_tsv
|
|
||||||
source: task_prompt
|
|
||||||
notes: User explicitly requested TSV support
|
|
||||||
"
|
|
||||||
)
|
)
|
||||||
```
|
```
|
||||||
|
|
||||||
@@ -126,7 +78,7 @@ When marking done, add `evidence` and `notes` to the item.
|
|||||||
|
|
||||||
## Action Envelope
|
## Action Envelope
|
||||||
|
|
||||||
Before marking the last plan item done, write an `envelope.yaml` file with facts about completed work. The envelope captures what was actually built so it can be verified against the rulespec.
|
Before marking the last plan item done, write an `envelope.yaml` file with facts about completed work. The envelope captures what was actually built so it can be verified against invariants in `analysis/rulespec.yaml` if present.
|
||||||
|
|
||||||
```yaml
|
```yaml
|
||||||
facts:
|
facts:
|
||||||
@@ -141,10 +93,10 @@ facts:
|
|||||||
```
|
```
|
||||||
|
|
||||||
**Rules:**
|
**Rules:**
|
||||||
- Selectors in rulespec (e.g., `csv_importer.capabilities`) are evaluated against envelope facts
|
- Selectors in `analysis/rulespec.yaml` (e.g., `csv_importer.capabilities`) are evaluated against envelope facts
|
||||||
- Use dot notation for nested access: `api_changes.breaking`
|
- Use dot notation for nested access: `api_changes.breaking`
|
||||||
- Use `null` to explicitly assert absence (for `not_exists` predicates)
|
- Use `null` to explicitly assert absence (for `not_exists` predicates)
|
||||||
- The envelope is automatically verified against the rulespec when the plan completes
|
- The envelope is automatically verified against `analysis/rulespec.yaml` when the plan completes (if the file exists)
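
To make the dot-notation rule above concrete, the following is a minimal sketch of how a selector such as `feature.done` could be resolved against nested envelope facts. The `Fact` enum, `resolve`, and `sample_envelope` are hypothetical names made up for this example; the real code presumably works on parsed YAML values and may differ. The sample data mirrors the envelope used in this commit's integration tests, under the assumption that selectors are evaluated relative to the `facts` key.

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for a parsed YAML value from envelope.yaml.
#[derive(Debug, Clone, PartialEq)]
pub enum Fact {
    Null,
    Bool(bool),
    Str(String),
    List(Vec<Fact>),
    Map(BTreeMap<String, Fact>),
}

/// Walks a dot-separated selector through nested maps.
/// Returns None if a segment is missing or a non-map value is reached early.
pub fn resolve<'a>(facts: &'a Fact, selector: &str) -> Option<&'a Fact> {
    let mut current = facts;
    for segment in selector.split('.') {
        match current {
            Fact::Map(map) => current = map.get(segment)?,
            _ => return None,
        }
    }
    Some(current)
}

/// Builds facts equivalent to the test envelope:
/// feature.done = true, feature.capabilities = [handle_csv, handle_tsv],
/// feature.file = src/lib.rs (nesting assumed from the selectors used in the tests).
pub fn sample_envelope() -> Fact {
    let mut feature = BTreeMap::new();
    feature.insert("done".to_string(), Fact::Bool(true));
    feature.insert(
        "capabilities".to_string(),
        Fact::List(vec![
            Fact::Str("handle_csv".to_string()),
            Fact::Str("handle_tsv".to_string()),
        ]),
    );
    feature.insert("file".to_string(), Fact::Str("src/lib.rs".to_string()));

    let mut root = BTreeMap::new();
    root.insert("feature".to_string(), Fact::Map(feature));
    Fact::Map(root)
}
```

With this sketch, `resolve(&sample_envelope(), "feature.done")` yields the boolean fact, while a selector like `nonexistent.field` yields `None`, which is the situation the failing-rulespec test sets up.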
|
||||||
|
|
||||||
# Workspace Memory
|
# Workspace Memory
|
||||||
|
|
||||||
|
|||||||