From edbae60ff3012bce8dabcc6821c5880bb902573f Mon Sep 17 00:00:00 2001 From: "Dhanji R. Prasanna" Date: Sat, 7 Feb 2026 16:38:27 +1100 Subject: [PATCH] Add rulespec extensions: new predicate rules, when conditions, null handling, solon agent Features: - New predicate rules: NotContains, AnyOf, NoneOf - Conditional predicates via when clauses (WhenCondition/CompiledWhenCondition) - Null handling: YAML null treated as absent for exists/not_exists - Solon agent for rulespec authoring (agents/solon.md) - Rulespec schema documentation (prompts/schemas/rulespec.schema.md) Bugfix: - Fixed when condition evaluation in datalog path: catch-all branch did naive string contains instead of delegating to evaluate_predicate_datalog(). Rules like matches (regex) were silently ignored, causing vacuous pass and letting violations through. Now delegates to evaluate_predicate_datalog() which handles all 12 rule types correctly. Tests: 34 new tests covering all new rules, null handling, when conditions, and the when+matches bugfix (butler rulespec pattern). --- agents/solon.md | 487 ++++++++++++++++++++ analysis/memory.md | 21 +- crates/g3-cli/src/agent_mode.rs | 2 +- crates/g3-cli/src/embedded_agents.rs | 5 +- crates/g3-core/src/tools/datalog.rs | 469 +++++++++++++++++++- crates/g3-core/src/tools/invariants.rs | 587 ++++++++++++++++++++++++- prompts/schemas/rulespec.schema.md | 402 +++++++++++++++++ prompts/system/native.md | 1 + 8 files changed, 1958 insertions(+), 16 deletions(-) create mode 100644 agents/solon.md create mode 100644 prompts/schemas/rulespec.schema.md diff --git a/agents/solon.md b/agents/solon.md new file mode 100644 index 0000000..8d599d9 --- /dev/null +++ b/agents/solon.md @@ -0,0 +1,487 @@ +SYSTEM PROMPT — "Solon" (Rulespec Authoring Agent) + +You are Solon: an interactive rulespec authoring agent. +Your job is to help users create, refine, and validate invariant rules +in `analysis/rulespec.yaml` — the machine-readable contract that governs +what `write_envelope` verifies at plan completion. + +You are named for the Athenian lawgiver. You write precise, enforceable rules. + +------------------------------------------------------------ +PRIME DIRECTIVE + +You author **rulespec rules** — claims and predicates that define invariants +over action envelopes. Every rule you write must be: + +1. Syntactically valid YAML conforming to the rulespec schema +2. Semantically meaningful (tests something the user cares about) +3. **Validated** — you MUST call `write_envelope` with a sample envelope + that exercises your rules before finishing + +You operate ONLY on `analysis/rulespec.yaml`. You do not modify source code, +tests, build files, or any other configuration. + +The canonical schema reference is at `prompts/schemas/rulespec.schema.md`. + +------------------------------------------------------------ +WORKFLOW + +1. **Understand** — Ask the user what invariants they want to enforce. + What facts should agents produce? What properties must hold? + +2. **Read** — Load the current `analysis/rulespec.yaml` (if it exists) + to understand existing rules. Never duplicate or contradict them + without explicit user consent. + +3. **Author** — Write claims and predicates using the schema below. + Explain each rule to the user in plain language. + +4. **Validate** — Call `write_envelope` with a sample envelope that + should PASS all your new rules. Inspect the verification output. + If any rule fails, fix it and re-validate. + +5. **Confirm** — Show the user the final rulespec and verification results. + +Step 4 is NON-NEGOTIABLE. Never finish without validating. + +------------------------------------------------------------ +RULESPEC SCHEMA + +The file `analysis/rulespec.yaml` has two top-level arrays: + +```yaml +claims: + - name: # Unique identifier (referenced by predicates) + selector: # Path into the action envelope + +predicates: + - claim: # Must reference a defined claim + rule: # One of the 12 predicate rules below + value: # Required for most rules (optional for exists/not_exists) + source: task_prompt # Either "task_prompt" or "memory" + notes: # Optional human-readable explanation + when: # Optional conditional trigger + claim: # Must reference a defined claim + rule: # Condition rule type + value: # Condition value (if needed) +``` + +------------------------------------------------------------ +SELECTOR SYNTAX + +Selectors navigate the envelope's fact structure using path notation: + +| Syntax | Meaning | Example | +|--------|---------|--------| +| `foo.bar` | Nested field access | `csv_importer.file` | +| `foo[0]` | Array index (0-based) | `tests[0]` | +| `foo[*].id` | Wildcard (all elements) | `items[*].name` | +| `foo.bar.baz` | Deep nesting | `api.endpoints.count` | + +**IMPORTANT**: Selectors operate on the envelope's `facts` map directly. +Do NOT prefix selectors with `facts.` — the system already unwraps the +`facts` key. Write `my_feature.capabilities`, not `facts.my_feature.capabilities`. + +While selectors with a `facts.` prefix will work (there is a fallback), +it is unnecessary and should be avoided for clarity. + +------------------------------------------------------------ +THE 12 PREDICATE RULES + +| Rule | Value Required | Value Type | What It Checks | +|------|---------------|------------|----------------| +| `exists` | No | — | Value is present and not null | +| `not_exists` | No | — | Value is null or missing | +| `equals` | Yes | any | Selected value exactly equals expected | +| `contains` | Yes | any | Array contains element, or string contains substring | +| `not_contains` | Yes | any | Negation of contains — value must NOT be present | +| `any_of` | Yes | array | Value is one of the specified set | +| `none_of` | Yes | array | Value is none of the specified set | +| `greater_than` | Yes | number | Numeric value > expected | +| `less_than` | Yes | number | Numeric value < expected | +| `min_length` | Yes | number | Array has at least N elements | +| `max_length` | Yes | number | Array has at most N elements | +| `matches` | Yes | string | String value matches a regex pattern | + +### Rule Details & Examples + +**exists** — Assert a value is present (not null): +```yaml +claims: + - name: has_file + selector: my_feature.file +predicates: + - claim: has_file + rule: exists + source: task_prompt + notes: Feature must specify its implementation file +``` + +**not_exists** — Assert a value is absent or null: +```yaml +claims: + - name: no_breaking + selector: breaking_changes +predicates: + - claim: no_breaking + rule: not_exists + source: task_prompt + notes: No breaking changes allowed +``` + +**equals** — Exact value match: +```yaml +claims: + - name: api_breaking + selector: api_changes.breaking +predicates: + - claim: api_breaking + rule: equals + value: false + source: task_prompt +``` + +**contains** — Element in array or substring in string: +```yaml +claims: + - name: capabilities + selector: csv_importer.capabilities +predicates: + - claim: capabilities + rule: contains + value: handle_tsv + source: task_prompt + notes: Must support TSV format +``` + +**not_contains** — Element must NOT be in array or substring NOT in string: +```yaml +claims: + - name: capabilities + selector: csv_importer.capabilities +predicates: + - claim: capabilities + rule: not_contains + value: deprecated_parser + source: task_prompt + notes: Must not use the deprecated parser +``` + +**any_of** — Value must be one of a set (value must be an array): +```yaml +claims: + - name: output_format + selector: feature.output_format +predicates: + - claim: output_format + rule: any_of + value: [json, yaml, toml] + source: task_prompt + notes: Output must be a supported format +``` + +**none_of** — Value must NOT be any of a set (value must be an array): +```yaml +claims: + - name: output_format + selector: feature.output_format +predicates: + - claim: output_format + rule: none_of + value: [xml, csv] + source: task_prompt + notes: XML and CSV are not supported +``` + +**greater_than / less_than** — Numeric comparisons: +```yaml +claims: + - name: test_count + selector: metrics.test_count +predicates: + - claim: test_count + rule: greater_than + value: 0 + source: task_prompt + notes: Must have at least one test +``` + +**min_length / max_length** — Array size bounds: +```yaml +claims: + - name: endpoints + selector: api.endpoints +predicates: + - claim: endpoints + rule: min_length + value: 2 + source: task_prompt + notes: API must expose at least 2 endpoints +``` + +**matches** — Regex pattern matching: +```yaml +claims: + - name: impl_file + selector: feature.file +predicates: + - claim: impl_file + rule: matches + value: "^src/.*\\.rs$" + source: task_prompt + notes: Implementation must be a Rust source file +``` + +------------------------------------------------------------ +CONDITIONAL PREDICATES (`when`) + +Predicates can have an optional `when` condition. If the condition is +**not met**, the predicate is **skipped** (vacuous pass) — it does NOT fail. + +This is useful for rules that only apply in certain contexts. + +### When Condition Structure + +```yaml +when: + claim: # Must reference a defined claim + rule: # Any predicate rule type + value: # Optional, depends on rule +``` + +### When Examples + +```yaml +# Only enforce endpoint count when there are breaking changes +predicates: + - claim: api_endpoints + rule: min_length + value: 3 + source: task_prompt + when: + claim: is_breaking + rule: equals + value: true + notes: Breaking changes must document all endpoints + +# Only check test coverage when tests exist +predicates: + - claim: coverage_percent + rule: greater_than + value: 80 + source: memory + when: + claim: has_tests + rule: exists + +# Only enforce format when feature is present +predicates: + - claim: output_format + rule: any_of + value: [json, yaml] + source: task_prompt + when: + claim: has_output + rule: exists +``` + +```yaml +# Only require reply threading when subject indicates a reply +predicates: + - claim: reply_to_id + rule: exists + source: task_prompt + when: + claim: subject_line + rule: matches + value: "^Re: " + notes: Reply emails must include reply_to_message_id +``` + +------------------------------------------------------------ +NULL HANDLING + +Null values in the action envelope have specific semantics: + +- **`null` is treated as absent** — `exists` returns false, `not_exists` returns true +- A fact with value `null` produces NO datalog facts (skipped entirely) +- This is the correct way to assert explicit absence in envelopes + +```yaml +# In the envelope: +facts: + breaking_changes: null # explicitly absent + +# In the rulespec — this passes: +predicates: + - claim: no_breaking + rule: not_exists + source: task_prompt +``` + +| Envelope Value | `exists` | `not_exists` | `contains "x"` | +|---------------|----------|-------------|----------------| +| `null` | ❌ fail | ✅ pass | ❌ fail | +| missing key | ❌ fail | ✅ pass | ❌ fail | +| `""` (empty) | ✅ pass | ❌ fail | ❌ fail | +| `[]` (empty) | ✅ pass | ❌ fail | ❌ fail | + +------------------------------------------------------------ +ACTION ENVELOPE FORMAT + +The action envelope is what agents produce via `write_envelope`. +It contains facts about completed work. The YAML MUST have a +top-level `facts:` key: + +```yaml +facts: + feature_name: + capabilities: [cap_a, cap_b] + file: "src/feature.rs" + tests: ["test_a", "test_b"] + api_changes: + breaking: false + new_endpoints: ["/api/foo"] + breaking_changes: null # null asserts explicit absence +``` + +**Critical**: The `facts:` wrapper is required. Without it, the envelope +will be empty and all predicates will fail. This is the #1 mistake. + +------------------------------------------------------------ +VERIFICATION PIPELINE + +When `write_envelope` is called, the system: + +1. Parses the YAML into an `ActionEnvelope` +2. Writes it to `.g3/sessions//envelope.yaml` +3. Reads `analysis/rulespec.yaml` from the workspace +4. Compiles claims into selectors, predicates into datalog rules +5. Extracts facts from the envelope using selectors +6. Evaluates each predicate against the extracted facts +7. Reports pass/fail for each predicate + +The output shows ✅ for passing and ❌ for failing predicates, +with the total count. Artifacts are written to the session directory: +- `rulespec.compiled.dl` — the generated datalog program +- `datalog_evaluation.txt` — full evaluation report + +------------------------------------------------------------ +VALIDATION STEP (MANDATORY) + +After writing or modifying `analysis/rulespec.yaml`, you MUST validate +your rules by calling `write_envelope` with a sample envelope designed +to exercise your rules. + +**How to validate:** + +1. Construct a sample envelope whose facts should make ALL your + predicates pass. Call `write_envelope` with it. + +2. Check the verification output. Every predicate should show ✅. + +3. If any predicate shows ❌, diagnose and fix either the rulespec + or the sample envelope, then re-validate. + +Example validation call: +``` +write_envelope(facts: " +facts: + csv_importer: + capabilities: [handle_headers, handle_tsv] + file: src/import/csv.rs + tests: [test_valid_csv, test_missing_column] + api_changes: + breaking: false + breaking_changes: null +") +``` + +------------------------------------------------------------ +COMMON MISTAKES TO AVOID + +1. **Missing `facts:` key in envelope** — The envelope YAML must have + `facts:` as the top-level key. Raw YAML without it produces an + empty envelope and all predicates fail silently. + +2. **Using `facts.` prefix in selectors** — Selectors already operate + inside the facts map. Write `my_feature.file`, not `facts.my_feature.file`. + +3. **Predicate references unknown claim** — Every predicate's `claim` + field must match a defined claim's `name`. Typos cause compilation errors. + +4. **Missing `value` for rules that need it** — All rules except `exists` + and `not_exists` require a `value` field. + +5. **Duplicate claim names** — Each claim name must be unique. + +6. **Regex escaping** — In YAML, backslashes in regex patterns need + quoting. Use `"^src/.*\\.rs$"` (double-quoted with escaped backslash). + +7. **`any_of`/`none_of` value must be an array** — These rules require + the `value` field to be a YAML array, not a scalar. + Write `value: [json, yaml]`, not `value: json`. + +8. **Null is absent, not a string** — `null` in the envelope means the + value does not exist. `exists` will fail, `not_exists` will pass. + If you want to check for the literal string "null", the value must + be quoted: `"null"`. + +9. **`when` condition claim must be defined** — The `when.claim` field + must reference a claim defined in the `claims` array, just like + the predicate's own `claim` field. + +------------------------------------------------------------ +CREATING A RULESPEC FROM SCRATCH + +If `analysis/rulespec.yaml` does not exist yet: + +1. Create the `analysis/` directory if needed +2. Start with a minimal rulespec: + +```yaml +claims: + - name: feature_exists + selector: my_feature.file + +predicates: + - claim: feature_exists + rule: exists + source: task_prompt + notes: The feature must declare its implementation file +``` + +3. Validate immediately with `write_envelope` +4. Iterate with the user to add more rules + +------------------------------------------------------------ +EXPLICIT BANS + +You MUST NOT: +- Modify source code, tests, or build files +- Write rules that are untestable or tautological +- Skip the validation step +- Delete existing rules without user confirmation +- Write predicates that reference undefined claims + +------------------------------------------------------------ +SUCCESS CRITERIA + +Your output is successful when: +- `analysis/rulespec.yaml` is valid YAML conforming to the schema +- All claims have valid selectors +- All predicates reference defined claims +- All `when` conditions reference defined claims +- A sample `write_envelope` call passes all predicates (✅) +- The user understands what each rule enforces +- Existing rules are preserved unless explicitly changed + +------------------------------------------------------------ +INTERACTIVE STYLE + +- Be conversational. Ask clarifying questions. +- Explain rules in plain language before writing YAML. +- Show the user what a passing envelope looks like. +- When modifying existing rules, show a diff of changes. +- If the user's request is ambiguous, propose alternatives. +- Always end with a validated rulespec. diff --git a/analysis/memory.md b/analysis/memory.md index 4125b33..f63319f 100644 --- a/analysis/memory.md +++ b/analysis/memory.md @@ -1,5 +1,5 @@ # Workspace Memory -> Updated: 2026-02-07T03:33:32Z | Size: 24.6k chars +> Updated: 2026-02-07T05:28:12Z | Size: 26.3k chars ### Remember Tool Wiring - `crates/g3-core/src/tools/memory.rs` [0..5000] - `execute_remember()`, `get_memory_path()`, `merge_memory()` @@ -412,4 +412,21 @@ Makes tool output responsive to terminal width - no line wrapping, with 4-char r - Root cause: `ActionEnvelope.to_yaml_value()` creates a Mapping from the `facts` HashMap WITHOUT a `facts` key wrapper, but rulespec selectors may include a `facts.` prefix. - New unit tests: `test_extract_facts_with_facts_prefix_selector`, `test_extract_facts_roundtrip_from_yaml`, `test_execute_rules_full_pipeline_with_facts_prefix`, `test_execute_rules_full_pipeline_without_facts_prefix` - New integration tests: `test_plan_verify_rulespec_with_facts_prefix_selectors`, `test_plan_verify_mixed_pass_fail` -- Strengthened: `test_plan_verify_with_analysis_rulespec` now asserts `Facts extracted: 0` is NOT in output \ No newline at end of file +- Strengthened: `test_plan_verify_with_analysis_rulespec` now asserts `Facts extracted: 0` is NOT in output + +### Solon Agent (Rulespec Authoring) +- `agents/solon.md` [0..10800] - Interactive rulespec authoring agent prompt + - Full reference for all 9 PredicateRule types: exists, not_exists, equals, contains, greater_than, less_than, min_length, max_length, matches + - Selector syntax (dot/index/wildcard), envelope format, verification pipeline + - Mandatory write_envelope validation step, common mistakes section +- `crates/g3-cli/src/embedded_agents.rs` [26] - solon registered in EMBEDDED_AGENTS +- `crates/g3-cli/src/agent_mode.rs` [42] - solon in available agents error message +- **Usage**: `g3 --agent solon` for interactive rulespec authoring +- **Agent count**: 9 embedded agents (was 8) + +### When Condition Bugfix (2026-02-07) +- `crates/g3-core/src/tools/datalog.rs` [377..395] - `execute_rules()` when condition evaluation +- **Bug**: The `_ =>` catch-all in when condition evaluation did naive string `contains` check. For `Matches` (regex like `^Re: `), it checked if fact values literally contained the regex pattern string — which never matched. Result: when conditions with `matches` rule always evaluated as not-met → vacuous pass → violations slipped through. +- **Fix**: Replaced hand-rolled when evaluation with synthetic `CompiledPredicate` delegation to `evaluate_predicate_datalog()`, which handles all 12 rule types correctly. +- **Tests**: `test_execute_rules_when_matches_condition_met`, `test_execute_rules_when_matches_condition_met_but_predicate_fails`, `test_execute_rules_when_matches_condition_not_met` +- **Note**: The `invariants.rs` path was NOT affected — it already delegated to `evaluate_predicate()` which handles all rules. \ No newline at end of file diff --git a/crates/g3-cli/src/agent_mode.rs b/crates/g3-cli/src/agent_mode.rs index ec4b04f..8aaf2a2 100644 --- a/crates/g3-cli/src/agent_mode.rs +++ b/crates/g3-cli/src/agent_mode.rs @@ -98,7 +98,7 @@ pub async fn run_agent_mode( // Load agent prompt: workspace agents/.md first, then embedded fallback let (agent_prompt, from_disk) = load_agent_prompt(agent_name, &workspace_dir).ok_or_else(|| { anyhow::anyhow!( - "Agent '{}' not found.\nAvailable embedded agents: breaker, carmack, euler, fowler, hopper, lamport, scout\nOr create agents/{}.md in your workspace.", + "Agent '{}' not found.\nAvailable embedded agents: breaker, carmack, euler, fowler, hopper, lamport, scout, solon\nOr create agents/{}.md in your workspace.", agent_name, agent_name ) diff --git a/crates/g3-cli/src/embedded_agents.rs b/crates/g3-cli/src/embedded_agents.rs index bc9c61e..29e58fc 100644 --- a/crates/g3-cli/src/embedded_agents.rs +++ b/crates/g3-cli/src/embedded_agents.rs @@ -22,6 +22,7 @@ static EMBEDDED_AGENTS: &[(&str, &str)] = &[ ("huffman", include_str!("../../../agents/huffman.md")), ("lamport", include_str!("../../../agents/lamport.md")), ("scout", include_str!("../../../agents/scout.md")), + ("solon", include_str!("../../../agents/solon.md")), ]; /// Get an embedded agent prompt by name. @@ -89,7 +90,7 @@ mod tests { #[test] fn test_embedded_agents_exist() { // Verify all expected agents are embedded - let expected = ["breaker", "carmack", "euler", "fowler", "hopper", "huffman", "lamport", "scout"]; + let expected = ["breaker", "carmack", "euler", "fowler", "hopper", "huffman", "lamport", "scout", "solon"]; for name in expected { assert!( get_embedded_agent(name).is_some(), @@ -102,7 +103,7 @@ mod tests { #[test] fn test_list_embedded_agents() { let agents = list_embedded_agents(); - assert!(agents.len() >= 8, "Should have at least 8 embedded agents"); + assert!(agents.len() >= 9, "Should have at least 9 embedded agents"); assert!(agents.contains(&"carmack")); assert!(agents.contains(&"hopper")); } diff --git a/crates/g3-core/src/tools/datalog.rs b/crates/g3-core/src/tools/datalog.rs index 8e21fce..39dad1e 100644 --- a/crates/g3-core/src/tools/datalog.rs +++ b/crates/g3-core/src/tools/datalog.rs @@ -63,6 +63,18 @@ pub struct CompiledPredicate { pub source: InvariantSource, /// Optional notes pub notes: Option, + /// Optional when condition (compiled) + #[serde(default, skip_serializing_if = "Option::is_none")] + pub when: Option, +} + +/// A compiled when condition for datalog execution. +#[derive(Debug, Clone, Serialize, Deserialize)] +pub struct CompiledWhenCondition { + pub claim_name: String, + pub selector: String, + pub rule: PredicateRule, + pub expected_value: Option, } /// Compiled rulespec ready for datalog execution. @@ -136,6 +148,15 @@ pub fn compile_rulespec( expected_value, source: predicate.source, notes: predicate.notes.clone(), + when: predicate.when.as_ref().map(|w| { + let when_selector = claims.get(&w.claim).cloned().unwrap_or_default(); + CompiledWhenCondition { + claim_name: w.claim.clone(), + selector: when_selector, + rule: w.rule.clone(), + expected_value: w.value.as_ref().map(yaml_value_to_string), + } + }), }); } @@ -248,10 +269,9 @@ fn extract_values_recursive(claim_name: &str, value: &YamlValue, facts: &mut Has } } YamlValue::Null => { - facts.insert(Fact { - claim_name: claim_name.to_string(), - value: "null".to_string(), - }); + // Null values are intentionally NOT inserted as facts. + // This ensures `exists` returns false and `not_exists` returns true + // for null envelope values (e.g., `breaking_changes: null`). } _ => { // Scalar values @@ -355,7 +375,38 @@ pub fn execute_rules( // Now evaluate each predicate for pred in &compiled.predicates { - let result = evaluate_predicate_datalog(pred, &fact_lookup); + // Check when condition — if present and not met, skip (vacuous pass) + let result = if let Some(when) = &pred.when { + // Build a synthetic predicate to evaluate the when condition + // using the same logic as regular predicate evaluation. + let when_pred = CompiledPredicate { + id: usize::MAX, // sentinel — not a real predicate + claim_name: when.claim_name.clone(), + selector: when.selector.clone(), + rule: when.rule.clone(), + expected_value: when.expected_value.clone(), + source: pred.source, + notes: None, + when: None, + }; + let when_met = evaluate_predicate_datalog(&when_pred, &fact_lookup).passed; + if !when_met { + DatalogPredicateResult { + id: pred.id, + claim_name: pred.claim_name.clone(), + rule: pred.rule.clone(), + expected_value: pred.expected_value.clone(), + passed: true, + reason: "Skipped (when condition not met)".to_string(), + source: pred.source, + notes: pred.notes.clone(), + } + } else { + evaluate_predicate_datalog(pred, &fact_lookup) + } + } else { + evaluate_predicate_datalog(pred, &fact_lookup) + }; if result.passed { passed_count += 1; @@ -534,6 +585,50 @@ fn evaluate_predicate_datalog( (false, format!("Claim '{}' has no values", pred.claim_name)) } } + PredicateRule::NotContains => { + let expected = pred.expected_value.as_deref().unwrap_or(""); + if let Some(values) = claim_values { + if values.contains(expected) { + (false, format!("Contains '{}' but should not", expected)) + } else { + (true, format!("Does not contain '{}'", expected)) + } + } else { + (true, format!("Claim '{}' has no values (not_contains passes vacuously)", pred.claim_name)) + } + } + PredicateRule::AnyOf => { + let expected_set: Vec<&str> = pred.expected_value.as_deref() + .map(|v| v.trim_matches(|c| c == '[' || c == ']') + .split(", ") + .collect()) + .unwrap_or_default(); + if let Some(values) = claim_values { + if values.iter().any(|v| expected_set.contains(v)) { + (true, format!("Value is in allowed set")) + } else { + (false, format!("Value is not in allowed set")) + } + } else { + (false, format!("Claim '{}' has no values", pred.claim_name)) + } + } + PredicateRule::NoneOf => { + let forbidden_set: Vec<&str> = pred.expected_value.as_deref() + .map(|v| v.trim_matches(|c| c == '[' || c == ']') + .split(", ") + .collect()) + .unwrap_or_default(); + if let Some(values) = claim_values { + if values.iter().any(|v| forbidden_set.contains(v)) { + (false, format!("Value is in forbidden set")) + } else { + (true, format!("Value is not in forbidden set")) + } + } else { + (true, format!("Claim '{}' has no values (none_of passes vacuously)", pred.claim_name)) + } + } }; DatalogPredicateResult { @@ -701,6 +796,25 @@ pub fn format_datalog_program( id, claim, expected, )); } + PredicateRule::NotContains => { + out.push_str(&format!( + "predicate_pass({}) :- !claim_value(\"{}\", \"{}\").\n", + id, claim, expected, + )); + } + PredicateRule::AnyOf => { + // any_of: pass if claim value matches any element in the set + out.push_str(&format!( + "predicate_pass({}) :- claim_value(\"{}\", V), any_of(\"{}\", V).\n", + id, claim, expected, + )); + } + PredicateRule::NoneOf => { + out.push_str(&format!( + "predicate_pass({}) :- !claim_value(\"{}\", V), none_of(\"{}\", V).\n", + id, claim, expected, + )); + } } // Derive failure as the negation of pass @@ -784,6 +898,7 @@ pub fn format_datalog_results(result: &DatalogExecutionResult) -> String { #[cfg(test)] mod tests { use super::*; + use crate::tools::invariants::WhenCondition; fn make_test_rulespec() -> Rulespec { let mut rulespec = Rulespec::new(); @@ -975,10 +1090,8 @@ mod tests { let facts = extract_facts(&envelope, &compiled); - assert!(facts.contains(&Fact { - claim_name: "test".to_string(), - value: "null".to_string(), - })); + // Null values should NOT produce facts — this ensures not_exists passes for null + assert!(facts.is_empty(), "Null values should not produce facts, got: {:?}", facts); } #[test] @@ -1569,4 +1682,342 @@ mod tests { assert_eq!(escape_datalog_string("tab\there"), "tab\\there"); assert_eq!(escape_datalog_string("back\\slash"), "back\\\\slash"); } + + // ======================================================================== + // New Rules: NotContains, AnyOf, NoneOf in Datalog + // ======================================================================== + + #[test] + fn test_execute_rules_not_contains_pass() { + let mut rulespec = Rulespec::new(); + rulespec.claims.push(Claim::new("caps", "feature.capabilities")); + rulespec.predicates.push( + Predicate::new("caps", PredicateRule::NotContains, InvariantSource::TaskPrompt) + .with_value(YamlValue::String("deprecated".to_string())), + ); + + let compiled = compile_rulespec(&rulespec, "test", 1).unwrap(); + + let mut envelope = ActionEnvelope::new(); + envelope.add_fact("feature", serde_yaml::from_str("capabilities: [handle_csv, handle_tsv]").unwrap()); + let facts = extract_facts(&envelope, &compiled); + + let result = execute_rules(&compiled, &facts); + assert!(result.all_passed(), "not_contains should pass when element absent"); + } + + #[test] + fn test_execute_rules_not_contains_fail() { + let mut rulespec = Rulespec::new(); + rulespec.claims.push(Claim::new("caps", "feature.capabilities")); + rulespec.predicates.push( + Predicate::new("caps", PredicateRule::NotContains, InvariantSource::TaskPrompt) + .with_value(YamlValue::String("handle_csv".to_string())), + ); + + let compiled = compile_rulespec(&rulespec, "test", 1).unwrap(); + + let mut envelope = ActionEnvelope::new(); + envelope.add_fact("feature", serde_yaml::from_str("capabilities: [handle_csv, handle_tsv]").unwrap()); + let facts = extract_facts(&envelope, &compiled); + + let result = execute_rules(&compiled, &facts); + assert_eq!(result.failed_count, 1, "not_contains should fail when element present"); + } + + #[test] + fn test_execute_rules_any_of_pass() { + let mut rulespec = Rulespec::new(); + rulespec.claims.push(Claim::new("format", "feature.output_format")); + rulespec.predicates.push( + Predicate::new("format", PredicateRule::AnyOf, InvariantSource::TaskPrompt) + .with_value(YamlValue::Sequence(vec![ + YamlValue::String("json".to_string()), + YamlValue::String("yaml".to_string()), + ])), + ); + + let compiled = compile_rulespec(&rulespec, "test", 1).unwrap(); + + let mut envelope = ActionEnvelope::new(); + envelope.add_fact("feature", serde_yaml::from_str("output_format: json").unwrap()); + let facts = extract_facts(&envelope, &compiled); + + let result = execute_rules(&compiled, &facts); + assert!(result.all_passed(), "any_of should pass when value is in set"); + } + + #[test] + fn test_execute_rules_any_of_fail() { + let mut rulespec = Rulespec::new(); + rulespec.claims.push(Claim::new("format", "feature.output_format")); + rulespec.predicates.push( + Predicate::new("format", PredicateRule::AnyOf, InvariantSource::TaskPrompt) + .with_value(YamlValue::Sequence(vec![ + YamlValue::String("json".to_string()), + YamlValue::String("yaml".to_string()), + ])), + ); + + let compiled = compile_rulespec(&rulespec, "test", 1).unwrap(); + + let mut envelope = ActionEnvelope::new(); + envelope.add_fact("feature", serde_yaml::from_str("output_format: xml").unwrap()); + let facts = extract_facts(&envelope, &compiled); + + let result = execute_rules(&compiled, &facts); + assert_eq!(result.failed_count, 1, "any_of should fail when value not in set"); + } + + #[test] + fn test_execute_rules_none_of_pass() { + let mut rulespec = Rulespec::new(); + rulespec.claims.push(Claim::new("format", "feature.output_format")); + rulespec.predicates.push( + Predicate::new("format", PredicateRule::NoneOf, InvariantSource::TaskPrompt) + .with_value(YamlValue::Sequence(vec![ + YamlValue::String("xml".to_string()), + YamlValue::String("csv".to_string()), + ])), + ); + + let compiled = compile_rulespec(&rulespec, "test", 1).unwrap(); + + let mut envelope = ActionEnvelope::new(); + envelope.add_fact("feature", serde_yaml::from_str("output_format: json").unwrap()); + let facts = extract_facts(&envelope, &compiled); + + let result = execute_rules(&compiled, &facts); + assert!(result.all_passed(), "none_of should pass when value not in forbidden set"); + } + + #[test] + fn test_execute_rules_none_of_fail() { + let mut rulespec = Rulespec::new(); + rulespec.claims.push(Claim::new("format", "feature.output_format")); + rulespec.predicates.push( + Predicate::new("format", PredicateRule::NoneOf, InvariantSource::TaskPrompt) + .with_value(YamlValue::Sequence(vec![ + YamlValue::String("xml".to_string()), + YamlValue::String("csv".to_string()), + ])), + ); + + let compiled = compile_rulespec(&rulespec, "test", 1).unwrap(); + + let mut envelope = ActionEnvelope::new(); + envelope.add_fact("feature", serde_yaml::from_str("output_format: xml").unwrap()); + let facts = extract_facts(&envelope, &compiled); + + let result = execute_rules(&compiled, &facts); + assert_eq!(result.failed_count, 1, "none_of should fail when value in forbidden set"); + } + + // ======================================================================== + // When Conditions in Datalog + // ======================================================================== + + #[test] + fn test_execute_rules_when_condition_met() { + let mut rulespec = Rulespec::new(); + rulespec.claims.push(Claim::new("is_breaking", "api.breaking")); + rulespec.claims.push(Claim::new("caps", "feature.capabilities")); + rulespec.predicates.push( + Predicate::new("caps", PredicateRule::MinLength, InvariantSource::TaskPrompt) + .with_value(YamlValue::Number(2.into())) + .with_when(WhenCondition::new("is_breaking", PredicateRule::Equals) + .with_value(YamlValue::Bool(true))), + ); + + let compiled = compile_rulespec(&rulespec, "test", 1).unwrap(); + + let mut envelope = ActionEnvelope::new(); + envelope.add_fact("api", serde_yaml::from_str("breaking: true").unwrap()); + envelope.add_fact("feature", serde_yaml::from_str("capabilities: [a, b, c]").unwrap()); + let facts = extract_facts(&envelope, &compiled); + + let result = execute_rules(&compiled, &facts); + assert!(result.all_passed(), "When met + predicate passes should pass"); + } + + #[test] + fn test_execute_rules_when_condition_not_met_vacuous_pass() { + let mut rulespec = Rulespec::new(); + rulespec.claims.push(Claim::new("is_breaking", "api.breaking")); + rulespec.claims.push(Claim::new("caps", "feature.capabilities")); + rulespec.predicates.push( + Predicate::new("caps", PredicateRule::MinLength, InvariantSource::TaskPrompt) + .with_value(YamlValue::Number(100.into())) // would fail if evaluated + .with_when(WhenCondition::new("is_breaking", PredicateRule::Equals) + .with_value(YamlValue::Bool(true))), + ); + + let compiled = compile_rulespec(&rulespec, "test", 1).unwrap(); + + let mut envelope = ActionEnvelope::new(); + envelope.add_fact("api", serde_yaml::from_str("breaking: false").unwrap()); + envelope.add_fact("feature", serde_yaml::from_str("capabilities: [a]").unwrap()); + let facts = extract_facts(&envelope, &compiled); + + let result = execute_rules(&compiled, &facts); + assert!(result.all_passed(), "When not met should be vacuous pass"); + assert!(result.predicate_results[0].reason.contains("Skipped")); + } + + #[test] + fn test_execute_rules_when_exists_on_null() { + let mut rulespec = Rulespec::new(); + rulespec.claims.push(Claim::new("has_tests", "testing.tests")); + rulespec.claims.push(Claim::new("coverage", "testing.coverage")); + rulespec.predicates.push( + Predicate::new("coverage", PredicateRule::GreaterThan, InvariantSource::TaskPrompt) + .with_value(YamlValue::Number(80.into())) + .with_when(WhenCondition::new("has_tests", PredicateRule::Exists)), + ); + + let compiled = compile_rulespec(&rulespec, "test", 1).unwrap(); + + // tests is null → when(exists) not met → vacuous pass + let mut envelope = ActionEnvelope::new(); + envelope.add_fact("testing", serde_yaml::from_str("tests: null\ncoverage: 50").unwrap()); + let facts = extract_facts(&envelope, &compiled); + + let result = execute_rules(&compiled, &facts); + assert!(result.all_passed(), "When exists on null should skip (vacuous pass)"); + } + + #[test] + fn test_execute_rules_when_matches_condition_met() { + // This is the butler rulespec pattern: when subject matches ^Re: + let mut rulespec = Rulespec::new(); + rulespec.claims.push(Claim::new("subject", "subject")); + rulespec.claims.push(Claim::new("reply_to", "reply_to_message_id")); + rulespec.predicates.push( + Predicate::new("reply_to", PredicateRule::Exists, InvariantSource::TaskPrompt) + .with_when(WhenCondition::new("subject", PredicateRule::Matches) + .with_value(YamlValue::String("^Re: ".to_string()))), + ); + + let compiled = compile_rulespec(&rulespec, "test", 1).unwrap(); + + // Reply email WITH reply_to_message_id → should pass + let mut envelope = ActionEnvelope::new(); + envelope.add_fact("subject", YamlValue::String("Re: Hello".to_string())); + envelope.add_fact("reply_to_message_id", YamlValue::String("".to_string())); + let facts = extract_facts(&envelope, &compiled); + + let result = execute_rules(&compiled, &facts); + assert!(result.all_passed(), "When matches met + exists should pass"); + // Crucially: should NOT say "Skipped" + assert!(!result.predicate_results[0].reason.contains("Skipped"), + "Should evaluate predicate, not skip it"); + } + + #[test] + fn test_execute_rules_when_matches_condition_met_but_predicate_fails() { + // Reply email WITHOUT reply_to_message_id → when met, predicate fails + let mut rulespec = Rulespec::new(); + rulespec.claims.push(Claim::new("subject", "subject")); + rulespec.claims.push(Claim::new("reply_to", "reply_to_message_id")); + rulespec.predicates.push( + Predicate::new("reply_to", PredicateRule::Exists, InvariantSource::TaskPrompt) + .with_when(WhenCondition::new("subject", PredicateRule::Matches) + .with_value(YamlValue::String("^Re: ".to_string()))), + ); + + let compiled = compile_rulespec(&rulespec, "test", 1).unwrap(); + + // Reply email WITHOUT reply_to → when condition met, predicate should FAIL + let mut envelope = ActionEnvelope::new(); + envelope.add_fact("subject", YamlValue::String("Re: Hello".to_string())); + // No reply_to_message_id fact + let facts = extract_facts(&envelope, &compiled); + + let result = execute_rules(&compiled, &facts); + assert_eq!(result.failed_count, 1, + "When matches met but exists fails → should fail, not vacuous pass"); + } + + #[test] + fn test_execute_rules_when_matches_condition_not_met() { + // Non-reply email → when condition not met → vacuous pass (skip) + let mut rulespec = Rulespec::new(); + rulespec.claims.push(Claim::new("subject", "subject")); + rulespec.claims.push(Claim::new("reply_to", "reply_to_message_id")); + rulespec.predicates.push( + Predicate::new("reply_to", PredicateRule::Exists, InvariantSource::TaskPrompt) + .with_when(WhenCondition::new("subject", PredicateRule::Matches) + .with_value(YamlValue::String("^Re: ".to_string()))), + ); + + let compiled = compile_rulespec(&rulespec, "test", 1).unwrap(); + + // Non-reply email → when not met → skip + let mut envelope = ActionEnvelope::new(); + envelope.add_fact("subject", YamlValue::String("Hello World".to_string())); + // No reply_to_message_id + let facts = extract_facts(&envelope, &compiled); + + let result = execute_rules(&compiled, &facts); + assert!(result.all_passed(), "Non-reply should get vacuous pass"); + assert!(result.predicate_results[0].reason.contains("Skipped"), + "Should be skipped for non-reply"); + } + + // ======================================================================== + // format_datalog_program for new rules + // ======================================================================== + + #[test] + fn test_format_datalog_program_new_rules() { + let mut rulespec = Rulespec::new(); + rulespec.claims.push(Claim::new("caps", "feature.capabilities")); + rulespec.claims.push(Claim::new("format", "feature.output_format")); + + rulespec.predicates.push( + Predicate::new("caps", PredicateRule::NotContains, InvariantSource::TaskPrompt) + .with_value(YamlValue::String("deprecated".to_string())), + ); + rulespec.predicates.push( + Predicate::new("format", PredicateRule::AnyOf, InvariantSource::TaskPrompt) + .with_value(YamlValue::Sequence(vec![ + YamlValue::String("json".to_string()), + YamlValue::String("yaml".to_string()), + ])), + ); + rulespec.predicates.push( + Predicate::new("format", PredicateRule::NoneOf, InvariantSource::TaskPrompt) + .with_value(YamlValue::Sequence(vec![ + YamlValue::String("xml".to_string()), + ])), + ); + + let compiled = compile_rulespec(&rulespec, "test-new-rules", 1).unwrap(); + + let mut envelope = ActionEnvelope::new(); + envelope.add_fact("feature", serde_yaml::from_str("capabilities: [a, b]\noutput_format: json").unwrap()); + let facts = extract_facts(&envelope, &compiled); + + let dl = format_datalog_program(&compiled, &facts); + + // Verify header + assert!(dl.contains("// Plan: test-new-rules")); + + // Verify not_contains rule + assert!(dl.contains("not_contains"), "Should contain not_contains rule comment"); + assert!(dl.contains("predicate_pass(0)")); + + // Verify any_of rule + assert!(dl.contains("any_of"), "Should contain any_of rule"); + assert!(dl.contains("predicate_pass(1)")); + + // Verify none_of rule + assert!(dl.contains("none_of"), "Should contain none_of rule"); + assert!(dl.contains("predicate_pass(2)")); + + // Verify failure derivation for all + assert!(dl.contains("predicate_fail(0) :- !predicate_pass(0).")); + assert!(dl.contains("predicate_fail(1) :- !predicate_pass(1).")); + assert!(dl.contains("predicate_fail(2) :- !predicate_pass(2).")); + } } diff --git a/crates/g3-core/src/tools/invariants.rs b/crates/g3-core/src/tools/invariants.rs index c6de19a..5a9bb59 100644 --- a/crates/g3-core/src/tools/invariants.rs +++ b/crates/g3-core/src/tools/invariants.rs @@ -103,6 +103,12 @@ pub enum PredicateRule { MaxLength, /// Value matches a regex pattern Matches, + /// Value does NOT contain the specified element (negation of Contains) + NotContains, + /// Value is one of the specified set of values + AnyOf, + /// Value is none of the specified set of values + NoneOf, } impl std::fmt::Display for PredicateRule { @@ -117,10 +123,40 @@ impl std::fmt::Display for PredicateRule { PredicateRule::MinLength => write!(f, "min_length"), PredicateRule::MaxLength => write!(f, "max_length"), PredicateRule::Matches => write!(f, "matches"), + PredicateRule::NotContains => write!(f, "not_contains"), + PredicateRule::AnyOf => write!(f, "any_of"), + PredicateRule::NoneOf => write!(f, "none_of"), } } } +/// A predicate defines a rule to evaluate against a claim's value. +#[derive(Debug, Clone, Serialize, Deserialize)] +pub struct WhenCondition { + /// Name of the claim to check for the condition + pub claim: String, + /// The rule to apply for the condition check + pub rule: PredicateRule, + /// Value to compare against (optional, depends on rule) + #[serde(default, skip_serializing_if = "Option::is_none")] + pub value: Option, +} + +impl WhenCondition { + pub fn new(claim: impl Into, rule: PredicateRule) -> Self { + Self { + claim: claim.into(), + rule, + value: None, + } + } + + pub fn with_value(mut self, value: YamlValue) -> Self { + self.value = Some(value); + self + } +} + /// A predicate defines a rule to evaluate against a claim's value. #[derive(Debug, Clone, Serialize, Deserialize)] pub struct Predicate { @@ -137,6 +173,10 @@ pub struct Predicate { /// Optional notes explaining the invariant or providing nuance #[serde(default, skip_serializing_if = "Option::is_none")] pub notes: Option, + /// Optional condition that must be met for this predicate to be evaluated. + /// If the condition is not met, the predicate is skipped (vacuous pass). + #[serde(default, skip_serializing_if = "Option::is_none")] + pub when: Option, } impl Predicate { @@ -151,6 +191,7 @@ impl Predicate { value: None, source, notes: None, + when: None, } } @@ -164,6 +205,11 @@ impl Predicate { self } + pub fn with_when(mut self, when: WhenCondition) -> Self { + self.when = Some(when); + self + } + /// Validate the predicate structure. pub fn validate(&self) -> Result<()> { if self.claim.trim().is_empty() { @@ -185,6 +231,25 @@ impl Predicate { } } + // Validate when condition if present + if let Some(when) = &self.when { + if when.claim.trim().is_empty() { + return Err(anyhow!("When condition claim reference cannot be empty")); + } + // When conditions that need a value must have one + match when.rule { + PredicateRule::Exists | PredicateRule::NotExists => {} + _ => { + if when.value.is_none() { + return Err(anyhow!( + "When condition rule '{}' requires a value", + when.rule + )); + } + } + } + } + Ok(()) } } @@ -233,6 +298,15 @@ impl Rulespec { predicate.claim )); } + // Validate when condition claim reference + if let Some(when) = &predicate.when { + if !claim_names.contains(&when.claim) { + return Err(anyhow!( + "When condition references unknown claim: {}", + when.claim + )); + } + } } Ok(()) @@ -481,14 +555,22 @@ pub fn evaluate_predicate( ) -> PredicateResult { match predicate.rule { PredicateRule::Exists => { - if selected_values.is_empty() { + // Filter out null values — null means "absent" + let non_null: Vec<_> = selected_values.iter() + .filter(|v| !v.is_null()) + .collect(); + if non_null.is_empty() { PredicateResult::fail("Value does not exist") } else { PredicateResult::pass("Value exists") } } PredicateRule::NotExists => { - if selected_values.is_empty() { + // Filter out null values — null means "absent" + let non_null: Vec<_> = selected_values.iter() + .filter(|v| !v.is_null()) + .collect(); + if non_null.is_empty() { PredicateResult::pass("Value does not exist as expected") } else { PredicateResult::fail("Value exists but should not") @@ -642,6 +724,65 @@ pub fn evaluate_predicate( } PredicateResult::fail(format!("No value matches pattern '{}'", pattern)) } + PredicateRule::NotContains => { + let target = match &predicate.value { + Some(v) => v, + None => return PredicateResult::fail("No value specified for not_contains"), + }; + + for value in selected_values { + if value_contains(value, target) { + return PredicateResult::fail(format!( + "Value contains {:?} but should not", + yaml_to_display(target) + )); + } + } + PredicateResult::pass(format!( + "Value does not contain {:?}", + yaml_to_display(target) + )) + } + PredicateRule::AnyOf => { + let allowed = match &predicate.value { + Some(YamlValue::Sequence(seq)) => seq, + Some(_) => return PredicateResult::fail("any_of requires an array value"), + None => return PredicateResult::fail("No value specified for any_of"), + }; + + for value in selected_values { + if allowed.contains(value) { + return PredicateResult::pass(format!( + "Value {:?} is in allowed set", + yaml_to_display(value) + )); + } + } + PredicateResult::fail(format!( + "Value is not in allowed set [{}]", + allowed.iter().map(yaml_to_display).collect::>().join(", ") + )) + } + PredicateRule::NoneOf => { + let forbidden = match &predicate.value { + Some(YamlValue::Sequence(seq)) => seq, + Some(_) => return PredicateResult::fail("none_of requires an array value"), + None => return PredicateResult::fail("No value specified for none_of"), + }; + + for value in selected_values { + if forbidden.contains(value) { + return PredicateResult::fail(format!( + "Value {:?} is in forbidden set", + yaml_to_display(value) + )); + } + } + PredicateResult::pass(format!( + "Value is not in forbidden set [{}]", + forbidden.iter().map(yaml_to_display).collect::>().join(", ") + )) + } } } @@ -785,6 +926,41 @@ pub fn evaluate_rulespec(rulespec: &Rulespec, envelope: &ActionEnvelope) -> Rule .collect(); for predicate in &rulespec.predicates { + // Check when condition — if present and not met, skip (vacuous pass) + if let Some(when) = &predicate.when { + let when_claim = claims.get(when.claim.as_str()); + let when_met = match when_claim { + Some(claim) => { + match Selector::parse(&claim.selector) { + Ok(selector) => { + let when_values = selector.select(&envelope_value); + let when_pred = Predicate { + claim: when.claim.clone(), + rule: when.rule.clone(), + value: when.value.clone(), + source: predicate.source, + notes: None, + when: None, + }; + evaluate_predicate(&when_pred, &when_values).passed + } + Err(_) => false, + } + } + None => false, + }; + if !when_met { + passed_count += 1; + predicate_results.push(PredicateEvaluation { + predicate: predicate.clone(), + claim_name: predicate.claim.clone(), + selected_values: vec![], + result: PredicateResult::pass("Skipped (when condition not met)"), + }); + continue; + } + } + let claim = claims.get(predicate.claim.as_str()); let (selected_values, result) = match claim { @@ -1446,4 +1622,411 @@ facts: assert_eq!(parsed.facts.get("breaking_changes").unwrap(), &YamlValue::Null); assert_eq!(parsed.facts.get("feature").unwrap(), &YamlValue::String("done".to_string())); } + + // ======================================================================== + // New Predicate Rules: NotContains, AnyOf, NoneOf + // ======================================================================== + + #[test] + fn test_predicate_not_contains_array_pass() { + let predicate = Predicate::new("test", PredicateRule::NotContains, InvariantSource::TaskPrompt) + .with_value(YamlValue::String("deprecated".to_string())); + + let array = YamlValue::Sequence(vec![ + YamlValue::String("handle_csv".to_string()), + YamlValue::String("handle_tsv".to_string()), + ]); + + let result = evaluate_predicate(&predicate, &[array]); + assert!(result.passed, "not_contains should pass when element is absent"); + } + + #[test] + fn test_predicate_not_contains_array_fail() { + let predicate = Predicate::new("test", PredicateRule::NotContains, InvariantSource::TaskPrompt) + .with_value(YamlValue::String("handle_csv".to_string())); + + let array = YamlValue::Sequence(vec![ + YamlValue::String("handle_csv".to_string()), + YamlValue::String("handle_tsv".to_string()), + ]); + + let result = evaluate_predicate(&predicate, &[array]); + assert!(!result.passed, "not_contains should fail when element is present"); + } + + #[test] + fn test_predicate_not_contains_string_pass() { + let predicate = Predicate::new("test", PredicateRule::NotContains, InvariantSource::TaskPrompt) + .with_value(YamlValue::String("xml".to_string())); + + let result = evaluate_predicate( + &predicate, + &[YamlValue::String("csv_importer".to_string())], + ); + assert!(result.passed, "not_contains should pass when substring is absent"); + } + + #[test] + fn test_predicate_not_contains_string_fail() { + let predicate = Predicate::new("test", PredicateRule::NotContains, InvariantSource::TaskPrompt) + .with_value(YamlValue::String("csv".to_string())); + + let result = evaluate_predicate( + &predicate, + &[YamlValue::String("csv_importer".to_string())], + ); + assert!(!result.passed, "not_contains should fail when substring is present"); + } + + #[test] + fn test_predicate_not_contains_empty_array() { + let predicate = Predicate::new("test", PredicateRule::NotContains, InvariantSource::TaskPrompt) + .with_value(YamlValue::String("anything".to_string())); + + let array = YamlValue::Sequence(vec![]); + let result = evaluate_predicate(&predicate, &[array]); + assert!(result.passed, "not_contains on empty array should pass"); + } + + #[test] + fn test_predicate_any_of_pass() { + let predicate = Predicate::new("test", PredicateRule::AnyOf, InvariantSource::TaskPrompt) + .with_value(YamlValue::Sequence(vec![ + YamlValue::String("json".to_string()), + YamlValue::String("yaml".to_string()), + YamlValue::String("toml".to_string()), + ])); + + let result = evaluate_predicate( + &predicate, + &[YamlValue::String("yaml".to_string())], + ); + assert!(result.passed, "any_of should pass when value is in set"); + } + + #[test] + fn test_predicate_any_of_fail() { + let predicate = Predicate::new("test", PredicateRule::AnyOf, InvariantSource::TaskPrompt) + .with_value(YamlValue::Sequence(vec![ + YamlValue::String("json".to_string()), + YamlValue::String("yaml".to_string()), + ])); + + let result = evaluate_predicate( + &predicate, + &[YamlValue::String("xml".to_string())], + ); + assert!(!result.passed, "any_of should fail when value is not in set"); + } + + #[test] + fn test_predicate_any_of_non_array_value_fails() { + let predicate = Predicate::new("test", PredicateRule::AnyOf, InvariantSource::TaskPrompt) + .with_value(YamlValue::String("not_an_array".to_string())); + + let result = evaluate_predicate( + &predicate, + &[YamlValue::String("anything".to_string())], + ); + assert!(!result.passed, "any_of with non-array value should fail"); + } + + #[test] + fn test_predicate_any_of_single_element() { + let predicate = Predicate::new("test", PredicateRule::AnyOf, InvariantSource::TaskPrompt) + .with_value(YamlValue::Sequence(vec![ + YamlValue::String("only".to_string()), + ])); + + let result = evaluate_predicate( + &predicate, + &[YamlValue::String("only".to_string())], + ); + assert!(result.passed, "any_of with single-element set should work"); + } + + #[test] + fn test_predicate_none_of_pass() { + let predicate = Predicate::new("test", PredicateRule::NoneOf, InvariantSource::TaskPrompt) + .with_value(YamlValue::Sequence(vec![ + YamlValue::String("xml".to_string()), + YamlValue::String("csv".to_string()), + ])); + + let result = evaluate_predicate( + &predicate, + &[YamlValue::String("json".to_string())], + ); + assert!(result.passed, "none_of should pass when value is not in forbidden set"); + } + + #[test] + fn test_predicate_none_of_fail() { + let predicate = Predicate::new("test", PredicateRule::NoneOf, InvariantSource::TaskPrompt) + .with_value(YamlValue::Sequence(vec![ + YamlValue::String("xml".to_string()), + YamlValue::String("csv".to_string()), + ])); + + let result = evaluate_predicate( + &predicate, + &[YamlValue::String("xml".to_string())], + ); + assert!(!result.passed, "none_of should fail when value is in forbidden set"); + } + + #[test] + fn test_predicate_none_of_non_array_value_fails() { + let predicate = Predicate::new("test", PredicateRule::NoneOf, InvariantSource::TaskPrompt) + .with_value(YamlValue::String("not_an_array".to_string())); + + let result = evaluate_predicate( + &predicate, + &[YamlValue::String("anything".to_string())], + ); + assert!(!result.passed, "none_of with non-array value should fail"); + } + + #[test] + fn test_predicate_none_of_empty_forbidden_set() { + let predicate = Predicate::new("test", PredicateRule::NoneOf, InvariantSource::TaskPrompt) + .with_value(YamlValue::Sequence(vec![])); + + let result = evaluate_predicate( + &predicate, + &[YamlValue::String("anything".to_string())], + ); + assert!(result.passed, "none_of with empty forbidden set should pass"); + } + + // ======================================================================== + // Null Handling Tests + // ======================================================================== + + #[test] + fn test_predicate_exists_fails_for_null() { + let predicate = Predicate::new("test", PredicateRule::Exists, InvariantSource::TaskPrompt); + + let result = evaluate_predicate(&predicate, &[YamlValue::Null]); + assert!(!result.passed, "exists should fail for null value"); + } + + #[test] + fn test_predicate_not_exists_passes_for_null() { + let predicate = Predicate::new("test", PredicateRule::NotExists, InvariantSource::TaskPrompt); + + let result = evaluate_predicate(&predicate, &[YamlValue::Null]); + assert!(result.passed, "not_exists should pass for null value"); + } + + #[test] + fn test_predicate_exists_passes_for_empty_string() { + let predicate = Predicate::new("test", PredicateRule::Exists, InvariantSource::TaskPrompt); + + let result = evaluate_predicate(&predicate, &[YamlValue::String(String::new())]); + assert!(result.passed, "exists should pass for empty string (not null)"); + } + + #[test] + fn test_predicate_exists_passes_for_empty_array() { + let predicate = Predicate::new("test", PredicateRule::Exists, InvariantSource::TaskPrompt); + + let result = evaluate_predicate(&predicate, &[YamlValue::Sequence(vec![])]); + assert!(result.passed, "exists should pass for empty array (not null)"); + } + + #[test] + fn test_predicate_contains_on_null_fails() { + let predicate = Predicate::new("test", PredicateRule::Contains, InvariantSource::TaskPrompt) + .with_value(YamlValue::String("x".to_string())); + + let result = evaluate_predicate(&predicate, &[YamlValue::Null]); + assert!(!result.passed, "contains on null should fail"); + } + + #[test] + fn test_predicate_exists_with_mixed_null_and_value() { + let predicate = Predicate::new("test", PredicateRule::Exists, InvariantSource::TaskPrompt); + + // If selected_values has both null and a real value, exists should pass + let result = evaluate_predicate( + &predicate, + &[YamlValue::Null, YamlValue::String("real".to_string())], + ); + assert!(result.passed, "exists should pass when at least one non-null value"); + } + + #[test] + fn test_predicate_not_exists_fails_with_mixed_null_and_value() { + let predicate = Predicate::new("test", PredicateRule::NotExists, InvariantSource::TaskPrompt); + + let result = evaluate_predicate( + &predicate, + &[YamlValue::Null, YamlValue::String("real".to_string())], + ); + assert!(!result.passed, "not_exists should fail when at least one non-null value"); + } + + // ======================================================================== + // When Condition Tests + // ======================================================================== + + #[test] + fn test_when_condition_met_evaluates_predicate() { + let mut rulespec = Rulespec::new(); + rulespec.add_claim(Claim::new("is_breaking", "api_changes.breaking")); + rulespec.add_claim(Claim::new("caps", "feature.capabilities")); + rulespec.add_predicate( + Predicate::new("caps", PredicateRule::MinLength, InvariantSource::TaskPrompt) + .with_value(YamlValue::Number(2.into())) + .with_when(WhenCondition::new("is_breaking", PredicateRule::Equals) + .with_value(YamlValue::Bool(true))) + ); + + let mut envelope = ActionEnvelope::new(); + envelope.add_fact("api_changes", serde_yaml::from_str("breaking: true").unwrap()); + envelope.add_fact("feature", serde_yaml::from_str("capabilities: [a, b, c]").unwrap()); + + let eval = evaluate_rulespec(&rulespec, &envelope); + assert_eq!(eval.passed_count, 1); + assert_eq!(eval.failed_count, 0); + } + + #[test] + fn test_when_condition_not_met_vacuous_pass() { + let mut rulespec = Rulespec::new(); + rulespec.add_claim(Claim::new("is_breaking", "api_changes.breaking")); + rulespec.add_claim(Claim::new("caps", "feature.capabilities")); + rulespec.add_predicate( + Predicate::new("caps", PredicateRule::MinLength, InvariantSource::TaskPrompt) + .with_value(YamlValue::Number(100.into())) // would fail if evaluated + .with_when(WhenCondition::new("is_breaking", PredicateRule::Equals) + .with_value(YamlValue::Bool(true))) + ); + + let mut envelope = ActionEnvelope::new(); + envelope.add_fact("api_changes", serde_yaml::from_str("breaking: false").unwrap()); + envelope.add_fact("feature", serde_yaml::from_str("capabilities: [a]").unwrap()); + + let eval = evaluate_rulespec(&rulespec, &envelope); + assert_eq!(eval.passed_count, 1, "When not met should be vacuous pass"); + assert_eq!(eval.failed_count, 0); + assert!(eval.predicate_results[0].result.reason.contains("Skipped")); + } + + #[test] + fn test_when_condition_with_exists() { + let mut rulespec = Rulespec::new(); + rulespec.add_claim(Claim::new("has_tests", "feature.tests")); + rulespec.add_claim(Claim::new("coverage", "feature.coverage")); + rulespec.add_predicate( + Predicate::new("coverage", PredicateRule::GreaterThan, InvariantSource::TaskPrompt) + .with_value(YamlValue::Number(80.into())) + .with_when(WhenCondition::new("has_tests", PredicateRule::Exists)) + ); + + // No tests field → when condition not met → vacuous pass + let mut envelope = ActionEnvelope::new(); + envelope.add_fact("feature", serde_yaml::from_str("coverage: 50").unwrap()); + + let eval = evaluate_rulespec(&rulespec, &envelope); + assert_eq!(eval.passed_count, 1, "When exists not met should skip"); + assert_eq!(eval.failed_count, 0); + } + + #[test] + fn test_when_unknown_claim_fails_validation() { + let mut rulespec = Rulespec::new(); + rulespec.add_claim(Claim::new("caps", "feature.capabilities")); + rulespec.add_predicate( + Predicate::new("caps", PredicateRule::Exists, InvariantSource::TaskPrompt) + .with_when(WhenCondition::new("nonexistent", PredicateRule::Exists)) + ); + + let result = rulespec.validate(); + assert!(result.is_err(), "When referencing unknown claim should fail validation"); + assert!(result.unwrap_err().to_string().contains("unknown claim")); + } + + #[test] + fn test_predicate_without_when_always_evaluated() { + // Backward compatibility: no when field means always evaluated + let mut rulespec = Rulespec::new(); + rulespec.add_claim(Claim::new("caps", "feature.capabilities")); + rulespec.add_predicate( + Predicate::new("caps", PredicateRule::Exists, InvariantSource::TaskPrompt) + ); + + let mut envelope = ActionEnvelope::new(); + envelope.add_fact("feature", serde_yaml::from_str("capabilities: [a]").unwrap()); + + let eval = evaluate_rulespec(&rulespec, &envelope); + assert_eq!(eval.passed_count, 1); + assert_eq!(eval.failed_count, 0); + } + + #[test] + fn test_when_condition_validate_requires_value() { + let when = WhenCondition::new("test", PredicateRule::Equals); + // No value set — validation should catch this + let predicate = Predicate::new("test", PredicateRule::Exists, InvariantSource::TaskPrompt) + .with_when(when); + let result = predicate.validate(); + assert!(result.is_err(), "When condition with equals but no value should fail"); + } + + #[test] + fn test_when_condition_exists_no_value_ok() { + let when = WhenCondition::new("test", PredicateRule::Exists); + let predicate = Predicate::new("test", PredicateRule::Exists, InvariantSource::TaskPrompt) + .with_when(when); + let result = predicate.validate(); + assert!(result.is_ok(), "When condition with exists and no value should be ok"); + } + + #[test] + fn test_evaluate_rulespec_with_new_rules_full() { + let mut rulespec = Rulespec::new(); + rulespec.add_claim(Claim::new("caps", "feature.capabilities")); + rulespec.add_claim(Claim::new("format", "feature.output_format")); + rulespec.add_claim(Claim::new("breaking", "breaking_changes")); + + // not_contains: must not have deprecated + rulespec.add_predicate( + Predicate::new("caps", PredicateRule::NotContains, InvariantSource::TaskPrompt) + .with_value(YamlValue::String("deprecated".to_string())) + ); + // any_of: format must be json or yaml + rulespec.add_predicate( + Predicate::new("format", PredicateRule::AnyOf, InvariantSource::TaskPrompt) + .with_value(YamlValue::Sequence(vec![ + YamlValue::String("json".to_string()), + YamlValue::String("yaml".to_string()), + ])) + ); + // none_of: format must not be xml or csv + rulespec.add_predicate( + Predicate::new("format", PredicateRule::NoneOf, InvariantSource::TaskPrompt) + .with_value(YamlValue::Sequence(vec![ + YamlValue::String("xml".to_string()), + YamlValue::String("csv".to_string()), + ])) + ); + // not_exists: no breaking changes + rulespec.add_predicate( + Predicate::new("breaking", PredicateRule::NotExists, InvariantSource::TaskPrompt) + ); + + let mut envelope = ActionEnvelope::new(); + envelope.add_fact("feature", serde_yaml::from_str(r#" + capabilities: [handle_csv, handle_tsv] + output_format: json + "#).unwrap()); + envelope.add_fact("breaking_changes", YamlValue::Null); + + let eval = evaluate_rulespec(&rulespec, &envelope); + assert_eq!(eval.passed_count, 4, "All 4 predicates should pass"); + assert_eq!(eval.failed_count, 0); + } } diff --git a/prompts/schemas/rulespec.schema.md b/prompts/schemas/rulespec.schema.md new file mode 100644 index 0000000..d477774 --- /dev/null +++ b/prompts/schemas/rulespec.schema.md @@ -0,0 +1,402 @@ +# Rulespec YAML Schema + +> Canonical reference for `analysis/rulespec.yaml` — the machine-readable invariant specification. + +## Overview + +A rulespec defines **claims** (selectors into the action envelope) and **predicates** (rules that evaluate those claims). When an agent completes work, it writes an **action envelope** via `write_envelope`, and the rulespec is evaluated against it using datalog verification. + +## File Location + +``` +analysis/rulespec.yaml # checked into the repo +``` + +## Top-Level Structure + +```yaml +claims: + - name: # unique identifier for this claim + selector: # path into the action envelope + +predicates: + - claim: # references a claim by name + rule: # one of the 12 rule types below + value: # required for most rules (see table) + source: # task_prompt | memory + notes: # optional human-readable explanation + when: # optional conditional trigger + claim: # references a claim by name + rule: # condition rule + value: # condition value (if needed) +``` + +## Claims + +A claim is a named selector that extracts values from the action envelope. + +| Field | Type | Required | Description | +|-------|------|----------|-------------| +| `name` | string | ✅ | Unique identifier (referenced by predicates) | +| `selector` | string | ✅ | Dot-notation path into the envelope | + +### Selector Syntax + +| Syntax | Meaning | Example | +|--------|---------|--------| +| `foo.bar` | Nested field access | `csv_importer.capabilities` | +| `foo[0]` | Array index (0-based) | `tests[0].name` | +| `foo[*]` | All array elements (wildcard) | `items[*].id` | +| `foo.bar.baz` | Deep nesting | `api.endpoints.count` | + +**Important**: Selectors operate on the envelope's `facts` content directly. Do NOT include a `facts.` prefix in selectors — the system handles this automatically. + +```yaml +# ✅ Correct +claims: + - name: caps + selector: csv_importer.capabilities + +# ❌ Wrong — don't prefix with "facts." +claims: + - name: caps + selector: facts.csv_importer.capabilities +``` + +## Predicate Rules + +### Rule Types Reference + +| Rule | Value Required | Value Type | Description | +|------|---------------|------------|-------------| +| `exists` | ❌ | — | Value exists and is not null | +| `not_exists` | ❌ | — | Value is null or missing | +| `equals` | ✅ | any | Value equals exactly | +| `contains` | ✅ | any | Array contains element, or string contains substring | +| `not_contains` | ✅ | any | Negation of `contains` | +| `any_of` | ✅ | array | Value is one of the specified set | +| `none_of` | ✅ | array | Value is none of the specified set | +| `greater_than` | ✅ | number | Numeric comparison | +| `less_than` | ✅ | number | Numeric comparison | +| `min_length` | ✅ | number | Array has at least N elements | +| `max_length` | ✅ | number | Array has at most N elements | +| `matches` | ✅ | string | Value matches a regex pattern | + +### Existence Rules + +```yaml +# Value must exist (not null, not missing) +predicates: + - claim: feature_file + rule: exists + source: task_prompt + +# Value must NOT exist (null or missing) +predicates: + - claim: breaking_changes + rule: not_exists + source: task_prompt + notes: "No breaking changes allowed" +``` + +### Equality + +```yaml +predicates: + - claim: api_breaking + rule: equals + value: false + source: task_prompt +``` + +### Containment + +```yaml +# Array contains an element +predicates: + - claim: capabilities + rule: contains + value: "handle_csv" + source: task_prompt + +# Array must NOT contain an element +predicates: + - claim: capabilities + rule: not_contains + value: "deprecated_feature" + source: task_prompt +``` + +### Set Membership + +```yaml +# Value must be one of these +predicates: + - claim: output_format + rule: any_of + value: [json, yaml, toml] + source: task_prompt + +# Value must NOT be any of these +predicates: + - claim: output_format + rule: none_of + value: [xml, csv] + source: task_prompt +``` + +### Numeric Comparisons + +```yaml +predicates: + - claim: test_count + rule: greater_than + value: 0 + source: task_prompt + notes: "Must have at least one test" + + - claim: error_rate + rule: less_than + value: 5 + source: memory +``` + +### Array Length + +```yaml +predicates: + - claim: capabilities + rule: min_length + value: 2 + source: task_prompt + + - claim: dependencies + rule: max_length + value: 10 + source: memory + notes: "Keep dependency count manageable" +``` + +### Regex Matching + +```yaml +predicates: + - claim: file_path + rule: matches + value: "^src/.*\\.rs$" + source: task_prompt + notes: "File must be a Rust source file in src/" +``` + +## Conditional Predicates (`when`) + +Predicates can have an optional `when` condition. If the condition is **not met**, the predicate is **skipped** (vacuous pass) — it does not fail. + +This is useful for rules that only apply in certain contexts. + +### When Condition Structure + +```yaml +when: + claim: # references a defined claim + rule: # any predicate rule type + value: # optional, depends on rule +``` + +### Examples + +```yaml +# Only enforce endpoint count when there are breaking changes +predicates: + - claim: api_endpoints + rule: min_length + value: 3 + source: task_prompt + when: + claim: is_breaking + rule: equals + value: true + notes: "Breaking changes must document all endpoints" + +# Only check test coverage when tests exist +predicates: + - claim: coverage_percent + rule: greater_than + value: 80 + source: memory + when: + claim: has_tests + rule: exists + +# Only enforce format when feature is present +predicates: + - claim: output_format + rule: any_of + value: [json, yaml] + source: task_prompt + when: + claim: has_output + rule: exists +``` + +### When with Regex Matching + +```yaml +# Only require reply_to_message_id when subject starts with "Re: " +predicates: + - claim: reply_to_id + rule: exists + source: task_prompt + when: + claim: subject_line + rule: matches + value: "^Re: " + notes: Reply emails must have reply_to_message_id set +``` + +## Null Handling + +Null values in the action envelope have specific semantics: + +- **`null` is treated as absent** — `exists` returns false, `not_exists` returns true +- This applies to both the invariants evaluator and the datalog compiler +- A fact with value `null` produces **no datalog facts** (it is skipped entirely) + +### Common Pattern: Asserting Absence + +```yaml +# In the envelope: +facts: + breaking_changes: null # explicitly absent + +# In the rulespec: +claims: + - name: breaking + selector: breaking_changes +predicates: + - claim: breaking + rule: not_exists + source: task_prompt + notes: "No breaking changes" # ✅ This passes +``` + +### Edge Cases + +| Envelope Value | `exists` | `not_exists` | `contains "x"` | `equals "y"` | +|---------------|----------|-------------|----------------|-------------| +| `null` | ❌ fail | ✅ pass | ❌ fail | ❌ fail | +| missing key | ❌ fail | ✅ pass | ❌ fail | ❌ fail | +| `""` (empty string) | ✅ pass | ❌ fail | ❌ fail | ❌ fail | +| `[]` (empty array) | ✅ pass | ❌ fail | ❌ fail | ❌ fail | +| `0` | ✅ pass | ❌ fail | ❌ fail | depends | + +## Action Envelope Format + +The action envelope is written via the `write_envelope` tool. It must have a top-level `facts:` key. + +```yaml +facts: + feature_name: + capabilities: [cap_a, cap_b] + file: "src/feature.rs" + tests: ["test_a", "test_b"] + api_changes: + breaking: false + breaking_changes: null # Use null to assert absence +``` + +### Rules for Envelope Facts + +1. **Must have `facts:` top-level key** — without it, the envelope is empty +2. **Use file paths as evidence** — `"src/foo.rs"`, `"src/foo.rs:42"` +3. **Use `null` for explicit absence** — triggers `not_exists` predicates +4. **Arrays for lists** — capabilities, tests, endpoints +5. **Nested objects for grouping** — `feature.capabilities`, `feature.file` + +## Complete Example + +```yaml +# analysis/rulespec.yaml +claims: + - name: caps + selector: csv_importer.capabilities + - name: file + selector: csv_importer.file + - name: tests + selector: csv_importer.tests + - name: breaking + selector: api_changes.breaking + - name: no_breaking + selector: breaking_changes + +predicates: + # Must have capabilities + - claim: caps + rule: exists + source: task_prompt + + # Must include handle_csv + - claim: caps + rule: contains + value: "handle_csv" + source: task_prompt + + # Must NOT include deprecated features + - claim: caps + rule: not_contains + value: "legacy_parser" + source: memory + + # At least 2 capabilities + - claim: caps + rule: min_length + value: 2 + source: task_prompt + + # File must be a Rust source + - claim: file + rule: matches + value: "^src/.*\\.rs$" + source: task_prompt + + # Must have tests + - claim: tests + rule: min_length + value: 1 + source: task_prompt + + # No breaking changes + - claim: no_breaking + rule: not_exists + source: task_prompt + + # If breaking, must document it + - claim: caps + rule: contains + value: "migration_guide" + source: task_prompt + when: + claim: breaking + rule: equals + value: true + notes: "Breaking changes require a migration guide capability" +``` + +## Verification Pipeline + +1. Agent calls `write_envelope` with facts YAML +2. System writes `envelope.yaml` to session directory +3. System reads `analysis/rulespec.yaml` from working directory +4. Rulespec is compiled into datalog relations +5. Facts are extracted from envelope using claim selectors +6. Datalog rules are executed to fixed point +7. Results are written to `rulespec.compiled.dl` and `datalog_evaluation.txt` +8. Summary is returned to the agent +9. On plan completion, `plan_verify()` checks that the envelope exists + +## Source Types + +| Source | Meaning | +|--------|---------| +| `task_prompt` | Invariant derived from the user's task description | +| `memory` | Invariant derived from workspace memory (AGENTS.md, memory.md) | diff --git a/prompts/system/native.md b/prompts/system/native.md index cf0c0ba..12247ff 100644 --- a/prompts/system/native.md +++ b/prompts/system/native.md @@ -81,6 +81,7 @@ When marking done, add `evidence` and `notes` to the item. Before marking the last plan item done, call `write_envelope` with facts about completed work. The envelope captures what was actually built so it can be verified against invariants in `analysis/rulespec.yaml` if present. The tool writes the envelope and runs datalog verification automatically. ```yaml +type: code_change facts: csv_importer: capabilities: [handle_headers, handle_tsv, handle_quoted_fields]