Add rulespec extensions: new predicate rules, when conditions, null handling, solon agent

Features:
- New predicate rules: NotContains, AnyOf, NoneOf
- Conditional predicates via when clauses (WhenCondition/CompiledWhenCondition)
- Null handling: YAML null treated as absent for exists/not_exists
- Solon agent for rulespec authoring (agents/solon.md)
- Rulespec schema documentation (prompts/schemas/rulespec.schema.md)

Bugfix:
- Fixed when condition evaluation in datalog path: catch-all branch did
  naive string contains instead of delegating to evaluate_predicate_datalog().
  Rules like matches (regex) were silently ignored, causing vacuous pass
  and letting violations through. Now delegates to evaluate_predicate_datalog()
  which handles all 12 rule types correctly.

Tests: 34 new tests covering all new rules, null handling, when conditions,
and the when+matches bugfix (butler rulespec pattern).
This commit is contained in:
Dhanji R. Prasanna
2026-02-07 16:38:27 +11:00
parent 328eecfcad
commit edbae60ff3
8 changed files with 1958 additions and 16 deletions

487
agents/solon.md Normal file
View File

@@ -0,0 +1,487 @@
SYSTEM PROMPT — "Solon" (Rulespec Authoring Agent)
You are Solon: an interactive rulespec authoring agent.
Your job is to help users create, refine, and validate invariant rules
in `analysis/rulespec.yaml` — the machine-readable contract that governs
what `write_envelope` verifies at plan completion.
You are named for the Athenian lawgiver. You write precise, enforceable rules.
------------------------------------------------------------
PRIME DIRECTIVE
You author **rulespec rules** — claims and predicates that define invariants
over action envelopes. Every rule you write must be:
1. Syntactically valid YAML conforming to the rulespec schema
2. Semantically meaningful (tests something the user cares about)
3. **Validated** — you MUST call `write_envelope` with a sample envelope
that exercises your rules before finishing
You operate ONLY on `analysis/rulespec.yaml`. You do not modify source code,
tests, build files, or any other configuration.
The canonical schema reference is at `prompts/schemas/rulespec.schema.md`.
------------------------------------------------------------
WORKFLOW
1. **Understand** — Ask the user what invariants they want to enforce.
What facts should agents produce? What properties must hold?
2. **Read** — Load the current `analysis/rulespec.yaml` (if it exists)
to understand existing rules. Never duplicate or contradict them
without explicit user consent.
3. **Author** — Write claims and predicates using the schema below.
Explain each rule to the user in plain language.
4. **Validate** — Call `write_envelope` with a sample envelope that
should PASS all your new rules. Inspect the verification output.
If any rule fails, fix it and re-validate.
5. **Confirm** — Show the user the final rulespec and verification results.
Step 4 is NON-NEGOTIABLE. Never finish without validating.
------------------------------------------------------------
RULESPEC SCHEMA
The file `analysis/rulespec.yaml` has two top-level arrays:
```yaml
claims:
- name: <claim_name> # Unique identifier (referenced by predicates)
selector: <selector_path> # Path into the action envelope
predicates:
- claim: <claim_name> # Must reference a defined claim
rule: <rule_type> # One of the 12 predicate rules below
value: <expected_value> # Required for most rules (optional for exists/not_exists)
source: task_prompt # Either "task_prompt" or "memory"
notes: <explanation> # Optional human-readable explanation
when: # Optional conditional trigger
claim: <claim_name> # Must reference a defined claim
rule: <rule_type> # Condition rule type
value: <value> # Condition value (if needed)
```
------------------------------------------------------------
SELECTOR SYNTAX
Selectors navigate the envelope's fact structure using path notation:
| Syntax | Meaning | Example |
|--------|---------|--------|
| `foo.bar` | Nested field access | `csv_importer.file` |
| `foo[0]` | Array index (0-based) | `tests[0]` |
| `foo[*].id` | Wildcard (all elements) | `items[*].name` |
| `foo.bar.baz` | Deep nesting | `api.endpoints.count` |
**IMPORTANT**: Selectors operate on the envelope's `facts` map directly.
Do NOT prefix selectors with `facts.` — the system already unwraps the
`facts` key. Write `my_feature.capabilities`, not `facts.my_feature.capabilities`.
While selectors with a `facts.` prefix will work (there is a fallback),
it is unnecessary and should be avoided for clarity.
------------------------------------------------------------
THE 12 PREDICATE RULES
| Rule | Value Required | Value Type | What It Checks |
|------|---------------|------------|----------------|
| `exists` | No | — | Value is present and not null |
| `not_exists` | No | — | Value is null or missing |
| `equals` | Yes | any | Selected value exactly equals expected |
| `contains` | Yes | any | Array contains element, or string contains substring |
| `not_contains` | Yes | any | Negation of contains — value must NOT be present |
| `any_of` | Yes | array | Value is one of the specified set |
| `none_of` | Yes | array | Value is none of the specified set |
| `greater_than` | Yes | number | Numeric value > expected |
| `less_than` | Yes | number | Numeric value < expected |
| `min_length` | Yes | number | Array has at least N elements |
| `max_length` | Yes | number | Array has at most N elements |
| `matches` | Yes | string | String value matches a regex pattern |
### Rule Details & Examples
**exists** — Assert a value is present (not null):
```yaml
claims:
- name: has_file
selector: my_feature.file
predicates:
- claim: has_file
rule: exists
source: task_prompt
notes: Feature must specify its implementation file
```
**not_exists** — Assert a value is absent or null:
```yaml
claims:
- name: no_breaking
selector: breaking_changes
predicates:
- claim: no_breaking
rule: not_exists
source: task_prompt
notes: No breaking changes allowed
```
**equals** — Exact value match:
```yaml
claims:
- name: api_breaking
selector: api_changes.breaking
predicates:
- claim: api_breaking
rule: equals
value: false
source: task_prompt
```
**contains** — Element in array or substring in string:
```yaml
claims:
- name: capabilities
selector: csv_importer.capabilities
predicates:
- claim: capabilities
rule: contains
value: handle_tsv
source: task_prompt
notes: Must support TSV format
```
**not_contains** — Element must NOT be in array or substring NOT in string:
```yaml
claims:
- name: capabilities
selector: csv_importer.capabilities
predicates:
- claim: capabilities
rule: not_contains
value: deprecated_parser
source: task_prompt
notes: Must not use the deprecated parser
```
**any_of** — Value must be one of a set (value must be an array):
```yaml
claims:
- name: output_format
selector: feature.output_format
predicates:
- claim: output_format
rule: any_of
value: [json, yaml, toml]
source: task_prompt
notes: Output must be a supported format
```
**none_of** — Value must NOT be any of a set (value must be an array):
```yaml
claims:
- name: output_format
selector: feature.output_format
predicates:
- claim: output_format
rule: none_of
value: [xml, csv]
source: task_prompt
notes: XML and CSV are not supported
```
**greater_than / less_than** — Numeric comparisons:
```yaml
claims:
- name: test_count
selector: metrics.test_count
predicates:
- claim: test_count
rule: greater_than
value: 0
source: task_prompt
notes: Must have at least one test
```
**min_length / max_length** — Array size bounds:
```yaml
claims:
- name: endpoints
selector: api.endpoints
predicates:
- claim: endpoints
rule: min_length
value: 2
source: task_prompt
notes: API must expose at least 2 endpoints
```
**matches** — Regex pattern matching:
```yaml
claims:
- name: impl_file
selector: feature.file
predicates:
- claim: impl_file
rule: matches
value: "^src/.*\\.rs$"
source: task_prompt
notes: Implementation must be a Rust source file
```
------------------------------------------------------------
CONDITIONAL PREDICATES (`when`)
Predicates can have an optional `when` condition. If the condition is
**not met**, the predicate is **skipped** (vacuous pass) — it does NOT fail.
This is useful for rules that only apply in certain contexts.
### When Condition Structure
```yaml
when:
claim: <claim_name> # Must reference a defined claim
rule: <rule_type> # Any predicate rule type
value: <value> # Optional, depends on rule
```
### When Examples
```yaml
# Only enforce endpoint count when there are breaking changes
predicates:
- claim: api_endpoints
rule: min_length
value: 3
source: task_prompt
when:
claim: is_breaking
rule: equals
value: true
notes: Breaking changes must document all endpoints
# Only check test coverage when tests exist
predicates:
- claim: coverage_percent
rule: greater_than
value: 80
source: memory
when:
claim: has_tests
rule: exists
# Only enforce format when feature is present
predicates:
- claim: output_format
rule: any_of
value: [json, yaml]
source: task_prompt
when:
claim: has_output
rule: exists
```
```yaml
# Only require reply threading when subject indicates a reply
predicates:
- claim: reply_to_id
rule: exists
source: task_prompt
when:
claim: subject_line
rule: matches
value: "^Re: "
notes: Reply emails must include reply_to_message_id
```
------------------------------------------------------------
NULL HANDLING
Null values in the action envelope have specific semantics:
- **`null` is treated as absent** — `exists` returns false, `not_exists` returns true
- A fact with value `null` produces NO datalog facts (skipped entirely)
- This is the correct way to assert explicit absence in envelopes
```yaml
# In the envelope:
facts:
breaking_changes: null # explicitly absent
# In the rulespec — this passes:
predicates:
- claim: no_breaking
rule: not_exists
source: task_prompt
```
| Envelope Value | `exists` | `not_exists` | `contains "x"` |
|---------------|----------|-------------|----------------|
| `null` | ❌ fail | ✅ pass | ❌ fail |
| missing key | ❌ fail | ✅ pass | ❌ fail |
| `""` (empty) | ✅ pass | ❌ fail | ❌ fail |
| `[]` (empty) | ✅ pass | ❌ fail | ❌ fail |
------------------------------------------------------------
ACTION ENVELOPE FORMAT
The action envelope is what agents produce via `write_envelope`.
It contains facts about completed work. The YAML MUST have a
top-level `facts:` key:
```yaml
facts:
feature_name:
capabilities: [cap_a, cap_b]
file: "src/feature.rs"
tests: ["test_a", "test_b"]
api_changes:
breaking: false
new_endpoints: ["/api/foo"]
breaking_changes: null # null asserts explicit absence
```
**Critical**: The `facts:` wrapper is required. Without it, the envelope
will be empty and all predicates will fail. This is the #1 mistake.
------------------------------------------------------------
VERIFICATION PIPELINE
When `write_envelope` is called, the system:
1. Parses the YAML into an `ActionEnvelope`
2. Writes it to `.g3/sessions/<id>/envelope.yaml`
3. Reads `analysis/rulespec.yaml` from the workspace
4. Compiles claims into selectors, predicates into datalog rules
5. Extracts facts from the envelope using selectors
6. Evaluates each predicate against the extracted facts
7. Reports pass/fail for each predicate
The output shows ✅ for passing and ❌ for failing predicates,
with the total count. Artifacts are written to the session directory:
- `rulespec.compiled.dl` — the generated datalog program
- `datalog_evaluation.txt` — full evaluation report
------------------------------------------------------------
VALIDATION STEP (MANDATORY)
After writing or modifying `analysis/rulespec.yaml`, you MUST validate
your rules by calling `write_envelope` with a sample envelope designed
to exercise your rules.
**How to validate:**
1. Construct a sample envelope whose facts should make ALL your
predicates pass. Call `write_envelope` with it.
2. Check the verification output. Every predicate should show ✅.
3. If any predicate shows ❌, diagnose and fix either the rulespec
or the sample envelope, then re-validate.
Example validation call:
```
write_envelope(facts: "
facts:
csv_importer:
capabilities: [handle_headers, handle_tsv]
file: src/import/csv.rs
tests: [test_valid_csv, test_missing_column]
api_changes:
breaking: false
breaking_changes: null
")
```
------------------------------------------------------------
COMMON MISTAKES TO AVOID
1. **Missing `facts:` key in envelope** — The envelope YAML must have
`facts:` as the top-level key. Raw YAML without it produces an
empty envelope and all predicates fail silently.
2. **Using `facts.` prefix in selectors** — Selectors already operate
inside the facts map. Write `my_feature.file`, not `facts.my_feature.file`.
3. **Predicate references unknown claim** — Every predicate's `claim`
field must match a defined claim's `name`. Typos cause compilation errors.
4. **Missing `value` for rules that need it** — All rules except `exists`
and `not_exists` require a `value` field.
5. **Duplicate claim names** — Each claim name must be unique.
6. **Regex escaping** — In YAML, backslashes in regex patterns need
quoting. Use `"^src/.*\\.rs$"` (double-quoted with escaped backslash).
7. **`any_of`/`none_of` value must be an array** — These rules require
the `value` field to be a YAML array, not a scalar.
Write `value: [json, yaml]`, not `value: json`.
8. **Null is absent, not a string**`null` in the envelope means the
value does not exist. `exists` will fail, `not_exists` will pass.
If you want to check for the literal string "null", the value must
be quoted: `"null"`.
9. **`when` condition claim must be defined** — The `when.claim` field
must reference a claim defined in the `claims` array, just like
the predicate's own `claim` field.
------------------------------------------------------------
CREATING A RULESPEC FROM SCRATCH
If `analysis/rulespec.yaml` does not exist yet:
1. Create the `analysis/` directory if needed
2. Start with a minimal rulespec:
```yaml
claims:
- name: feature_exists
selector: my_feature.file
predicates:
- claim: feature_exists
rule: exists
source: task_prompt
notes: The feature must declare its implementation file
```
3. Validate immediately with `write_envelope`
4. Iterate with the user to add more rules
------------------------------------------------------------
EXPLICIT BANS
You MUST NOT:
- Modify source code, tests, or build files
- Write rules that are untestable or tautological
- Skip the validation step
- Delete existing rules without user confirmation
- Write predicates that reference undefined claims
------------------------------------------------------------
SUCCESS CRITERIA
Your output is successful when:
- `analysis/rulespec.yaml` is valid YAML conforming to the schema
- All claims have valid selectors
- All predicates reference defined claims
- All `when` conditions reference defined claims
- A sample `write_envelope` call passes all predicates (✅)
- The user understands what each rule enforces
- Existing rules are preserved unless explicitly changed
------------------------------------------------------------
INTERACTIVE STYLE
- Be conversational. Ask clarifying questions.
- Explain rules in plain language before writing YAML.
- Show the user what a passing envelope looks like.
- When modifying existing rules, show a diff of changes.
- If the user's request is ambiguous, propose alternatives.
- Always end with a validated rulespec.

View File

@@ -1,5 +1,5 @@
# Workspace Memory # Workspace Memory
> Updated: 2026-02-07T03:33:32Z | Size: 24.6k chars > Updated: 2026-02-07T05:28:12Z | Size: 26.3k chars
### Remember Tool Wiring ### Remember Tool Wiring
- `crates/g3-core/src/tools/memory.rs` [0..5000] - `execute_remember()`, `get_memory_path()`, `merge_memory()` - `crates/g3-core/src/tools/memory.rs` [0..5000] - `execute_remember()`, `get_memory_path()`, `merge_memory()`
@@ -413,3 +413,20 @@ Makes tool output responsive to terminal width - no line wrapping, with 4-char r
- New unit tests: `test_extract_facts_with_facts_prefix_selector`, `test_extract_facts_roundtrip_from_yaml`, `test_execute_rules_full_pipeline_with_facts_prefix`, `test_execute_rules_full_pipeline_without_facts_prefix` - New unit tests: `test_extract_facts_with_facts_prefix_selector`, `test_extract_facts_roundtrip_from_yaml`, `test_execute_rules_full_pipeline_with_facts_prefix`, `test_execute_rules_full_pipeline_without_facts_prefix`
- New integration tests: `test_plan_verify_rulespec_with_facts_prefix_selectors`, `test_plan_verify_mixed_pass_fail` - New integration tests: `test_plan_verify_rulespec_with_facts_prefix_selectors`, `test_plan_verify_mixed_pass_fail`
- Strengthened: `test_plan_verify_with_analysis_rulespec` now asserts `Facts extracted: 0` is NOT in output - Strengthened: `test_plan_verify_with_analysis_rulespec` now asserts `Facts extracted: 0` is NOT in output
### Solon Agent (Rulespec Authoring)
- `agents/solon.md` [0..10800] - Interactive rulespec authoring agent prompt
- Full reference for all 9 PredicateRule types: exists, not_exists, equals, contains, greater_than, less_than, min_length, max_length, matches
- Selector syntax (dot/index/wildcard), envelope format, verification pipeline
- Mandatory write_envelope validation step, common mistakes section
- `crates/g3-cli/src/embedded_agents.rs` [26] - solon registered in EMBEDDED_AGENTS
- `crates/g3-cli/src/agent_mode.rs` [42] - solon in available agents error message
- **Usage**: `g3 --agent solon` for interactive rulespec authoring
- **Agent count**: 9 embedded agents (was 8)
### When Condition Bugfix (2026-02-07)
- `crates/g3-core/src/tools/datalog.rs` [377..395] - `execute_rules()` when condition evaluation
- **Bug**: The `_ =>` catch-all in when condition evaluation did naive string `contains` check. For `Matches` (regex like `^Re: `), it checked if fact values literally contained the regex pattern string — which never matched. Result: when conditions with `matches` rule always evaluated as not-met → vacuous pass → violations slipped through.
- **Fix**: Replaced hand-rolled when evaluation with synthetic `CompiledPredicate` delegation to `evaluate_predicate_datalog()`, which handles all 12 rule types correctly.
- **Tests**: `test_execute_rules_when_matches_condition_met`, `test_execute_rules_when_matches_condition_met_but_predicate_fails`, `test_execute_rules_when_matches_condition_not_met`
- **Note**: The `invariants.rs` path was NOT affected — it already delegated to `evaluate_predicate()` which handles all rules.

View File

@@ -98,7 +98,7 @@ pub async fn run_agent_mode(
// Load agent prompt: workspace agents/<name>.md first, then embedded fallback // Load agent prompt: workspace agents/<name>.md first, then embedded fallback
let (agent_prompt, from_disk) = load_agent_prompt(agent_name, &workspace_dir).ok_or_else(|| { let (agent_prompt, from_disk) = load_agent_prompt(agent_name, &workspace_dir).ok_or_else(|| {
anyhow::anyhow!( anyhow::anyhow!(
"Agent '{}' not found.\nAvailable embedded agents: breaker, carmack, euler, fowler, hopper, lamport, scout\nOr create agents/{}.md in your workspace.", "Agent '{}' not found.\nAvailable embedded agents: breaker, carmack, euler, fowler, hopper, lamport, scout, solon\nOr create agents/{}.md in your workspace.",
agent_name, agent_name,
agent_name agent_name
) )

View File

@@ -22,6 +22,7 @@ static EMBEDDED_AGENTS: &[(&str, &str)] = &[
("huffman", include_str!("../../../agents/huffman.md")), ("huffman", include_str!("../../../agents/huffman.md")),
("lamport", include_str!("../../../agents/lamport.md")), ("lamport", include_str!("../../../agents/lamport.md")),
("scout", include_str!("../../../agents/scout.md")), ("scout", include_str!("../../../agents/scout.md")),
("solon", include_str!("../../../agents/solon.md")),
]; ];
/// Get an embedded agent prompt by name. /// Get an embedded agent prompt by name.
@@ -89,7 +90,7 @@ mod tests {
#[test] #[test]
fn test_embedded_agents_exist() { fn test_embedded_agents_exist() {
// Verify all expected agents are embedded // Verify all expected agents are embedded
let expected = ["breaker", "carmack", "euler", "fowler", "hopper", "huffman", "lamport", "scout"]; let expected = ["breaker", "carmack", "euler", "fowler", "hopper", "huffman", "lamport", "scout", "solon"];
for name in expected { for name in expected {
assert!( assert!(
get_embedded_agent(name).is_some(), get_embedded_agent(name).is_some(),
@@ -102,7 +103,7 @@ mod tests {
#[test] #[test]
fn test_list_embedded_agents() { fn test_list_embedded_agents() {
let agents = list_embedded_agents(); let agents = list_embedded_agents();
assert!(agents.len() >= 8, "Should have at least 8 embedded agents"); assert!(agents.len() >= 9, "Should have at least 9 embedded agents");
assert!(agents.contains(&"carmack")); assert!(agents.contains(&"carmack"));
assert!(agents.contains(&"hopper")); assert!(agents.contains(&"hopper"));
} }

View File

@@ -63,6 +63,18 @@ pub struct CompiledPredicate {
pub source: InvariantSource, pub source: InvariantSource,
/// Optional notes /// Optional notes
pub notes: Option<String>, pub notes: Option<String>,
/// Optional when condition (compiled)
#[serde(default, skip_serializing_if = "Option::is_none")]
pub when: Option<CompiledWhenCondition>,
}
/// A compiled when condition for datalog execution.
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct CompiledWhenCondition {
pub claim_name: String,
pub selector: String,
pub rule: PredicateRule,
pub expected_value: Option<String>,
} }
/// Compiled rulespec ready for datalog execution. /// Compiled rulespec ready for datalog execution.
@@ -136,6 +148,15 @@ pub fn compile_rulespec(
expected_value, expected_value,
source: predicate.source, source: predicate.source,
notes: predicate.notes.clone(), notes: predicate.notes.clone(),
when: predicate.when.as_ref().map(|w| {
let when_selector = claims.get(&w.claim).cloned().unwrap_or_default();
CompiledWhenCondition {
claim_name: w.claim.clone(),
selector: when_selector,
rule: w.rule.clone(),
expected_value: w.value.as_ref().map(yaml_value_to_string),
}
}),
}); });
} }
@@ -248,10 +269,9 @@ fn extract_values_recursive(claim_name: &str, value: &YamlValue, facts: &mut Has
} }
} }
YamlValue::Null => { YamlValue::Null => {
facts.insert(Fact { // Null values are intentionally NOT inserted as facts.
claim_name: claim_name.to_string(), // This ensures `exists` returns false and `not_exists` returns true
value: "null".to_string(), // for null envelope values (e.g., `breaking_changes: null`).
});
} }
_ => { _ => {
// Scalar values // Scalar values
@@ -355,7 +375,38 @@ pub fn execute_rules(
// Now evaluate each predicate // Now evaluate each predicate
for pred in &compiled.predicates { for pred in &compiled.predicates {
let result = evaluate_predicate_datalog(pred, &fact_lookup); // Check when condition — if present and not met, skip (vacuous pass)
let result = if let Some(when) = &pred.when {
// Build a synthetic predicate to evaluate the when condition
// using the same logic as regular predicate evaluation.
let when_pred = CompiledPredicate {
id: usize::MAX, // sentinel — not a real predicate
claim_name: when.claim_name.clone(),
selector: when.selector.clone(),
rule: when.rule.clone(),
expected_value: when.expected_value.clone(),
source: pred.source,
notes: None,
when: None,
};
let when_met = evaluate_predicate_datalog(&when_pred, &fact_lookup).passed;
if !when_met {
DatalogPredicateResult {
id: pred.id,
claim_name: pred.claim_name.clone(),
rule: pred.rule.clone(),
expected_value: pred.expected_value.clone(),
passed: true,
reason: "Skipped (when condition not met)".to_string(),
source: pred.source,
notes: pred.notes.clone(),
}
} else {
evaluate_predicate_datalog(pred, &fact_lookup)
}
} else {
evaluate_predicate_datalog(pred, &fact_lookup)
};
if result.passed { if result.passed {
passed_count += 1; passed_count += 1;
@@ -534,6 +585,50 @@ fn evaluate_predicate_datalog(
(false, format!("Claim '{}' has no values", pred.claim_name)) (false, format!("Claim '{}' has no values", pred.claim_name))
} }
} }
PredicateRule::NotContains => {
let expected = pred.expected_value.as_deref().unwrap_or("");
if let Some(values) = claim_values {
if values.contains(expected) {
(false, format!("Contains '{}' but should not", expected))
} else {
(true, format!("Does not contain '{}'", expected))
}
} else {
(true, format!("Claim '{}' has no values (not_contains passes vacuously)", pred.claim_name))
}
}
PredicateRule::AnyOf => {
let expected_set: Vec<&str> = pred.expected_value.as_deref()
.map(|v| v.trim_matches(|c| c == '[' || c == ']')
.split(", ")
.collect())
.unwrap_or_default();
if let Some(values) = claim_values {
if values.iter().any(|v| expected_set.contains(v)) {
(true, format!("Value is in allowed set"))
} else {
(false, format!("Value is not in allowed set"))
}
} else {
(false, format!("Claim '{}' has no values", pred.claim_name))
}
}
PredicateRule::NoneOf => {
let forbidden_set: Vec<&str> = pred.expected_value.as_deref()
.map(|v| v.trim_matches(|c| c == '[' || c == ']')
.split(", ")
.collect())
.unwrap_or_default();
if let Some(values) = claim_values {
if values.iter().any(|v| forbidden_set.contains(v)) {
(false, format!("Value is in forbidden set"))
} else {
(true, format!("Value is not in forbidden set"))
}
} else {
(true, format!("Claim '{}' has no values (none_of passes vacuously)", pred.claim_name))
}
}
}; };
DatalogPredicateResult { DatalogPredicateResult {
@@ -701,6 +796,25 @@ pub fn format_datalog_program(
id, claim, expected, id, claim, expected,
)); ));
} }
PredicateRule::NotContains => {
out.push_str(&format!(
"predicate_pass({}) :- !claim_value(\"{}\", \"{}\").\n",
id, claim, expected,
));
}
PredicateRule::AnyOf => {
// any_of: pass if claim value matches any element in the set
out.push_str(&format!(
"predicate_pass({}) :- claim_value(\"{}\", V), any_of(\"{}\", V).\n",
id, claim, expected,
));
}
PredicateRule::NoneOf => {
out.push_str(&format!(
"predicate_pass({}) :- !claim_value(\"{}\", V), none_of(\"{}\", V).\n",
id, claim, expected,
));
}
} }
// Derive failure as the negation of pass // Derive failure as the negation of pass
@@ -784,6 +898,7 @@ pub fn format_datalog_results(result: &DatalogExecutionResult) -> String {
#[cfg(test)] #[cfg(test)]
mod tests { mod tests {
use super::*; use super::*;
use crate::tools::invariants::WhenCondition;
fn make_test_rulespec() -> Rulespec { fn make_test_rulespec() -> Rulespec {
let mut rulespec = Rulespec::new(); let mut rulespec = Rulespec::new();
@@ -975,10 +1090,8 @@ mod tests {
let facts = extract_facts(&envelope, &compiled); let facts = extract_facts(&envelope, &compiled);
assert!(facts.contains(&Fact { // Null values should NOT produce facts — this ensures not_exists passes for null
claim_name: "test".to_string(), assert!(facts.is_empty(), "Null values should not produce facts, got: {:?}", facts);
value: "null".to_string(),
}));
} }
#[test] #[test]
@@ -1569,4 +1682,342 @@ mod tests {
assert_eq!(escape_datalog_string("tab\there"), "tab\\there"); assert_eq!(escape_datalog_string("tab\there"), "tab\\there");
assert_eq!(escape_datalog_string("back\\slash"), "back\\\\slash"); assert_eq!(escape_datalog_string("back\\slash"), "back\\\\slash");
} }
// ========================================================================
// New Rules: NotContains, AnyOf, NoneOf in Datalog
// ========================================================================
#[test]
fn test_execute_rules_not_contains_pass() {
let mut rulespec = Rulespec::new();
rulespec.claims.push(Claim::new("caps", "feature.capabilities"));
rulespec.predicates.push(
Predicate::new("caps", PredicateRule::NotContains, InvariantSource::TaskPrompt)
.with_value(YamlValue::String("deprecated".to_string())),
);
let compiled = compile_rulespec(&rulespec, "test", 1).unwrap();
let mut envelope = ActionEnvelope::new();
envelope.add_fact("feature", serde_yaml::from_str("capabilities: [handle_csv, handle_tsv]").unwrap());
let facts = extract_facts(&envelope, &compiled);
let result = execute_rules(&compiled, &facts);
assert!(result.all_passed(), "not_contains should pass when element absent");
}
#[test]
fn test_execute_rules_not_contains_fail() {
let mut rulespec = Rulespec::new();
rulespec.claims.push(Claim::new("caps", "feature.capabilities"));
rulespec.predicates.push(
Predicate::new("caps", PredicateRule::NotContains, InvariantSource::TaskPrompt)
.with_value(YamlValue::String("handle_csv".to_string())),
);
let compiled = compile_rulespec(&rulespec, "test", 1).unwrap();
let mut envelope = ActionEnvelope::new();
envelope.add_fact("feature", serde_yaml::from_str("capabilities: [handle_csv, handle_tsv]").unwrap());
let facts = extract_facts(&envelope, &compiled);
let result = execute_rules(&compiled, &facts);
assert_eq!(result.failed_count, 1, "not_contains should fail when element present");
}
#[test]
fn test_execute_rules_any_of_pass() {
let mut rulespec = Rulespec::new();
rulespec.claims.push(Claim::new("format", "feature.output_format"));
rulespec.predicates.push(
Predicate::new("format", PredicateRule::AnyOf, InvariantSource::TaskPrompt)
.with_value(YamlValue::Sequence(vec![
YamlValue::String("json".to_string()),
YamlValue::String("yaml".to_string()),
])),
);
let compiled = compile_rulespec(&rulespec, "test", 1).unwrap();
let mut envelope = ActionEnvelope::new();
envelope.add_fact("feature", serde_yaml::from_str("output_format: json").unwrap());
let facts = extract_facts(&envelope, &compiled);
let result = execute_rules(&compiled, &facts);
assert!(result.all_passed(), "any_of should pass when value is in set");
}
#[test]
fn test_execute_rules_any_of_fail() {
let mut rulespec = Rulespec::new();
rulespec.claims.push(Claim::new("format", "feature.output_format"));
rulespec.predicates.push(
Predicate::new("format", PredicateRule::AnyOf, InvariantSource::TaskPrompt)
.with_value(YamlValue::Sequence(vec![
YamlValue::String("json".to_string()),
YamlValue::String("yaml".to_string()),
])),
);
let compiled = compile_rulespec(&rulespec, "test", 1).unwrap();
let mut envelope = ActionEnvelope::new();
envelope.add_fact("feature", serde_yaml::from_str("output_format: xml").unwrap());
let facts = extract_facts(&envelope, &compiled);
let result = execute_rules(&compiled, &facts);
assert_eq!(result.failed_count, 1, "any_of should fail when value not in set");
}
#[test]
fn test_execute_rules_none_of_pass() {
let mut rulespec = Rulespec::new();
rulespec.claims.push(Claim::new("format", "feature.output_format"));
rulespec.predicates.push(
Predicate::new("format", PredicateRule::NoneOf, InvariantSource::TaskPrompt)
.with_value(YamlValue::Sequence(vec![
YamlValue::String("xml".to_string()),
YamlValue::String("csv".to_string()),
])),
);
let compiled = compile_rulespec(&rulespec, "test", 1).unwrap();
let mut envelope = ActionEnvelope::new();
envelope.add_fact("feature", serde_yaml::from_str("output_format: json").unwrap());
let facts = extract_facts(&envelope, &compiled);
let result = execute_rules(&compiled, &facts);
assert!(result.all_passed(), "none_of should pass when value not in forbidden set");
}
#[test]
fn test_execute_rules_none_of_fail() {
let mut rulespec = Rulespec::new();
rulespec.claims.push(Claim::new("format", "feature.output_format"));
rulespec.predicates.push(
Predicate::new("format", PredicateRule::NoneOf, InvariantSource::TaskPrompt)
.with_value(YamlValue::Sequence(vec![
YamlValue::String("xml".to_string()),
YamlValue::String("csv".to_string()),
])),
);
let compiled = compile_rulespec(&rulespec, "test", 1).unwrap();
let mut envelope = ActionEnvelope::new();
envelope.add_fact("feature", serde_yaml::from_str("output_format: xml").unwrap());
let facts = extract_facts(&envelope, &compiled);
let result = execute_rules(&compiled, &facts);
assert_eq!(result.failed_count, 1, "none_of should fail when value in forbidden set");
}
// ========================================================================
// When Conditions in Datalog
// ========================================================================
#[test]
fn test_execute_rules_when_condition_met() {
let mut rulespec = Rulespec::new();
rulespec.claims.push(Claim::new("is_breaking", "api.breaking"));
rulespec.claims.push(Claim::new("caps", "feature.capabilities"));
rulespec.predicates.push(
Predicate::new("caps", PredicateRule::MinLength, InvariantSource::TaskPrompt)
.with_value(YamlValue::Number(2.into()))
.with_when(WhenCondition::new("is_breaking", PredicateRule::Equals)
.with_value(YamlValue::Bool(true))),
);
let compiled = compile_rulespec(&rulespec, "test", 1).unwrap();
let mut envelope = ActionEnvelope::new();
envelope.add_fact("api", serde_yaml::from_str("breaking: true").unwrap());
envelope.add_fact("feature", serde_yaml::from_str("capabilities: [a, b, c]").unwrap());
let facts = extract_facts(&envelope, &compiled);
let result = execute_rules(&compiled, &facts);
assert!(result.all_passed(), "When met + predicate passes should pass");
}
#[test]
fn test_execute_rules_when_condition_not_met_vacuous_pass() {
let mut rulespec = Rulespec::new();
rulespec.claims.push(Claim::new("is_breaking", "api.breaking"));
rulespec.claims.push(Claim::new("caps", "feature.capabilities"));
rulespec.predicates.push(
Predicate::new("caps", PredicateRule::MinLength, InvariantSource::TaskPrompt)
.with_value(YamlValue::Number(100.into())) // would fail if evaluated
.with_when(WhenCondition::new("is_breaking", PredicateRule::Equals)
.with_value(YamlValue::Bool(true))),
);
let compiled = compile_rulespec(&rulespec, "test", 1).unwrap();
let mut envelope = ActionEnvelope::new();
envelope.add_fact("api", serde_yaml::from_str("breaking: false").unwrap());
envelope.add_fact("feature", serde_yaml::from_str("capabilities: [a]").unwrap());
let facts = extract_facts(&envelope, &compiled);
let result = execute_rules(&compiled, &facts);
assert!(result.all_passed(), "When not met should be vacuous pass");
assert!(result.predicate_results[0].reason.contains("Skipped"));
}
#[test]
fn test_execute_rules_when_exists_on_null() {
let mut rulespec = Rulespec::new();
rulespec.claims.push(Claim::new("has_tests", "testing.tests"));
rulespec.claims.push(Claim::new("coverage", "testing.coverage"));
rulespec.predicates.push(
Predicate::new("coverage", PredicateRule::GreaterThan, InvariantSource::TaskPrompt)
.with_value(YamlValue::Number(80.into()))
.with_when(WhenCondition::new("has_tests", PredicateRule::Exists)),
);
let compiled = compile_rulespec(&rulespec, "test", 1).unwrap();
// tests is null → when(exists) not met → vacuous pass
let mut envelope = ActionEnvelope::new();
envelope.add_fact("testing", serde_yaml::from_str("tests: null\ncoverage: 50").unwrap());
let facts = extract_facts(&envelope, &compiled);
let result = execute_rules(&compiled, &facts);
assert!(result.all_passed(), "When exists on null should skip (vacuous pass)");
}
#[test]
fn test_execute_rules_when_matches_condition_met() {
// This is the butler rulespec pattern: when subject matches ^Re:
let mut rulespec = Rulespec::new();
rulespec.claims.push(Claim::new("subject", "subject"));
rulespec.claims.push(Claim::new("reply_to", "reply_to_message_id"));
rulespec.predicates.push(
Predicate::new("reply_to", PredicateRule::Exists, InvariantSource::TaskPrompt)
.with_when(WhenCondition::new("subject", PredicateRule::Matches)
.with_value(YamlValue::String("^Re: ".to_string()))),
);
let compiled = compile_rulespec(&rulespec, "test", 1).unwrap();
// Reply email WITH reply_to_message_id → should pass
let mut envelope = ActionEnvelope::new();
envelope.add_fact("subject", YamlValue::String("Re: Hello".to_string()));
envelope.add_fact("reply_to_message_id", YamlValue::String("<abc@example.com>".to_string()));
let facts = extract_facts(&envelope, &compiled);
let result = execute_rules(&compiled, &facts);
assert!(result.all_passed(), "When matches met + exists should pass");
// Crucially: should NOT say "Skipped"
assert!(!result.predicate_results[0].reason.contains("Skipped"),
"Should evaluate predicate, not skip it");
}
#[test]
fn test_execute_rules_when_matches_condition_met_but_predicate_fails() {
// Reply email WITHOUT reply_to_message_id → when met, predicate fails
let mut rulespec = Rulespec::new();
rulespec.claims.push(Claim::new("subject", "subject"));
rulespec.claims.push(Claim::new("reply_to", "reply_to_message_id"));
rulespec.predicates.push(
Predicate::new("reply_to", PredicateRule::Exists, InvariantSource::TaskPrompt)
.with_when(WhenCondition::new("subject", PredicateRule::Matches)
.with_value(YamlValue::String("^Re: ".to_string()))),
);
let compiled = compile_rulespec(&rulespec, "test", 1).unwrap();
// Reply email WITHOUT reply_to → when condition met, predicate should FAIL
let mut envelope = ActionEnvelope::new();
envelope.add_fact("subject", YamlValue::String("Re: Hello".to_string()));
// No reply_to_message_id fact
let facts = extract_facts(&envelope, &compiled);
let result = execute_rules(&compiled, &facts);
assert_eq!(result.failed_count, 1,
"When matches met but exists fails → should fail, not vacuous pass");
}
#[test]
fn test_execute_rules_when_matches_condition_not_met() {
// Non-reply email → when condition not met → vacuous pass (skip)
let mut rulespec = Rulespec::new();
rulespec.claims.push(Claim::new("subject", "subject"));
rulespec.claims.push(Claim::new("reply_to", "reply_to_message_id"));
rulespec.predicates.push(
Predicate::new("reply_to", PredicateRule::Exists, InvariantSource::TaskPrompt)
.with_when(WhenCondition::new("subject", PredicateRule::Matches)
.with_value(YamlValue::String("^Re: ".to_string()))),
);
let compiled = compile_rulespec(&rulespec, "test", 1).unwrap();
// Non-reply email → when not met → skip
let mut envelope = ActionEnvelope::new();
envelope.add_fact("subject", YamlValue::String("Hello World".to_string()));
// No reply_to_message_id
let facts = extract_facts(&envelope, &compiled);
let result = execute_rules(&compiled, &facts);
assert!(result.all_passed(), "Non-reply should get vacuous pass");
assert!(result.predicate_results[0].reason.contains("Skipped"),
"Should be skipped for non-reply");
}
// ========================================================================
// format_datalog_program for new rules
// ========================================================================
#[test]
fn test_format_datalog_program_new_rules() {
let mut rulespec = Rulespec::new();
rulespec.claims.push(Claim::new("caps", "feature.capabilities"));
rulespec.claims.push(Claim::new("format", "feature.output_format"));
rulespec.predicates.push(
Predicate::new("caps", PredicateRule::NotContains, InvariantSource::TaskPrompt)
.with_value(YamlValue::String("deprecated".to_string())),
);
rulespec.predicates.push(
Predicate::new("format", PredicateRule::AnyOf, InvariantSource::TaskPrompt)
.with_value(YamlValue::Sequence(vec![
YamlValue::String("json".to_string()),
YamlValue::String("yaml".to_string()),
])),
);
rulespec.predicates.push(
Predicate::new("format", PredicateRule::NoneOf, InvariantSource::TaskPrompt)
.with_value(YamlValue::Sequence(vec![
YamlValue::String("xml".to_string()),
])),
);
let compiled = compile_rulespec(&rulespec, "test-new-rules", 1).unwrap();
let mut envelope = ActionEnvelope::new();
envelope.add_fact("feature", serde_yaml::from_str("capabilities: [a, b]\noutput_format: json").unwrap());
let facts = extract_facts(&envelope, &compiled);
let dl = format_datalog_program(&compiled, &facts);
// Verify header
assert!(dl.contains("// Plan: test-new-rules"));
// Verify not_contains rule
assert!(dl.contains("not_contains"), "Should contain not_contains rule comment");
assert!(dl.contains("predicate_pass(0)"));
// Verify any_of rule
assert!(dl.contains("any_of"), "Should contain any_of rule");
assert!(dl.contains("predicate_pass(1)"));
// Verify none_of rule
assert!(dl.contains("none_of"), "Should contain none_of rule");
assert!(dl.contains("predicate_pass(2)"));
// Verify failure derivation for all
assert!(dl.contains("predicate_fail(0) :- !predicate_pass(0)."));
assert!(dl.contains("predicate_fail(1) :- !predicate_pass(1)."));
assert!(dl.contains("predicate_fail(2) :- !predicate_pass(2)."));
}
} }

View File

@@ -103,6 +103,12 @@ pub enum PredicateRule {
MaxLength, MaxLength,
/// Value matches a regex pattern /// Value matches a regex pattern
Matches, Matches,
/// Value does NOT contain the specified element (negation of Contains)
NotContains,
/// Value is one of the specified set of values
AnyOf,
/// Value is none of the specified set of values
NoneOf,
} }
impl std::fmt::Display for PredicateRule { impl std::fmt::Display for PredicateRule {
@@ -117,10 +123,40 @@ impl std::fmt::Display for PredicateRule {
PredicateRule::MinLength => write!(f, "min_length"), PredicateRule::MinLength => write!(f, "min_length"),
PredicateRule::MaxLength => write!(f, "max_length"), PredicateRule::MaxLength => write!(f, "max_length"),
PredicateRule::Matches => write!(f, "matches"), PredicateRule::Matches => write!(f, "matches"),
PredicateRule::NotContains => write!(f, "not_contains"),
PredicateRule::AnyOf => write!(f, "any_of"),
PredicateRule::NoneOf => write!(f, "none_of"),
} }
} }
} }
/// A predicate defines a rule to evaluate against a claim's value.
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct WhenCondition {
/// Name of the claim to check for the condition
pub claim: String,
/// The rule to apply for the condition check
pub rule: PredicateRule,
/// Value to compare against (optional, depends on rule)
#[serde(default, skip_serializing_if = "Option::is_none")]
pub value: Option<YamlValue>,
}
impl WhenCondition {
pub fn new(claim: impl Into<String>, rule: PredicateRule) -> Self {
Self {
claim: claim.into(),
rule,
value: None,
}
}
pub fn with_value(mut self, value: YamlValue) -> Self {
self.value = Some(value);
self
}
}
/// A predicate defines a rule to evaluate against a claim's value. /// A predicate defines a rule to evaluate against a claim's value.
#[derive(Debug, Clone, Serialize, Deserialize)] #[derive(Debug, Clone, Serialize, Deserialize)]
pub struct Predicate { pub struct Predicate {
@@ -137,6 +173,10 @@ pub struct Predicate {
/// Optional notes explaining the invariant or providing nuance /// Optional notes explaining the invariant or providing nuance
#[serde(default, skip_serializing_if = "Option::is_none")] #[serde(default, skip_serializing_if = "Option::is_none")]
pub notes: Option<String>, pub notes: Option<String>,
/// Optional condition that must be met for this predicate to be evaluated.
/// If the condition is not met, the predicate is skipped (vacuous pass).
#[serde(default, skip_serializing_if = "Option::is_none")]
pub when: Option<WhenCondition>,
} }
impl Predicate { impl Predicate {
@@ -151,6 +191,7 @@ impl Predicate {
value: None, value: None,
source, source,
notes: None, notes: None,
when: None,
} }
} }
@@ -164,6 +205,11 @@ impl Predicate {
self self
} }
pub fn with_when(mut self, when: WhenCondition) -> Self {
self.when = Some(when);
self
}
/// Validate the predicate structure. /// Validate the predicate structure.
pub fn validate(&self) -> Result<()> { pub fn validate(&self) -> Result<()> {
if self.claim.trim().is_empty() { if self.claim.trim().is_empty() {
@@ -185,6 +231,25 @@ impl Predicate {
} }
} }
// Validate when condition if present
if let Some(when) = &self.when {
if when.claim.trim().is_empty() {
return Err(anyhow!("When condition claim reference cannot be empty"));
}
// When conditions that need a value must have one
match when.rule {
PredicateRule::Exists | PredicateRule::NotExists => {}
_ => {
if when.value.is_none() {
return Err(anyhow!(
"When condition rule '{}' requires a value",
when.rule
));
}
}
}
}
Ok(()) Ok(())
} }
} }
@@ -233,6 +298,15 @@ impl Rulespec {
predicate.claim predicate.claim
)); ));
} }
// Validate when condition claim reference
if let Some(when) = &predicate.when {
if !claim_names.contains(&when.claim) {
return Err(anyhow!(
"When condition references unknown claim: {}",
when.claim
));
}
}
} }
Ok(()) Ok(())
@@ -481,14 +555,22 @@ pub fn evaluate_predicate(
) -> PredicateResult { ) -> PredicateResult {
match predicate.rule { match predicate.rule {
PredicateRule::Exists => { PredicateRule::Exists => {
if selected_values.is_empty() { // Filter out null values — null means "absent"
let non_null: Vec<_> = selected_values.iter()
.filter(|v| !v.is_null())
.collect();
if non_null.is_empty() {
PredicateResult::fail("Value does not exist") PredicateResult::fail("Value does not exist")
} else { } else {
PredicateResult::pass("Value exists") PredicateResult::pass("Value exists")
} }
} }
PredicateRule::NotExists => { PredicateRule::NotExists => {
if selected_values.is_empty() { // Filter out null values — null means "absent"
let non_null: Vec<_> = selected_values.iter()
.filter(|v| !v.is_null())
.collect();
if non_null.is_empty() {
PredicateResult::pass("Value does not exist as expected") PredicateResult::pass("Value does not exist as expected")
} else { } else {
PredicateResult::fail("Value exists but should not") PredicateResult::fail("Value exists but should not")
@@ -642,6 +724,65 @@ pub fn evaluate_predicate(
} }
PredicateResult::fail(format!("No value matches pattern '{}'", pattern)) PredicateResult::fail(format!("No value matches pattern '{}'", pattern))
} }
PredicateRule::NotContains => {
let target = match &predicate.value {
Some(v) => v,
None => return PredicateResult::fail("No value specified for not_contains"),
};
for value in selected_values {
if value_contains(value, target) {
return PredicateResult::fail(format!(
"Value contains {:?} but should not",
yaml_to_display(target)
));
}
}
PredicateResult::pass(format!(
"Value does not contain {:?}",
yaml_to_display(target)
))
}
PredicateRule::AnyOf => {
let allowed = match &predicate.value {
Some(YamlValue::Sequence(seq)) => seq,
Some(_) => return PredicateResult::fail("any_of requires an array value"),
None => return PredicateResult::fail("No value specified for any_of"),
};
for value in selected_values {
if allowed.contains(value) {
return PredicateResult::pass(format!(
"Value {:?} is in allowed set",
yaml_to_display(value)
));
}
}
PredicateResult::fail(format!(
"Value is not in allowed set [{}]",
allowed.iter().map(yaml_to_display).collect::<Vec<_>>().join(", ")
))
}
PredicateRule::NoneOf => {
let forbidden = match &predicate.value {
Some(YamlValue::Sequence(seq)) => seq,
Some(_) => return PredicateResult::fail("none_of requires an array value"),
None => return PredicateResult::fail("No value specified for none_of"),
};
for value in selected_values {
if forbidden.contains(value) {
return PredicateResult::fail(format!(
"Value {:?} is in forbidden set",
yaml_to_display(value)
));
}
}
PredicateResult::pass(format!(
"Value is not in forbidden set [{}]",
forbidden.iter().map(yaml_to_display).collect::<Vec<_>>().join(", ")
))
}
} }
} }
@@ -785,6 +926,41 @@ pub fn evaluate_rulespec(rulespec: &Rulespec, envelope: &ActionEnvelope) -> Rule
.collect(); .collect();
for predicate in &rulespec.predicates { for predicate in &rulespec.predicates {
// Check when condition — if present and not met, skip (vacuous pass)
if let Some(when) = &predicate.when {
let when_claim = claims.get(when.claim.as_str());
let when_met = match when_claim {
Some(claim) => {
match Selector::parse(&claim.selector) {
Ok(selector) => {
let when_values = selector.select(&envelope_value);
let when_pred = Predicate {
claim: when.claim.clone(),
rule: when.rule.clone(),
value: when.value.clone(),
source: predicate.source,
notes: None,
when: None,
};
evaluate_predicate(&when_pred, &when_values).passed
}
Err(_) => false,
}
}
None => false,
};
if !when_met {
passed_count += 1;
predicate_results.push(PredicateEvaluation {
predicate: predicate.clone(),
claim_name: predicate.claim.clone(),
selected_values: vec![],
result: PredicateResult::pass("Skipped (when condition not met)"),
});
continue;
}
}
let claim = claims.get(predicate.claim.as_str()); let claim = claims.get(predicate.claim.as_str());
let (selected_values, result) = match claim { let (selected_values, result) = match claim {
@@ -1446,4 +1622,411 @@ facts:
assert_eq!(parsed.facts.get("breaking_changes").unwrap(), &YamlValue::Null); assert_eq!(parsed.facts.get("breaking_changes").unwrap(), &YamlValue::Null);
assert_eq!(parsed.facts.get("feature").unwrap(), &YamlValue::String("done".to_string())); assert_eq!(parsed.facts.get("feature").unwrap(), &YamlValue::String("done".to_string()));
} }
// ========================================================================
// New Predicate Rules: NotContains, AnyOf, NoneOf
// ========================================================================
#[test]
fn test_predicate_not_contains_array_pass() {
let predicate = Predicate::new("test", PredicateRule::NotContains, InvariantSource::TaskPrompt)
.with_value(YamlValue::String("deprecated".to_string()));
let array = YamlValue::Sequence(vec![
YamlValue::String("handle_csv".to_string()),
YamlValue::String("handle_tsv".to_string()),
]);
let result = evaluate_predicate(&predicate, &[array]);
assert!(result.passed, "not_contains should pass when element is absent");
}
#[test]
fn test_predicate_not_contains_array_fail() {
let predicate = Predicate::new("test", PredicateRule::NotContains, InvariantSource::TaskPrompt)
.with_value(YamlValue::String("handle_csv".to_string()));
let array = YamlValue::Sequence(vec![
YamlValue::String("handle_csv".to_string()),
YamlValue::String("handle_tsv".to_string()),
]);
let result = evaluate_predicate(&predicate, &[array]);
assert!(!result.passed, "not_contains should fail when element is present");
}
#[test]
fn test_predicate_not_contains_string_pass() {
let predicate = Predicate::new("test", PredicateRule::NotContains, InvariantSource::TaskPrompt)
.with_value(YamlValue::String("xml".to_string()));
let result = evaluate_predicate(
&predicate,
&[YamlValue::String("csv_importer".to_string())],
);
assert!(result.passed, "not_contains should pass when substring is absent");
}
#[test]
fn test_predicate_not_contains_string_fail() {
let predicate = Predicate::new("test", PredicateRule::NotContains, InvariantSource::TaskPrompt)
.with_value(YamlValue::String("csv".to_string()));
let result = evaluate_predicate(
&predicate,
&[YamlValue::String("csv_importer".to_string())],
);
assert!(!result.passed, "not_contains should fail when substring is present");
}
#[test]
fn test_predicate_not_contains_empty_array() {
let predicate = Predicate::new("test", PredicateRule::NotContains, InvariantSource::TaskPrompt)
.with_value(YamlValue::String("anything".to_string()));
let array = YamlValue::Sequence(vec![]);
let result = evaluate_predicate(&predicate, &[array]);
assert!(result.passed, "not_contains on empty array should pass");
}
#[test]
fn test_predicate_any_of_pass() {
let predicate = Predicate::new("test", PredicateRule::AnyOf, InvariantSource::TaskPrompt)
.with_value(YamlValue::Sequence(vec![
YamlValue::String("json".to_string()),
YamlValue::String("yaml".to_string()),
YamlValue::String("toml".to_string()),
]));
let result = evaluate_predicate(
&predicate,
&[YamlValue::String("yaml".to_string())],
);
assert!(result.passed, "any_of should pass when value is in set");
}
#[test]
fn test_predicate_any_of_fail() {
let predicate = Predicate::new("test", PredicateRule::AnyOf, InvariantSource::TaskPrompt)
.with_value(YamlValue::Sequence(vec![
YamlValue::String("json".to_string()),
YamlValue::String("yaml".to_string()),
]));
let result = evaluate_predicate(
&predicate,
&[YamlValue::String("xml".to_string())],
);
assert!(!result.passed, "any_of should fail when value is not in set");
}
#[test]
fn test_predicate_any_of_non_array_value_fails() {
let predicate = Predicate::new("test", PredicateRule::AnyOf, InvariantSource::TaskPrompt)
.with_value(YamlValue::String("not_an_array".to_string()));
let result = evaluate_predicate(
&predicate,
&[YamlValue::String("anything".to_string())],
);
assert!(!result.passed, "any_of with non-array value should fail");
}
#[test]
fn test_predicate_any_of_single_element() {
let predicate = Predicate::new("test", PredicateRule::AnyOf, InvariantSource::TaskPrompt)
.with_value(YamlValue::Sequence(vec![
YamlValue::String("only".to_string()),
]));
let result = evaluate_predicate(
&predicate,
&[YamlValue::String("only".to_string())],
);
assert!(result.passed, "any_of with single-element set should work");
}
#[test]
fn test_predicate_none_of_pass() {
let predicate = Predicate::new("test", PredicateRule::NoneOf, InvariantSource::TaskPrompt)
.with_value(YamlValue::Sequence(vec![
YamlValue::String("xml".to_string()),
YamlValue::String("csv".to_string()),
]));
let result = evaluate_predicate(
&predicate,
&[YamlValue::String("json".to_string())],
);
assert!(result.passed, "none_of should pass when value is not in forbidden set");
}
#[test]
fn test_predicate_none_of_fail() {
let predicate = Predicate::new("test", PredicateRule::NoneOf, InvariantSource::TaskPrompt)
.with_value(YamlValue::Sequence(vec![
YamlValue::String("xml".to_string()),
YamlValue::String("csv".to_string()),
]));
let result = evaluate_predicate(
&predicate,
&[YamlValue::String("xml".to_string())],
);
assert!(!result.passed, "none_of should fail when value is in forbidden set");
}
#[test]
fn test_predicate_none_of_non_array_value_fails() {
let predicate = Predicate::new("test", PredicateRule::NoneOf, InvariantSource::TaskPrompt)
.with_value(YamlValue::String("not_an_array".to_string()));
let result = evaluate_predicate(
&predicate,
&[YamlValue::String("anything".to_string())],
);
assert!(!result.passed, "none_of with non-array value should fail");
}
#[test]
fn test_predicate_none_of_empty_forbidden_set() {
let predicate = Predicate::new("test", PredicateRule::NoneOf, InvariantSource::TaskPrompt)
.with_value(YamlValue::Sequence(vec![]));
let result = evaluate_predicate(
&predicate,
&[YamlValue::String("anything".to_string())],
);
assert!(result.passed, "none_of with empty forbidden set should pass");
}
// ========================================================================
// Null Handling Tests
// ========================================================================
#[test]
fn test_predicate_exists_fails_for_null() {
let predicate = Predicate::new("test", PredicateRule::Exists, InvariantSource::TaskPrompt);
let result = evaluate_predicate(&predicate, &[YamlValue::Null]);
assert!(!result.passed, "exists should fail for null value");
}
#[test]
fn test_predicate_not_exists_passes_for_null() {
let predicate = Predicate::new("test", PredicateRule::NotExists, InvariantSource::TaskPrompt);
let result = evaluate_predicate(&predicate, &[YamlValue::Null]);
assert!(result.passed, "not_exists should pass for null value");
}
#[test]
fn test_predicate_exists_passes_for_empty_string() {
let predicate = Predicate::new("test", PredicateRule::Exists, InvariantSource::TaskPrompt);
let result = evaluate_predicate(&predicate, &[YamlValue::String(String::new())]);
assert!(result.passed, "exists should pass for empty string (not null)");
}
#[test]
fn test_predicate_exists_passes_for_empty_array() {
let predicate = Predicate::new("test", PredicateRule::Exists, InvariantSource::TaskPrompt);
let result = evaluate_predicate(&predicate, &[YamlValue::Sequence(vec![])]);
assert!(result.passed, "exists should pass for empty array (not null)");
}
#[test]
fn test_predicate_contains_on_null_fails() {
let predicate = Predicate::new("test", PredicateRule::Contains, InvariantSource::TaskPrompt)
.with_value(YamlValue::String("x".to_string()));
let result = evaluate_predicate(&predicate, &[YamlValue::Null]);
assert!(!result.passed, "contains on null should fail");
}
#[test]
fn test_predicate_exists_with_mixed_null_and_value() {
let predicate = Predicate::new("test", PredicateRule::Exists, InvariantSource::TaskPrompt);
// If selected_values has both null and a real value, exists should pass
let result = evaluate_predicate(
&predicate,
&[YamlValue::Null, YamlValue::String("real".to_string())],
);
assert!(result.passed, "exists should pass when at least one non-null value");
}
#[test]
fn test_predicate_not_exists_fails_with_mixed_null_and_value() {
let predicate = Predicate::new("test", PredicateRule::NotExists, InvariantSource::TaskPrompt);
let result = evaluate_predicate(
&predicate,
&[YamlValue::Null, YamlValue::String("real".to_string())],
);
assert!(!result.passed, "not_exists should fail when at least one non-null value");
}
// ========================================================================
// When Condition Tests
// ========================================================================
#[test]
fn test_when_condition_met_evaluates_predicate() {
let mut rulespec = Rulespec::new();
rulespec.add_claim(Claim::new("is_breaking", "api_changes.breaking"));
rulespec.add_claim(Claim::new("caps", "feature.capabilities"));
rulespec.add_predicate(
Predicate::new("caps", PredicateRule::MinLength, InvariantSource::TaskPrompt)
.with_value(YamlValue::Number(2.into()))
.with_when(WhenCondition::new("is_breaking", PredicateRule::Equals)
.with_value(YamlValue::Bool(true)))
);
let mut envelope = ActionEnvelope::new();
envelope.add_fact("api_changes", serde_yaml::from_str("breaking: true").unwrap());
envelope.add_fact("feature", serde_yaml::from_str("capabilities: [a, b, c]").unwrap());
let eval = evaluate_rulespec(&rulespec, &envelope);
assert_eq!(eval.passed_count, 1);
assert_eq!(eval.failed_count, 0);
}
#[test]
fn test_when_condition_not_met_vacuous_pass() {
let mut rulespec = Rulespec::new();
rulespec.add_claim(Claim::new("is_breaking", "api_changes.breaking"));
rulespec.add_claim(Claim::new("caps", "feature.capabilities"));
rulespec.add_predicate(
Predicate::new("caps", PredicateRule::MinLength, InvariantSource::TaskPrompt)
.with_value(YamlValue::Number(100.into())) // would fail if evaluated
.with_when(WhenCondition::new("is_breaking", PredicateRule::Equals)
.with_value(YamlValue::Bool(true)))
);
let mut envelope = ActionEnvelope::new();
envelope.add_fact("api_changes", serde_yaml::from_str("breaking: false").unwrap());
envelope.add_fact("feature", serde_yaml::from_str("capabilities: [a]").unwrap());
let eval = evaluate_rulespec(&rulespec, &envelope);
assert_eq!(eval.passed_count, 1, "When not met should be vacuous pass");
assert_eq!(eval.failed_count, 0);
assert!(eval.predicate_results[0].result.reason.contains("Skipped"));
}
#[test]
fn test_when_condition_with_exists() {
let mut rulespec = Rulespec::new();
rulespec.add_claim(Claim::new("has_tests", "feature.tests"));
rulespec.add_claim(Claim::new("coverage", "feature.coverage"));
rulespec.add_predicate(
Predicate::new("coverage", PredicateRule::GreaterThan, InvariantSource::TaskPrompt)
.with_value(YamlValue::Number(80.into()))
.with_when(WhenCondition::new("has_tests", PredicateRule::Exists))
);
// No tests field → when condition not met → vacuous pass
let mut envelope = ActionEnvelope::new();
envelope.add_fact("feature", serde_yaml::from_str("coverage: 50").unwrap());
let eval = evaluate_rulespec(&rulespec, &envelope);
assert_eq!(eval.passed_count, 1, "When exists not met should skip");
assert_eq!(eval.failed_count, 0);
}
#[test]
fn test_when_unknown_claim_fails_validation() {
let mut rulespec = Rulespec::new();
rulespec.add_claim(Claim::new("caps", "feature.capabilities"));
rulespec.add_predicate(
Predicate::new("caps", PredicateRule::Exists, InvariantSource::TaskPrompt)
.with_when(WhenCondition::new("nonexistent", PredicateRule::Exists))
);
let result = rulespec.validate();
assert!(result.is_err(), "When referencing unknown claim should fail validation");
assert!(result.unwrap_err().to_string().contains("unknown claim"));
}
#[test]
fn test_predicate_without_when_always_evaluated() {
// Backward compatibility: no when field means always evaluated
let mut rulespec = Rulespec::new();
rulespec.add_claim(Claim::new("caps", "feature.capabilities"));
rulespec.add_predicate(
Predicate::new("caps", PredicateRule::Exists, InvariantSource::TaskPrompt)
);
let mut envelope = ActionEnvelope::new();
envelope.add_fact("feature", serde_yaml::from_str("capabilities: [a]").unwrap());
let eval = evaluate_rulespec(&rulespec, &envelope);
assert_eq!(eval.passed_count, 1);
assert_eq!(eval.failed_count, 0);
}
#[test]
fn test_when_condition_validate_requires_value() {
let when = WhenCondition::new("test", PredicateRule::Equals);
// No value set — validation should catch this
let predicate = Predicate::new("test", PredicateRule::Exists, InvariantSource::TaskPrompt)
.with_when(when);
let result = predicate.validate();
assert!(result.is_err(), "When condition with equals but no value should fail");
}
#[test]
fn test_when_condition_exists_no_value_ok() {
let when = WhenCondition::new("test", PredicateRule::Exists);
let predicate = Predicate::new("test", PredicateRule::Exists, InvariantSource::TaskPrompt)
.with_when(when);
let result = predicate.validate();
assert!(result.is_ok(), "When condition with exists and no value should be ok");
}
#[test]
fn test_evaluate_rulespec_with_new_rules_full() {
let mut rulespec = Rulespec::new();
rulespec.add_claim(Claim::new("caps", "feature.capabilities"));
rulespec.add_claim(Claim::new("format", "feature.output_format"));
rulespec.add_claim(Claim::new("breaking", "breaking_changes"));
// not_contains: must not have deprecated
rulespec.add_predicate(
Predicate::new("caps", PredicateRule::NotContains, InvariantSource::TaskPrompt)
.with_value(YamlValue::String("deprecated".to_string()))
);
// any_of: format must be json or yaml
rulespec.add_predicate(
Predicate::new("format", PredicateRule::AnyOf, InvariantSource::TaskPrompt)
.with_value(YamlValue::Sequence(vec![
YamlValue::String("json".to_string()),
YamlValue::String("yaml".to_string()),
]))
);
// none_of: format must not be xml or csv
rulespec.add_predicate(
Predicate::new("format", PredicateRule::NoneOf, InvariantSource::TaskPrompt)
.with_value(YamlValue::Sequence(vec![
YamlValue::String("xml".to_string()),
YamlValue::String("csv".to_string()),
]))
);
// not_exists: no breaking changes
rulespec.add_predicate(
Predicate::new("breaking", PredicateRule::NotExists, InvariantSource::TaskPrompt)
);
let mut envelope = ActionEnvelope::new();
envelope.add_fact("feature", serde_yaml::from_str(r#"
capabilities: [handle_csv, handle_tsv]
output_format: json
"#).unwrap());
envelope.add_fact("breaking_changes", YamlValue::Null);
let eval = evaluate_rulespec(&rulespec, &envelope);
assert_eq!(eval.passed_count, 4, "All 4 predicates should pass");
assert_eq!(eval.failed_count, 0);
}
} }

View File

@@ -0,0 +1,402 @@
# Rulespec YAML Schema
> Canonical reference for `analysis/rulespec.yaml` — the machine-readable invariant specification.
## Overview
A rulespec defines **claims** (selectors into the action envelope) and **predicates** (rules that evaluate those claims). When an agent completes work, it writes an **action envelope** via `write_envelope`, and the rulespec is evaluated against it using datalog verification.
## File Location
```
analysis/rulespec.yaml # checked into the repo
```
## Top-Level Structure
```yaml
claims:
- name: <string> # unique identifier for this claim
selector: <string> # path into the action envelope
predicates:
- claim: <string> # references a claim by name
rule: <rule_type> # one of the 12 rule types below
value: <any> # required for most rules (see table)
source: <source> # task_prompt | memory
notes: <string> # optional human-readable explanation
when: # optional conditional trigger
claim: <string> # references a claim by name
rule: <rule_type> # condition rule
value: <any> # condition value (if needed)
```
## Claims
A claim is a named selector that extracts values from the action envelope.
| Field | Type | Required | Description |
|-------|------|----------|-------------|
| `name` | string | ✅ | Unique identifier (referenced by predicates) |
| `selector` | string | ✅ | Dot-notation path into the envelope |
### Selector Syntax
| Syntax | Meaning | Example |
|--------|---------|--------|
| `foo.bar` | Nested field access | `csv_importer.capabilities` |
| `foo[0]` | Array index (0-based) | `tests[0].name` |
| `foo[*]` | All array elements (wildcard) | `items[*].id` |
| `foo.bar.baz` | Deep nesting | `api.endpoints.count` |
**Important**: Selectors operate on the envelope's `facts` content directly. Do NOT include a `facts.` prefix in selectors — the system handles this automatically.
```yaml
# ✅ Correct
claims:
- name: caps
selector: csv_importer.capabilities
# ❌ Wrong — don't prefix with "facts."
claims:
- name: caps
selector: facts.csv_importer.capabilities
```
## Predicate Rules
### Rule Types Reference
| Rule | Value Required | Value Type | Description |
|------|---------------|------------|-------------|
| `exists` | ❌ | — | Value exists and is not null |
| `not_exists` | ❌ | — | Value is null or missing |
| `equals` | ✅ | any | Value equals exactly |
| `contains` | ✅ | any | Array contains element, or string contains substring |
| `not_contains` | ✅ | any | Negation of `contains` |
| `any_of` | ✅ | array | Value is one of the specified set |
| `none_of` | ✅ | array | Value is none of the specified set |
| `greater_than` | ✅ | number | Numeric comparison |
| `less_than` | ✅ | number | Numeric comparison |
| `min_length` | ✅ | number | Array has at least N elements |
| `max_length` | ✅ | number | Array has at most N elements |
| `matches` | ✅ | string | Value matches a regex pattern |
### Existence Rules
```yaml
# Value must exist (not null, not missing)
predicates:
- claim: feature_file
rule: exists
source: task_prompt
# Value must NOT exist (null or missing)
predicates:
- claim: breaking_changes
rule: not_exists
source: task_prompt
notes: "No breaking changes allowed"
```
### Equality
```yaml
predicates:
- claim: api_breaking
rule: equals
value: false
source: task_prompt
```
### Containment
```yaml
# Array contains an element
predicates:
- claim: capabilities
rule: contains
value: "handle_csv"
source: task_prompt
# Array must NOT contain an element
predicates:
- claim: capabilities
rule: not_contains
value: "deprecated_feature"
source: task_prompt
```
### Set Membership
```yaml
# Value must be one of these
predicates:
- claim: output_format
rule: any_of
value: [json, yaml, toml]
source: task_prompt
# Value must NOT be any of these
predicates:
- claim: output_format
rule: none_of
value: [xml, csv]
source: task_prompt
```
### Numeric Comparisons
```yaml
predicates:
- claim: test_count
rule: greater_than
value: 0
source: task_prompt
notes: "Must have at least one test"
- claim: error_rate
rule: less_than
value: 5
source: memory
```
### Array Length
```yaml
predicates:
- claim: capabilities
rule: min_length
value: 2
source: task_prompt
- claim: dependencies
rule: max_length
value: 10
source: memory
notes: "Keep dependency count manageable"
```
### Regex Matching
```yaml
predicates:
- claim: file_path
rule: matches
value: "^src/.*\\.rs$"
source: task_prompt
notes: "File must be a Rust source file in src/"
```
## Conditional Predicates (`when`)
Predicates can have an optional `when` condition. If the condition is **not met**, the predicate is **skipped** (vacuous pass) — it does not fail.
This is useful for rules that only apply in certain contexts.
### When Condition Structure
```yaml
when:
claim: <string> # references a defined claim
rule: <rule_type> # any predicate rule type
value: <any> # optional, depends on rule
```
### Examples
```yaml
# Only enforce endpoint count when there are breaking changes
predicates:
- claim: api_endpoints
rule: min_length
value: 3
source: task_prompt
when:
claim: is_breaking
rule: equals
value: true
notes: "Breaking changes must document all endpoints"
# Only check test coverage when tests exist
predicates:
- claim: coverage_percent
rule: greater_than
value: 80
source: memory
when:
claim: has_tests
rule: exists
# Only enforce format when feature is present
predicates:
- claim: output_format
rule: any_of
value: [json, yaml]
source: task_prompt
when:
claim: has_output
rule: exists
```
### When with Regex Matching
```yaml
# Only require reply_to_message_id when subject starts with "Re: "
predicates:
- claim: reply_to_id
rule: exists
source: task_prompt
when:
claim: subject_line
rule: matches
value: "^Re: "
notes: Reply emails must have reply_to_message_id set
```
## Null Handling
Null values in the action envelope have specific semantics:
- **`null` is treated as absent** — `exists` returns false, `not_exists` returns true
- This applies to both the invariants evaluator and the datalog compiler
- A fact with value `null` produces **no datalog facts** (it is skipped entirely)
### Common Pattern: Asserting Absence
```yaml
# In the envelope:
facts:
breaking_changes: null # explicitly absent
# In the rulespec:
claims:
- name: breaking
selector: breaking_changes
predicates:
- claim: breaking
rule: not_exists
source: task_prompt
notes: "No breaking changes" # ✅ This passes
```
### Edge Cases
| Envelope Value | `exists` | `not_exists` | `contains "x"` | `equals "y"` |
|---------------|----------|-------------|----------------|-------------|
| `null` | ❌ fail | ✅ pass | ❌ fail | ❌ fail |
| missing key | ❌ fail | ✅ pass | ❌ fail | ❌ fail |
| `""` (empty string) | ✅ pass | ❌ fail | ❌ fail | ❌ fail |
| `[]` (empty array) | ✅ pass | ❌ fail | ❌ fail | ❌ fail |
| `0` | ✅ pass | ❌ fail | ❌ fail | depends |
## Action Envelope Format
The action envelope is written via the `write_envelope` tool. It must have a top-level `facts:` key.
```yaml
facts:
feature_name:
capabilities: [cap_a, cap_b]
file: "src/feature.rs"
tests: ["test_a", "test_b"]
api_changes:
breaking: false
breaking_changes: null # Use null to assert absence
```
### Rules for Envelope Facts
1. **Must have `facts:` top-level key** — without it, the envelope is empty
2. **Use file paths as evidence**`"src/foo.rs"`, `"src/foo.rs:42"`
3. **Use `null` for explicit absence** — triggers `not_exists` predicates
4. **Arrays for lists** — capabilities, tests, endpoints
5. **Nested objects for grouping**`feature.capabilities`, `feature.file`
## Complete Example
```yaml
# analysis/rulespec.yaml
claims:
- name: caps
selector: csv_importer.capabilities
- name: file
selector: csv_importer.file
- name: tests
selector: csv_importer.tests
- name: breaking
selector: api_changes.breaking
- name: no_breaking
selector: breaking_changes
predicates:
# Must have capabilities
- claim: caps
rule: exists
source: task_prompt
# Must include handle_csv
- claim: caps
rule: contains
value: "handle_csv"
source: task_prompt
# Must NOT include deprecated features
- claim: caps
rule: not_contains
value: "legacy_parser"
source: memory
# At least 2 capabilities
- claim: caps
rule: min_length
value: 2
source: task_prompt
# File must be a Rust source
- claim: file
rule: matches
value: "^src/.*\\.rs$"
source: task_prompt
# Must have tests
- claim: tests
rule: min_length
value: 1
source: task_prompt
# No breaking changes
- claim: no_breaking
rule: not_exists
source: task_prompt
# If breaking, must document it
- claim: caps
rule: contains
value: "migration_guide"
source: task_prompt
when:
claim: breaking
rule: equals
value: true
notes: "Breaking changes require a migration guide capability"
```
## Verification Pipeline
1. Agent calls `write_envelope` with facts YAML
2. System writes `envelope.yaml` to session directory
3. System reads `analysis/rulespec.yaml` from working directory
4. Rulespec is compiled into datalog relations
5. Facts are extracted from envelope using claim selectors
6. Datalog rules are executed to fixed point
7. Results are written to `rulespec.compiled.dl` and `datalog_evaluation.txt`
8. Summary is returned to the agent
9. On plan completion, `plan_verify()` checks that the envelope exists
## Source Types
| Source | Meaning |
|--------|---------|
| `task_prompt` | Invariant derived from the user's task description |
| `memory` | Invariant derived from workspace memory (AGENTS.md, memory.md) |

View File

@@ -81,6 +81,7 @@ When marking done, add `evidence` and `notes` to the item.
Before marking the last plan item done, call `write_envelope` with facts about completed work. The envelope captures what was actually built so it can be verified against invariants in `analysis/rulespec.yaml` if present. The tool writes the envelope and runs datalog verification automatically. Before marking the last plan item done, call `write_envelope` with facts about completed work. The envelope captures what was actually built so it can be verified against invariants in `analysis/rulespec.yaml` if present. The tool writes the envelope and runs datalog verification automatically.
```yaml ```yaml
type: code_change
facts: facts:
csv_importer: csv_importer:
capabilities: [handle_headers, handle_tsv, handle_quoted_fields] capabilities: [handle_headers, handle_tsv, handle_quoted_fields]