Commit Graph

18 Commits

Author SHA1 Message Date
Dhanji R. Prasanna
473fd9d942 Fix backticks in project yaml read error 2026-04-08 11:09:59 +10:00
Dhanji R. Prasanna
7347d92ae8 Make plan approval gate non-destructive and baseline-aware
- Remove all file revert/delete logic from check_plan_approval_gate:
  no more git checkout or fs::remove_file calls. The gate only warns.
- Remove reverted_files field from ApprovalGateResult::Blocked.
- Add get_dirty_files() helper to snapshot dirty files as a HashSet.
- Capture baseline dirty files when plan mode starts (set_plan_mode).
  Pre-existing dirty files are excluded from gate checks so they
  never trigger blocking.
- Add 5 new unit tests covering non-destructive behavior, baseline
  exclusion, and mixed baseline/new file scenarios.
- Update integration test to match new non-destructive semantics.
2026-02-15 09:53:14 +11:00
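
The baseline exclusion described in this commit can be sketched roughly as below. This is a minimal illustration, not the code from the repository: `parse_porcelain`, `new_dirty_files`, `GateOutcome`, and `check_gate` are hypothetical names, and the parser assumes plain `git status --porcelain` (v1) output with unquoted paths.

```rust
use std::collections::HashSet;

/// Parse `git status --porcelain` (v1) output into a set of dirty paths.
/// Each line is two status characters, a space, then the path. Renames are
/// reported as `old -> new`; only the new path is kept here.
/// Assumption: quoted/escaped paths are not handled in this sketch.
fn parse_porcelain(output: &str) -> HashSet<String> {
    output
        .lines()
        .filter_map(|line| line.get(3..))
        .filter(|path| !path.is_empty())
        .map(|path| path.rsplit(" -> ").next().unwrap_or(path).to_string())
        .collect()
}

/// Files that became dirty after the baseline snapshot was taken.
/// Pre-existing dirty files are excluded, so they can never trigger the gate.
fn new_dirty_files(baseline: &HashSet<String>, current: &HashSet<String>) -> Vec<String> {
    let mut files: Vec<String> = current.difference(baseline).cloned().collect();
    files.sort();
    files
}

/// Hypothetical result type for this sketch (the real enum may differ).
#[derive(Debug, PartialEq)]
enum GateOutcome {
    Allowed,
    Blocked { files: Vec<String> },
}

/// Non-destructive gate: it only reports offending files and never
/// reverts or deletes anything.
fn check_gate(
    plan_approved: bool,
    baseline: &HashSet<String>,
    current: &HashSet<String>,
) -> GateOutcome {
    if plan_approved {
        return GateOutcome::Allowed;
    }
    let files = new_dirty_files(baseline, current);
    if files.is_empty() {
        GateOutcome::Allowed
    } else {
        GateOutcome::Blocked { files }
    }
}
```

In this sketch the baseline would be captured once when plan mode starts, and the current set recomputed after each tool call.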
Dhanji R. Prasanna
1d77f3f865 fix: allow new plan_write after completed approved plan
When an approved plan was fully complete (all items done/blocked),
plan_write blocked creating a new plan with a 'Cannot remove item'
error. It now checks is_complete() first: complete plans allow fresh
plan creation without carrying over approved_revision or enforcing
item ID preservation.

Adds 4 end-to-end integration tests covering happy path, negative
(in-progress still blocks), and boundary cases (all-blocked, mixed).
2026-02-14 12:27:38 +11:00
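
A rough sketch of the ordering this fix describes, with the completeness check running before item ID preservation is enforced. The types, the `validate_write` helper, and the error wording are hypothetical; treating an empty plan as not complete is also an assumption of this sketch.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum State {
    Todo,
    Doing,
    Done,
    Blocked,
}

struct Item {
    id: String,
    state: State,
}

struct Plan {
    approved_revision: Option<u32>,
    items: Vec<Item>,
}

impl Plan {
    /// A plan is complete when every item is done or blocked.
    /// Assumption: an empty plan does not count as complete.
    fn is_complete(&self) -> bool {
        !self.items.is_empty()
            && self
                .items
                .iter()
                .all(|i| matches!(i.state, State::Done | State::Blocked))
    }
}

/// Decide whether `incoming` may replace `current` (hypothetical helper).
fn validate_write(current: Option<&Plan>, incoming: &Plan) -> Result<(), String> {
    let current = match current {
        Some(plan) => plan,
        None => return Ok(()), // no existing plan: anything goes
    };
    // Check completeness first: a finished plan may be replaced by a fresh one.
    if current.is_complete() {
        return Ok(());
    }
    // Otherwise an approved plan must keep all of its item IDs.
    if current.approved_revision.is_some() {
        for item in &current.items {
            if !incoming.items.iter().any(|i| i.id == item.id) {
                return Err(format!("Cannot remove item '{}'", item.id));
            }
        }
    }
    Ok(())
}
```

Under these assumptions, an in-progress approved plan still rejects a write that drops one of its items, while an all-done or all-blocked plan does not.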
Dhanji R. Prasanna
7032e75fc6 Add write_envelope tool with verify_envelope for explicit envelope creation
- New crates/g3-core/src/tools/envelope.rs with execute_write_envelope()
  and verify_envelope() (moved from shadow_datalog_verify in plan.rs)
- write_envelope accepts YAML facts, writes envelope.yaml to session dir,
  then runs datalog verification against analysis/rulespec.yaml in shadow mode
- plan_verify() now only checks envelope existence (no longer runs datalog)
- Tool count: 13 -> 14
- Updated system prompt to instruct agents to call write_envelope before
  marking last plan item done
- Updated integration tests to use write_envelope tool directly

Workflow: write_envelope -> verify_envelope -> datalog shadow artifacts
          plan_write(done) -> plan_verify -> checks envelope exists
2026-02-06 16:09:07 +11:00
Dhanji R. Prasanna
f7a240a99b refactor: decouple rulespec from plan_write, read from analysis/rulespec.yaml
- Remove rulespec parameter from plan_write tool definition and execution
- Remove rulespec compilation from plan_approve (no longer pre-compiles)
- Remove write_rulespec, get_rulespec_path, format_rulespec_yaml/markdown
  from invariants.rs; read_rulespec() now takes &Path working dir
- Remove save/load_compiled_rulespec, get_compiled_rulespec_path from datalog.rs
- Update shadow_datalog_verify() to compile on-the-fly from
  analysis/rulespec.yaml, writing rulespec.compiled.dl and
  datalog_evaluation.txt to session dir
- Remove rulespec display from plan_read output
- Remove Invariants/Rulespec section from native.md system prompt
- Remove rulespec from prompts.rs plan_write format and examples
- Update existing tests to remove rulespec from plan_write calls
- Add 3 integration tests for on-the-fly rulespec verification
2026-02-06 15:31:23 +11:00
Dhanji R. Prasanna
abfac197ab Add datalog-based invariant verification system
Implement a new datalog verification layer using datafrog that:

- Compiles rulespec to datalog on plan_approve
- Extracts facts from action envelope using selectors
- Executes datalog rules on plan_verify
- Writes evaluation results to datalog_evaluation.txt (shadow mode)

Key components:
- crates/g3-core/src/tools/datalog.rs: Full datalog module with:
  - compile_rulespec(): Validates and compiles rulespec
  - extract_facts(): Extracts facts from envelope YAML
  - execute_rules(): Runs datafrog iteration
  - 23 comprehensive tests

- crates/g3-core/src/tools/plan.rs:
  - execute_plan_approve(): Now compiles rulespec on approval
  - shadow_datalog_verify(): Runs datalog and writes to eval file

Results are written to .g3/sessions/<id>/datalog_evaluation.txt
for inspection, NOT injected into context window (shadow mode).
2026-02-06 13:50:54 +11:00
Dhanji R. Prasanna
7e2d9bc22c Enforce rulespec creation with plan_write for new plans
Solves the tautology problem where the LLM would write invariants after
implementation, making them match what was done rather than constrain it.

Changes:
- plan_write now accepts 'rulespec' parameter
- New plans REQUIRE rulespec (fails with helpful error if missing)
- Plan updates don't require rulespec (backward compatible)
- Rulespec is parsed, validated, and written atomically with plan
- Updated system prompt with clear examples for new vs update
- Updated tool definition schema
- Updated all affected tests

New flow: task → plan+rulespec → user reviews BOTH → approve → implement
2026-02-05 21:12:02 +11:00
Dhanji R. Prasanna
b2fbcf33d0 Fix plan approval gate and add "Create a plan:" prefix for first message
- Fix build warnings: add #[allow(dead_code)] to unused deserialization fields
- Fix plan approval gate bug: block file changes when no plan exists (not just
  when plan exists but is unapproved)
- Add "Create a plan: " prefix to first user message in plan mode
- Add prepare_plan_mode_input() helper function for testability
- Reset is_first_plan_message flag when entering plan mode via /plan command
- Add tests for approval gate (no plan + no changes, no plan + changes)
- Add tests for prepare_plan_mode_input (happy, negative, boundary cases)
2026-02-05 19:43:38 +11:00
Dhanji R. Prasanna
06d75f613c feat(plan): display rulespec.yaml and envelope.yaml in plan_read/plan_write output
- Add format_envelope_markdown() function in invariants.rs for rich markdown
  formatting of ActionEnvelope facts
- Add format_yaml_value_markdown() helper for recursive YAML value display
- Update execute_plan_read() to append rulespec and envelope sections
- Update execute_plan_write() to append envelope section alongside rulespec
- Add 3 tests for format_envelope_markdown (empty, with facts, null values)

When plan_read or plan_write is called, the output now includes:
- Plan YAML (as before)
- Rulespec section (if rulespec.yaml exists) with invariants grouped by source
- Envelope section (if envelope.yaml exists) with facts in readable format

Missing files show placeholder text rather than errors.
2026-02-05 19:08:55 +11:00
Dhanji R. Prasanna
3d284b8b60 Merge sessions/interactive/179ac8a6 2026-02-05 11:37:07 +11:00
Dhanji R. Prasanna
1f1a517620 feat(plan): support multiple negative and boundary checks
Change Plan Mode to allow multiple negative and boundary checks per item,
while keeping happy path as a single check.

Schema change:
- checks.negative: Check -> Vec<Check> (>=1 required)
- checks.boundary: Check -> Vec<Check> (>=1 required)
- checks.happy: Check (unchanged, single)

This better reflects real-world tasks where there are often multiple
error conditions and edge cases worth tracking.

Changes:
- Update Checks struct to use Vec<Check> for negative/boundary
- Update validation to require at least 1 of each
- Update prompts and tool definitions with new array syntax
- Add 4 new tests for multi-check scenarios
2026-02-05 11:36:45 +11:00
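
The schema change above might look roughly like this. It is only a sketch: the `Check` field `desc` and the error messages are hypothetical, and the real structs presumably also derive serde traits for YAML, which are omitted here to keep the example dependency-free.

```rust
/// A single check. The `desc` field is a placeholder for this sketch.
#[derive(Debug, Clone, PartialEq)]
struct Check {
    desc: String,
}

/// One happy-path check, plus one or more negative and boundary checks.
#[derive(Debug, Clone, PartialEq)]
struct Checks {
    happy: Check,
    negative: Vec<Check>,
    boundary: Vec<Check>,
}

impl Checks {
    /// Require at least one negative and one boundary check.
    fn validate(&self) -> Result<(), String> {
        if self.negative.is_empty() {
            return Err("checks.negative requires at least 1 check".to_string());
        }
        if self.boundary.is_empty() {
            return Err("checks.boundary requires at least 1 check".to_string());
        }
        Ok(())
    }
}
```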
Dhanji R. Prasanna
c347a73cbd Add plan approval gate to block file changes without approved plan
- Add check_plan_approval_gate() in tools/plan.rs that runs after each tool call
- Detects file changes via git status --porcelain when a plan exists but is not approved
- Reverts changes: git checkout for modified files, rm for new untracked files
- Returns blocking message instructing LLM to create/approve plan first
- Add ApprovalGateResult enum with Allowed/Blocked/NotGitRepo variants
- Add set_session_id() and set_working_dir() methods on Agent for testing
- Add integration test using MockProvider to simulate blocked write_file
2026-02-05 11:34:10 +11:00
Dhanji R. Prasanna
3046f0dd6e feat: Add invariants system for Plan Mode verification
Adds rulespec.yaml and envelope.yaml support for machine-readable
invariant checking during plan completion.

- Add invariants module with Rulespec, ActionEnvelope, and evaluation logic
- Add Invariants section to system prompt with workflow instructions
- Show rulespec/envelope file status in plan verification output
- Rulespec written during planning (captures constraints from task)
- Envelope written after implementation (documents what was built)
2026-02-04 20:49:58 +11:00
Dhanji R. Prasanna
263a838d31 Remove redundant 'No plan exists' message from plan_read output
The UI already shows 'empty' via print_plan_compact, so returning an
empty string avoids duplicate output.
2026-02-02 17:19:01 +11:00
Dhanji R. Prasanna
e332109273 Auto-approve plans in non-interactive (autonomous/one-shot) mode
- Add auto-approval logic in execute_plan_write() when ctx.is_autonomous is true
- Update system prompt to document auto-approval behavior
- Plans still require explicit approval in interactive mode
2026-02-02 17:16:21 +11:00
Dhanji R. Prasanna
571188305a feat: add compact UI output for Plan Mode tools
Plan tools (plan_read, plan_write) now display with elegant tree-style
formatting similar to the old todo_write UI:

- State indicators: □ (todo), ◐ (doing), ■ (done), ⊘ (blocked)
- Tree prefixes (├/└) for items with child details
- Strikethrough for completed items
- Shows touches and all three checks (happy/negative/boundary)
- Displays plan file path link at the end

plan_approve uses compact single-line format like read_file:
- Shows approval status and revision number
- Handles already-approved and error cases

Changes:
- Add print_plan_compact() to UiWriter trait with default impl
- Implement print_plan_compact() in ConsoleUiWriter
- Call print_plan_compact() from execute_plan_read/write
- Add plan_read/plan_write to is_self_handled_tool()
- Add plan_approve to is_compact_tool() with format_plan_approve_summary()
- Add serde_yaml dependency to g3-cli
2026-02-02 15:30:05 +11:00
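
To illustrate the state indicators and tree prefixes listed above, here is a small sketch. The `indicator` and `render_item` functions and the exact indentation are assumptions, not the actual print_plan_compact() implementation, and strikethrough for completed items is left out.

```rust
#[derive(Clone, Copy)]
enum State {
    Todo,
    Doing,
    Done,
    Blocked,
}

/// Map a plan item state to its indicator glyph.
fn indicator(state: State) -> char {
    match state {
        State::Todo => '□',
        State::Doing => '◐',
        State::Done => '■',
        State::Blocked => '⊘',
    }
}

/// Render an item title followed by its detail lines.
/// The last detail gets '└', all earlier ones get '├'.
fn render_item(state: State, title: &str, details: &[&str]) -> Vec<String> {
    let mut out = vec![format!("{} {}", indicator(state), title)];
    for (i, detail) in details.iter().enumerate() {
        let prefix = if i + 1 == details.len() { '└' } else { '├' };
        out.push(format!("  {} {}", prefix, detail));
    }
    out
}
```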
Dhanji R. Prasanna
d6b7177107 Implement plan_verify() for deterministic evidence validation
Adds a verification system that checks evidence in completed plan items:

- Evidence parsing: supports code locations (file:line, file:line-line, file only)
  and test references (file::test_name)
- Code location verification: checks file exists, validates line numbers in range
- Test reference verification: checks test file exists, searches for fn pattern
- Verification results: Verified, Warning, Error, Skipped statuses
- Loud output formatting with emoji indicators for warnings/errors
- Integration with execute_plan_write(): runs when plan is complete and approved
- 12 new unit tests covering parsing and verification

Warnings are advisory and don't block; errors are loud but also don't block.
Blocked items are skipped during verification.
2026-02-02 15:15:03 +11:00
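
The evidence formats listed in this commit could be parsed along the following lines. This is a sketch under assumptions: the `Evidence` enum, `parse_evidence`, and `lines_in_range` are hypothetical names, line numbers are taken to be 1-based, and paths containing a colon (such as Windows drive letters) are not handled.

```rust
#[derive(Debug, PartialEq)]
enum Evidence {
    File { path: String },
    Line { path: String, line: usize },
    Range { path: String, start: usize, end: usize },
    Test { path: String, name: String },
}

/// Parse `file`, `file:line`, `file:line-line`, or `file::test_name`.
/// Returns None for anything malformed.
fn parse_evidence(input: &str) -> Option<Evidence> {
    let s = input.trim();
    if s.is_empty() {
        return None;
    }
    // Test reference: file::test_name
    if let Some((path, name)) = s.split_once("::") {
        if path.is_empty() || name.is_empty() {
            return None;
        }
        return Some(Evidence::Test { path: path.to_string(), name: name.to_string() });
    }
    // Code location: file, file:line, or file:start-end
    let (path, loc) = match s.rsplit_once(':') {
        None => return Some(Evidence::File { path: s.to_string() }),
        Some(parts) => parts,
    };
    if path.is_empty() {
        return None;
    }
    if let Some((a, b)) = loc.split_once('-') {
        let start: usize = a.parse().ok()?;
        let end: usize = b.parse().ok()?;
        if start == 0 || end < start {
            return None;
        }
        Some(Evidence::Range { path: path.to_string(), start, end })
    } else {
        let line: usize = loc.parse().ok()?;
        if line == 0 {
            return None;
        }
        Some(Evidence::Line { path: path.to_string(), line })
    }
}

/// Check that a 1-based inclusive line range exists in the file content.
fn lines_in_range(content: &str, start: usize, end: usize) -> bool {
    start >= 1 && start <= end && end <= content.lines().count()
}
```

A verifier built on this would read the referenced file, call `lines_in_range` for code locations, and search the file for the test function name for test references.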
Dhanji R. Prasanna
a63950d8f5 Add Plan Mode to replace TODO system
Plan Mode is a cognitive forcing system that requires reasoning about:
- Happy path
- Negative case
- Boundary condition

New tools:
- plan_read: Read current plan for session
- plan_write: Create/update plan with YAML content (validates structure)
- plan_approve: Mark current revision as approved

New command:
- /feature <description>: Start Plan Mode for a new feature

Plan schema requires:
- plan_id, revision, approved_revision
- items with id, description, state, touches, checks (happy/negative/boundary)
- evidence and notes required when marking items done

Verification:
- plan_verify() called automatically when all items are done/blocked

Removed:
- todo_read, todo_write tools
- todo.rs module and related tests
2026-02-02 14:38:25 +11:00