Root cause: ActionEnvelope.to_yaml_value() creates a Mapping from the
facts HashMap without a 'facts:' wrapper key, but rulespec selectors
may include a 'facts.' prefix (e.g. 'facts.feature.done' instead of
'feature.done'). This caused zero facts to be extracted, making all
predicate evaluations fail.
Fix: extract_facts() now tries the selector against the unwrapped
envelope value first, and if empty, retries against a facts-wrapped
version as fallback.
Also:
- Strengthened write_envelope tool description to require top-level
facts: key, file paths for evidence, and allow free-form notes
- Updated system prompt with matching rules
- Added 6 new tests (4 unit, 2 integration)
- Strengthened existing integration test to verify fact count > 0
- New crates/g3-core/src/tools/envelope.rs with execute_write_envelope()
and verify_envelope() (moved from shadow_datalog_verify in plan.rs)
- write_envelope accepts YAML facts, writes envelope.yaml to session dir,
then runs datalog verification against analysis/rulespec.yaml in shadow mode
- plan_verify() now only checks envelope existence (no longer runs datalog)
- Tool count: 13 -> 14
- Updated system prompt to instruct agents to call write_envelope before
marking last plan item done
- Updated integration tests to use write_envelope tool directly
Workflow: write_envelope -> verify_envelope -> datalog shadow artifacts
plan_write(done) -> plan_verify -> checks envelope exists
- Remove rulespec parameter from plan_write tool definition and execution
- Remove rulespec compilation from plan_approve (no longer pre-compiles)
- Remove write_rulespec, get_rulespec_path, format_rulespec_yaml/markdown
from invariants.rs; read_rulespec() now takes &Path working dir
- Remove save/load_compiled_rulespec, get_compiled_rulespec_path from datalog.rs
- Update shadow_datalog_verify() to compile on-the-fly from
analysis/rulespec.yaml, writing rulespec.compiled.dl and
datalog_evaluation.txt to session dir
- Remove rulespec display from plan_read output
- Remove Invariants/Rulespec section from native.md system prompt
- Remove rulespec from prompts.rs plan_write format and examples
- Update existing tests to remove rulespec from plan_write calls
- Add 3 integration tests for on-the-fly rulespec verification
Removed redundant and vague content from prompts/system/native.md:
- Simplified intro from 17 lines to 3 lines
- Reduced Code Search section to one line
- Removed duplicate Plan Mode example (kept one)
- Removed Action Envelope section (rarely used correctly)
- Removed verbose Memory Format details (tool description covers it)
- Removed Response Guidelines (obvious to modern LLMs)
Size: 8,620 chars -> 4,498 chars
Also updated:
- G3_IDENTITY_LINE constant for agent mode compatibility
- Test assertions to check for new prompt markers
- System prompt validation to use new marker string
Solves the tautology problem where the LLM would write invariants after
implementation, making them match what was done rather than constrain it.
Changes:
- plan_write now accepts 'rulespec' parameter
- New plans REQUIRE rulespec (fails with helpful error if missing)
- Plan updates don't require rulespec (backward compatible)
- Rulespec is parsed, validated, and written atomically with plan
- Updated system prompt with clear examples for new vs update
- Updated tool definition schema
- Updated all affected tests
New flow: task → plan+rulespec → user reviews BOTH → approve → implement
- Web Research instructions now come from skills/research/SKILL.md
- Skills are dynamically loaded and injected via generate_skills_prompt()
- Remove test_both_prompts_have_web_research test (no longer applicable)
- Remove unused G3Status::research_complete() function
This completes the externalization of research as a skill.
Replaces the built-in research/research_status tools with a portable
skill-based approach:
- Add embedded skills infrastructure (skills compiled into binary)
- Add repo-local skills/ directory support (highest priority)
- Create research skill with SKILL.md and g3-research shell script
- Script extraction to .g3/bin/ with version tracking
- Filesystem-based handoff via .g3/research/<id>/status.json
- Remove PendingResearchManager and all research tool code
- Update system prompt to reference skill instead of tool
Benefits:
- No special tool infrastructure needed (just shell + read_file)
- Context-efficient (reports stay on disk until needed)
- Crash-resilient (state persisted to filesystem)
- Portable (skill can be overridden per-workspace)
Breaking change: research tool calls now return a deprecation message
pointing to the research skill.
- Move system prompt for native tool calling models to prompts/system/native.md
- Use include_str! to embed at compile time
- Remove concatenated SHARED_* string constants
- Prompt is now readable/editable as a complete markdown document
- Non-native prompt still uses Rust constants (acceptable for now)