ADR-001: Compounding Knowledge Engine (CKE)
Date: 2026-04-08
Status: Accepted (Internal R&D)
Deciders: Jason
Context
OMEGA has 1322 memories with sophisticated retrieval (RRF, Thompson sampling, decay, feedback). But knowledge doesn't actively compound: storing a new memory that confirms an existing thesis doesn't strengthen the thesis. Contradictions are detected but don't trigger re-evaluation of related beliefs. There's no scheduled audit that identifies knowledge gaps.
Karpathy's autoresearch (modify-eval-keep/discard) and LLM Wiki (ingest-ripple-lint) provide the pattern. OMEGA already has the primitives. The gap is the orchestration layer that connects them into a compounding loop.
Decision
Build a lightweight CompoundingEngine class in src/omega_platform/compounding.py that orchestrates existing primitives into a feedback loop. No new database tables. No schema changes. Uses existing metadata fields, edges, and feedback mechanisms.
Three Operations
1. RIPPLE (post-store enrichment)
When a new memory is stored, automatically:
- Find top-5 semantically similar memories (already done by _auto_relate)
- If similarity > 0.80 AND same entity/project: increment evidence_count in related memory metadata
- If contradiction detected: create contradicts edge, mark old as superseded
- If same thesis: strengthen (increment access_count on related memories)
- Log ripple effects to a ripple_log metadata field on the new memory
Trigger: Hook into _schedule_auto_relate() or post-store pipeline.
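The ripple steps above can be sketched roughly as follows. This is a minimal in-memory illustration, not OMEGA's actual API: the Memory dataclass, the neighbors argument (standing in for the output of _auto_relate), the contradicts callback, and the edges list are all hypothetical. Only the 0.80 threshold, the top-5 limit, and the metadata field names come from this ADR. Skipping strengthening for a contradicted memory is also an assumption.

```python
from dataclasses import dataclass, field

SIMILARITY_THRESHOLD = 0.80  # from this ADR
TOP_K = 5                    # from this ADR


@dataclass
class Memory:
    """Hypothetical stand-in for an OMEGA memory record."""
    id: str
    entity: str
    metadata: dict = field(default_factory=dict)
    access_count: int = 0
    superseded: bool = False


def ripple(new, neighbors, contradicts, edges):
    """Apply ripple effects of `new` to its most similar neighbors.

    neighbors:   list of (Memory, similarity) pairs, assumed to come from _auto_relate
    contradicts: hypothetical predicate (new, old) -> bool
    edges:       list collecting (source_id, edge_type, target_id) tuples
    """
    log = []
    top = sorted(neighbors, key=lambda pair: pair[1], reverse=True)[:TOP_K]
    for old, similarity in top:
        if contradicts(new, old):
            edges.append((new.id, "contradicts", old.id))
            old.superseded = True
            log.append({"target": old.id, "effect": "superseded"})
            continue  # assumption: a contradicted memory is not also strengthened
        if similarity > SIMILARITY_THRESHOLD and old.entity == new.entity:
            old.metadata["evidence_count"] = old.metadata.get("evidence_count", 0) + 1
            log.append({"target": old.id, "effect": "evidence_count+1"})
        thesis_id = new.metadata.get("thesis_id")
        if thesis_id is not None and old.metadata.get("thesis_id") == thesis_id:
            old.access_count += 1
            log.append({"target": old.id, "effect": "access_count+1"})
    # Record what happened on the new memory itself.
    new.metadata["ripple_log"] = log
    return log
```

Passing the neighbors in, rather than querying for them, keeps the sketch consistent with the intent that ripple reuses the similarity search _auto_relate already performs.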
2. LINT (periodic knowledge audit)
Scheduled pass (daily or on-demand) that scores the knowledge base:
- Orphan memories: 0 edges, 0 access, >30 days old. Flag for review.
- Stale theses: Decisions/lessons not accessed in 60 days. Flag as potentially outdated.
- Contradiction clusters: Groups of memories with contradicts edges. Surface for resolution.
- Coverage gaps: Entity IDs with <3 memories. Projects with no recent decisions.
- Prediction accuracy: Oracle calibration rollup. Which domains are we miscalibrated on?
- Strength distribution: How many memories have strength <0.1? Is knowledge decaying faster than it compounds?
Output: A lint_report memory (event_type: advisor_insight) summarizing findings.
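A few of the lint checks above (orphans, stale theses, coverage gaps, strength distribution) could look something like the sketch below. The MemoryStats record and its field names are hypothetical, as is the "lesson" event type name. The thresholds (30 days, 60 days, fewer than 3 memories, strength below 0.1) are taken from this ADR. Contradiction clusters and Oracle calibration are left out because they depend on storage details not described here.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class MemoryStats:
    """Hypothetical per-memory summary; field names are illustrative."""
    id: str
    event_type: str
    entity_id: str
    edge_count: int
    access_count: int
    created_at: datetime
    last_accessed: datetime
    strength: float


def lint(memories, now):
    """Return findings for a subset of the lint checks."""
    # Orphans: no edges, never accessed, older than 30 days.
    orphans = [
        m.id for m in memories
        if m.edge_count == 0 and m.access_count == 0
        and now - m.created_at > timedelta(days=30)
    ]
    # Stale theses: decisions/lessons not accessed in 60 days.
    stale = [
        m.id for m in memories
        if m.event_type in ("decision", "lesson")
        and now - m.last_accessed > timedelta(days=60)
    ]
    # Coverage gaps: entity IDs with fewer than 3 memories.
    per_entity = Counter(m.entity_id for m in memories)
    gaps = sorted(e for e, n in per_entity.items() if n < 3)
    # Strength distribution: count of memories that have nearly decayed away.
    weak = sum(1 for m in memories if m.strength < 0.1)
    return {
        "orphans": orphans,
        "stale_theses": stale,
        "coverage_gaps": gaps,
        "weak_count": weak,
    }
```

The returned dict is the kind of payload that could be summarized into the lint_report memory. Taking now as a parameter keeps the checks deterministic and easy to test.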
3. THESIS TRACKING (evolving beliefs)
New metadata pattern (not a new type, uses existing decision type):
{
  "event_type": "decision",
  "metadata": {
    "thesis": true,
    "thesis_id": "thesis-ai-healthcare-2026",
    "confidence": 0.7,
    "evidence_count": 5,
    "evidence_for": ["mem-abc", "mem-def"],
    "evidence_against": ["mem-ghi"],
    "last_evaluated": "2026-04-08",
    "domain": "ai/healthcare"
  }
}
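The ripple operation updates confidence from the evidence ratio, where evidence_for and evidence_against stand for the number of memory IDs in the corresponding lists:
confidence = evidence_for / (evidence_for + evidence_against + 1)
The +1 in the denominator keeps confidence below 1.0 and yields 0 when no evidence has been recorded.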
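As a small worked example, the formula can be written as a function over the two ID lists. The function name thesis_confidence is hypothetical, and reading evidence_for and evidence_against as list lengths is an assumption based on the metadata pattern.

```python
def thesis_confidence(evidence_for, evidence_against):
    """Evidence-ratio confidence as given in this ADR.

    Both arguments are lists of memory IDs; only their lengths matter.
    """
    n_for = len(evidence_for)
    n_against = len(evidence_against)
    return n_for / (n_for + n_against + 1)


# Two supporting memories and one opposing memory: 2 / (2 + 1 + 1)
print(thesis_confidence(["mem-abc", "mem-def"], ["mem-ghi"]))  # 0.5
```

Under this formula a thesis with nine supporting memories and none against scores 0.9, so confidence approaches but never reaches 1.0 as supporting evidence accumulates.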
Architecture
Store Memory
    |
    v
[Existing] _auto_relate() -> creates edges
    |
    v
[NEW] ripple() -> strengthen related, detect contradictions,
                  update thesis confidence, log effects
    |
    v
[Existing] Strength computed on next query (automatic)

[Scheduled]
    |
    v
lint() -> scan for orphans, stale, gaps,
          contradictions, calibration
    |
    v
Store lint_report as advisor_insight
    |
    v
[Optional] Generate research tasks from gaps
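The store-time flow in the diagram might be wired up along these lines. This is only a skeleton under stated assumptions: the CompoundingEngine method bodies are placeholders, on_memory_stored is a hypothetical hook name, and auto_relate is passed in as a callable standing in for the existing _auto_relate step.

```python
import threading


class CompoundingEngine:
    """Hypothetical skeleton; method bodies are placeholders."""

    def __init__(self):
        self.events = []

    def ripple(self, memory_id):
        # Placeholder: a real ripple would strengthen related memories,
        # detect contradictions, and update thesis confidence.
        self.events.append(("ripple", memory_id))

    def lint(self):
        # Placeholder: a real lint would scan the knowledge base.
        self.events.append(("lint", None))
        return {"event_type": "advisor_insight", "findings": []}


def on_memory_stored(engine, memory_id, auto_relate):
    """Post-store hook: relate first, then ripple, off the caller's thread."""
    def work():
        auto_relate(memory_id)    # existing step: creates edges
        engine.ripple(memory_id)  # new step: compounding effects

    thread = threading.Thread(target=work, daemon=True)
    thread.start()
    return thread
```

Running both steps in one background thread preserves the ordering shown in the diagram (edges exist before ripple reads them) while keeping the store call itself fast.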
What We DON'T Build
- No new database tables
- No new MCP tools (yet)
- No admin UI (yet)
- No changes to query pipeline
- No cron infrastructure (lint runs manually or from a simple timer for now)
Consequences
Positive
- Knowledge compounds automatically after every store
- Lint catches decay before it becomes invisible
- Thesis tracking gives explicit confidence levels on beliefs
- Uses existing primitives, minimal new code (~200 lines)
- VC case study demonstrates value concretely
Negative
- Ripple adds ~100ms of work per store operation (runs in a background thread, so acceptable)
- Lint on 1322 memories takes ~5-10 seconds
- Thesis confidence is simplistic (evidence ratio, not Bayesian)
Neutral
- Internal R&D only, can iterate freely without API stability concerns
- Seed data for VC case study exercises all three operations
Alternatives Considered
A: Instruction-only (program.md approach)
No code, just agent instructions. Rejected because it doesn't compound across sessions and requires the agent to remember to lint.
C: Full engine with new schema
New theses table, lint_results table, scheduled cron. Rejected as premature for R&D. Can upgrade later if the pattern proves valuable.