Cross-Encoder Reranking & Contradiction Detection
Two new features that make OMEGA's memory smarter: neural re-scoring of search results and automatic detection of conflicting information. Both run locally. Zero LLM calls.
The problem with flat retrieval
Most memory systems retrieve results using a single similarity score. You embed the query, compare it to stored embeddings, sort by cosine distance. Simple. Fast. And often wrong.
The issue is that separate embeddings can't model the full relationship between a query and a passage. "How do I handle authentication?" and "JWT refresh token rotation pattern" are semantically related, but their individual embeddings might not be close enough to surface the connection. Important results get buried.
The standard fix is a cross-encoder: encode the query and passage together as a pair, and let the model attend to both simultaneously. This is more expensive than comparing pre-computed embeddings, but dramatically more accurate for re-scoring a shortlist.
How OMEGA's reranking works
OMEGA's query pipeline now has a cross-encoder stage between heuristic scoring and final deduplication:
The first stages (vector similarity, FTS5, type weighting, contextual boosting) produce a ranked list of candidates. The cross-encoder then re-scores the top 20 results by encoding each (query, passage) pair.
OMEGA uses ms-marco-MiniLM-L-6-v2, a 22 MB model that runs via ONNX Runtime on CPU. No GPU, no API key, no external calls. The model was trained on the MS MARCO passage ranking dataset, specifically designed for this task.
The cross-encoder score is applied as a multiplicative boost (1.0 + 0.3 × normalized score), so it refines the existing ranking without overriding signals from user feedback, access frequency, or memory type. If the model is unavailable, the pipeline falls back gracefully to standard retrieval.
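The rescoring step can be sketched in a few lines. This is an illustrative implementation, not OMEGA's actual code: `cross_encoder_score` stands in for the ONNX model call, and candidates are assumed to arrive as (passage, heuristic_score) pairs sorted by heuristic score.

```python
# Sketch of the re-scoring stage: multiplicative boost over heuristic scores.
# `cross_encoder_score` is a stand-in for the real ONNX model call; names
# here are illustrative, not OMEGA's actual API.

def rerank(query, candidates, cross_encoder_score, top_k=20):
    """Re-score the top_k candidates; fall back to the input order on failure.

    candidates: list of (passage, heuristic_score), sorted descending.
    """
    head, tail = candidates[:top_k], candidates[top_k:]
    try:
        raw = [cross_encoder_score(query, passage) for passage, _ in head]
    except Exception:
        return candidates  # graceful fallback: keep the heuristic ranking

    # Normalize raw scores to [0, 1] so the boost stays bounded.
    lo, hi = min(raw), max(raw)
    span = (hi - lo) or 1.0
    boosted = [
        (passage, score * (1.0 + 0.3 * ((r - lo) / span)))
        for (passage, score), r in zip(head, raw)
    ]
    boosted.sort(key=lambda pair: pair[1], reverse=True)
    return boosted + tail
```

Because the boost is multiplicative and capped at 1.3×, a passage the cross-encoder loves but the heuristics ranked low can climb, yet it can't leapfrog a result with far stronger feedback and frequency signals.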
# What the pipeline looks like in practice:
$ omega_query("auth patterns")
[vector+fts5] 47 candidates found
[heuristic] scored and sorted
[reranking] top 20 rescored by cross-encoder (12ms)
1. JWT refresh token pattern (0.94)
2. OAuth2 PKCE flow decision (0.87)
3. Session cookie migration lesson (0.81)
4. API key rotation schedule (0.73)
5. CORS preflight caching note (0.68)
Contradiction detection
Memory systems have a dirty secret: they accumulate contradictions. You store "prefers dark mode" in January, then "switched to light mode" in March. Both memories exist. Neither knows about the other. When the agent retrieves them, it gets confused.
OMEGA now detects contradictions at write time. When you store a new memory, the system finds the 10 most similar existing memories (using the cross-encoder as a similarity gate) and runs four heuristic signal detectors:
| Signal | Example | Confidence |
|---|---|---|
| Negation | "likes X" vs "doesn't like X" | Up to 0.8 |
| Antonyms | "prefers dark mode" vs "prefers light mode" | Up to 0.9 |
| Preference Change | "uses vim" vs "uses neovim" | Up to 0.8 |
| Temporal Override | "now uses Python" vs "uses JavaScript" | Up to 0.6 |
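To make the table concrete, here is a minimal sketch of the first signal, negation. The word list and crude stemming are illustrative assumptions; only the 0.8 cap comes from the table above.

```python
# Minimal sketch of the negation signal: flag pairs where one statement
# negates the other. The negator list and stemming are simplified stand-ins;
# the 0.8 confidence cap mirrors the table above.

NEGATORS = {"not", "doesn't", "don't", "never", "no"}

def _stem(word):
    # Crude normalization so "likes" matches "like".
    return word[:-1] if word.endswith("s") and len(word) > 3 else word

def negation_signal(new_text: str, old_text: str) -> float:
    def split(text):
        words = text.lower().split()
        negated = any(w in NEGATORS for w in words)
        core = [_stem(w) for w in words if w not in NEGATORS]
        return negated, core

    neg_a, core_a = split(new_text)
    neg_b, core_b = split(old_text)
    # Exactly one side negated, same core content -> likely contradiction.
    return 0.8 if neg_a != neg_b and core_a == core_b else 0.0
```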
Each signal returns a confidence score. If the weighted combination exceeds 0.4, OMEGA flags it as a contradiction and annotates both memories with bidirectional metadata: the new memory gets a contradicts field, the old one gets contradicted_by. A graph edge is also created so you can traverse the relationship.
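The combination step might look like this sketch. The Memory shape, the detector weights, and the field layout are assumptions for illustration; the source specifies only the 0.4 threshold and the bidirectional contradicts / contradicted_by annotations.

```python
# Hedged sketch of combining signal scores at write time. Weights and the
# Memory shape are illustrative; OMEGA's source states the 0.4 threshold
# and the bidirectional contradicts / contradicted_by fields.

from dataclasses import dataclass, field

@dataclass
class Memory:
    id: str
    text: str
    meta: dict = field(default_factory=dict)

def check_contradiction(new: Memory, old: Memory, detectors, threshold=0.4):
    # detectors: list of (signal_fn, weight); combine as a weighted average.
    scores = [(fn(new.text, old.text), w) for fn, w in detectors]
    total_w = sum(w for _, w in scores) or 1.0
    combined = sum(conf * w for conf, w in scores) / total_w
    if combined > threshold:
        # Bidirectional metadata; OMEGA also creates a graph edge here.
        new.meta.setdefault("contradicts", []).append(old.id)
        old.meta.setdefault("contradicted_by", []).append(new.id)
        return combined
    return None
```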
The key design choice: no LLM calls. The entire detection system uses pure heuristic functions. This keeps it fast (sub-millisecond per pair), free (no API costs), and deterministic (same input always produces same output).
# Contradiction detection in action:
$ omega_store("switched to light mode")
[store] Memory saved (mem-a3f2...)
[contradiction] Conflicts detected:
"prefers dark mode" (Jan 12) — confidence: 0.82
Signals: antonym (dark/light), temporal ("switched to")
Both memories annotated. Graph edge created.
What this means in practice
These features work together. Cross-encoder reranking improves the quality of what gets surfaced. Contradiction detection keeps the knowledge base clean over time. Neither requires configuration, API keys, or a GPU.
For developers using OMEGA: you don't need to do anything. Both features are enabled by default. The cross-encoder model downloads automatically on first query (~22 MB). Contradiction detection runs on every store operation. If you want to disable reranking (for benchmarks, say), set OMEGA_CROSS_ENCODER=0.
Technical details
A few implementation notes for those who want to dig deeper:
Reranker circuit breaker. If the model fails to load three times, OMEGA stops trying for 5 minutes. This prevents repeated download attempts on flaky connections from slowing down queries.
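A circuit breaker with these parameters might be sketched as follows. The class name and the injectable clock are illustrative; only the three-failure trip and five-minute cooldown come from the text.

```python
# Sketch of the reranker circuit breaker: after 3 consecutive load failures,
# skip further attempts for 5 minutes. The clock is injectable so the
# behavior is testable; names are illustrative, not OMEGA's internals.

import time

class ModelCircuitBreaker:
    def __init__(self, max_failures=3, cooldown=300.0, clock=time.monotonic):
        self.max_failures = max_failures
        self.cooldown = cooldown
        self.clock = clock
        self.failures = 0
        self.opened_at = None

    def allow(self):
        if self.opened_at is None:
            return True
        if self.clock() - self.opened_at >= self.cooldown:
            # Cooldown elapsed: reset and permit a retry.
            self.opened_at = None
            self.failures = 0
            return True
        return False

    def record_failure(self):
        self.failures += 1
        if self.failures >= self.max_failures:
            self.opened_at = self.clock()

    def record_success(self):
        self.failures = 0
        self.opened_at = None
```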
Contradiction similarity gate. OMEGA only runs contradiction checks on memories with a normalized similarity score above 0.3. This filters out obvious non-matches early, keeping the heuristic checks fast even with large memory stores.
Antonym dictionary. The antonym detector uses 20+ bidirectional pairs (light/dark, enable/disable, allow/deny, etc.) with normalized text comparison. It's intentionally conservative. False negatives are better than false positives when flagging contradictions.
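The antonym check reduces to a symmetric dictionary lookup over normalized tokens. This sketch uses three of the pairs named above; the real list has 20+, and the shared-context requirement is an assumption about how the detector stays conservative.

```python
# Sketch of the antonym signal: a small bidirectional pair dictionary over
# normalized text. The real list has 20+ pairs; this subset and the 0.9
# confidence cap follow the table above.

ANTONYM_PAIRS = [("light", "dark"), ("enable", "disable"), ("allow", "deny")]
ANTONYMS = {a: b for a, b in ANTONYM_PAIRS} | {b: a for a, b in ANTONYM_PAIRS}

def antonym_signal(new_text: str, old_text: str) -> float:
    a_words = set(new_text.lower().split())
    b_words = set(old_text.lower().split())
    for word in a_words:
        opposite = ANTONYMS.get(word)
        # Opposed terms plus at least one shared context word -> signal;
        # the context requirement keeps the detector conservative.
        if opposite in b_words and (a_words - {word}) & (b_words - {opposite}):
            return 0.9
    return 0.0
```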
Non-blocking store path. Contradiction detection runs after the memory is stored and embedded, outside the main lock. Store operations don't get slower.
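The store-path ordering can be sketched with a background thread. This is one plausible shape, not OMEGA's actual implementation: the point is that the write happens under the lock and detection does not.

```python
# Sketch of the non-blocking store path: persist the memory under the lock,
# then run contradiction detection on a background thread so the caller
# never waits on it. Names are illustrative, not OMEGA's internals.

import threading

class MemoryStore:
    def __init__(self, detect_contradictions):
        self._lock = threading.Lock()
        self._memories = []
        self._detect = detect_contradictions

    def store(self, memory):
        with self._lock:
            self._memories.append(memory)  # fast path: write and return
        # Detection runs outside the lock, off the caller's thread.
        worker = threading.Thread(
            target=self._detect, args=(memory,), daemon=True
        )
        worker.start()
        return worker  # returned only so callers/tests can join if needed
```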
Related reading
Both features shipped in OMEGA v1.0. Source on GitHub.