Name: OMEGA
Author: OMEGA

OMEGA Pro is the local memory and coordination layer for AI agents. The job is to give every agent on your machine a durable graph it can read, write, and reason against, with no data leaving the device unless you say so. The memory layer on a device decides what the agents running on top of it can become. This quarter shipped nine capabilities. Five belong to the Sovereign Frontier Memory Stack we covered in Memory + Dreaming + Audit, On Your Machine. The other four landed afterward on the platform layer: semantic depth, observability, and multi-agent ergonomics. All nine are ranked below by impact.

1. The tool surface now tunes to the deployment

The right MCP surface depends on what the agent is doing, and OMEGA_MODE now picks it at handshake. Solo is a four-call core (store, recall, evolve, forget) for a single agent on one machine. Multi-agent adds the coordination primitives (file claims, branch guards, peer messaging, task queues, deadlock detection) for workflows where more than one agent shares the store, without paying for surface the agent will never call. Enterprise opens the full Pro surface, 115+ tools across eight modules covering memory, coordination, knowledge graph, entity resolution, and routing.

The Solo profile alone reclaims roughly 13,000 tokens of context budget per session compared to the full surface. Combined with the sub-5 ms recall path, the model spends its context budget on reasoning instead of paying a tax on a tool surface it doesn't use.
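The profile selection described above can be sketched in a few lines. This is an illustration, not OMEGA's implementation: the mode strings ("solo", "multi-agent"), the coordination tool identifiers, the default, and both function names are assumptions. Only the four Solo calls and the OMEGA_MODE variable come from the text, and the Enterprise surface is left out because its tools are not enumerated here.

```python
import os

# The four Solo calls are named in the text.
SOLO_TOOLS = ["store", "recall", "evolve", "forget"]

# Hypothetical identifiers for the coordination primitives listed in the text.
COORDINATION_TOOLS = [
    "file_claim", "branch_guard", "peer_message", "task_queue", "deadlock_check",
]


def tools_for_mode(mode: str) -> list:
    """Return the tool names a server would advertise at handshake."""
    mode = mode.strip().lower()
    if mode == "solo":
        return list(SOLO_TOOLS)
    if mode == "multi-agent":
        return SOLO_TOOLS + COORDINATION_TOOLS
    raise ValueError(f"unknown OMEGA_MODE: {mode!r}")


def advertised_tools(env=os.environ) -> list:
    # Falling back to "solo" when the variable is unset is an assumption.
    return tools_for_mode(env.get("OMEGA_MODE", "solo"))
```

The point of resolving the profile once, at handshake, is that the agent never sees tool descriptions it has no use for, which is where the context savings would come from.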

2. Different data deserves different forgetting

Conversational turns, health summaries, calendar context, and short-lived secrets should not share a retention policy. Typed memory now ships with Pydantic-validated schemas, per-type retention windows, and per-type access scopes. Long-lived summaries persist for months. Conversational turns expire on a tighter schedule. Short-lived secrets die on first use. Retention is declared at the schema, validated at the boundary, and enforced by the engine. Setting extra="forbid" at the type boundary rejects unknown fields loudly instead of letting them silently enter the graph. The schema does the forgetting, not a trust relationship with the agent.
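A rough sketch of schema-declared retention follows. To stay dependency-free it uses standard-library dataclasses as a stand-in for the Pydantic models the text describes, and the unknown-key check in `parse` only mimics what extra="forbid" does. The class names, field names, and the 7-day and 180-day windows are all hypothetical.

```python
from dataclasses import dataclass, fields
from datetime import datetime, timedelta
from typing import ClassVar, Optional


@dataclass
class TypedMemory:
    """Retention lives on the type, not in a decision made by the agent."""
    retention: ClassVar[Optional[timedelta]] = None  # None: no time-based expiry
    single_use: ClassVar[bool] = False
    text: str
    created_at: datetime

    @classmethod
    def parse(cls, payload: dict) -> "TypedMemory":
        # Stand-in for extra="forbid": unknown keys fail loudly at the boundary.
        unknown = set(payload) - {f.name for f in fields(cls)}
        if unknown:
            raise ValueError(f"unknown fields for {cls.__name__}: {sorted(unknown)}")
        return cls(**payload)

    def expired(self, now: datetime, reads: int = 0) -> bool:
        if self.single_use and reads >= 1:
            return True
        return self.retention is not None and now - self.created_at > self.retention


@dataclass
class ConversationTurn(TypedMemory):
    retention = timedelta(days=7)      # hypothetical window


@dataclass
class LongLivedSummary(TypedMemory):
    retention = timedelta(days=180)    # hypothetical window


@dataclass
class ShortLivedSecret(TypedMemory):
    single_use = True                  # dies on first use
```

Because the engine only ever asks the type whether an item has expired, an agent that forgets to clean up after itself cannot extend a memory's lifetime.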

3. Privacy now has a signature, not a promise

Every write, edit, and forget in the store is signed with a user-controlled key, content-hashed for tamper-evidence, and chained to the previous event. Modifying the log afterward breaks the signature and is detectable. A redact path lets the user expunge specific memories while keeping the chain verifiable for everything that remains. The practical effect: the audit record can be handed to a regulator, an internal auditor, or the user themselves, and the integrity of what the agent claims to remember is verifiable end to end. This is privacy enforced by architecture rather than promised in a settings screen, and unlike cloud-hosted memory systems that require data to leave the device by design, none of it has to.
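The chain-plus-redaction idea can be illustrated with the standard library. This is a sketch under stated assumptions, not OMEGA's log format: it uses HMAC-SHA256 with a shared secret as a stand-in for whatever signature scheme the product uses, and the event layout and function names are hypothetical. The property it shows is that the signature covers the content hash rather than the content, so redacting the content leaves the chain verifiable.

```python
import hashlib
import hmac
import json


def _digest(text: str) -> str:
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def _sign(key: bytes, header: dict) -> str:
    payload = json.dumps(header, sort_keys=True).encode("utf-8")
    return hmac.new(key, payload, hashlib.sha256).hexdigest()


def append_event(log: list, key: bytes, op: str, content: str) -> dict:
    """Append a write/edit/forget event chained to the previous signature."""
    header = {
        "op": op,
        "content_hash": _digest(content),
        "prev": log[-1]["sig"] if log else "",
    }
    event = {**header, "sig": _sign(key, header), "content": content}
    log.append(event)
    return event


def redact(log: list, index: int) -> None:
    """Expunge the content but keep its hash, so the chain still verifies."""
    log[index]["content"] = None


def verify(log: list, key: bytes) -> bool:
    prev_sig = ""
    for event in log:
        header = {k: event[k] for k in ("op", "content_hash", "prev")}
        if event["prev"] != prev_sig:
            return False
        if not hmac.compare_digest(_sign(key, header), event["sig"]):
            return False
        # Redacted events have no content left to check against the hash.
        if event["content"] is not None and _digest(event["content"]) != event["content_hash"]:
            return False
        prev_sig = event["sig"]
    return True
```

A production design would more likely use an asymmetric signature so that an auditor can verify the log without holding the signing key.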

4. Memory follows the user across devices

Two paired OMEGA instances now exchange typed manifests over local transport. Each side declares what it shares, the peer pulls only the slice it needs, and the canonical memory lives wherever the user wants it. Other devices read on demand instead of mirroring. No cloud broker, no vendor sync account, any peer revocable at any time. For anyone whose work crosses a phone, a tablet, a watch, and a laptop, this is the difference between an assistant tied to one screen and an assistant that remembers what the user was just doing on a different device.
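As a toy model of the declare-then-pull exchange, assuming nothing about the real wire format or transport: each peer publishes a manifest of the memory types it is willing to share, the requester pulls only the types it asks for, and a revoked peer sees an empty manifest. The `Peer` class and its fields are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Peer:
    name: str
    memories: dict = field(default_factory=dict)      # memory type -> list of items
    shared_types: set = field(default_factory=set)    # what this side declares it shares
    revoked: set = field(default_factory=set)         # peer names no longer allowed

    def manifest(self, requester: str) -> dict:
        """Declare shareable types and their counts; empty for revoked peers."""
        if requester in self.revoked:
            return {}
        return {t: len(self.memories.get(t, [])) for t in sorted(self.shared_types)}

    def pull(self, requester: str, wanted: set) -> dict:
        """Return only the requested slice, limited to what the manifest offers."""
        offered = set(self.manifest(requester))
        return {
            t: list(self.memories[t])
            for t in sorted(wanted & offered)
            if t in self.memories
        }
```

Here the laptop could hold the canonical store while a phone reads calendar context on demand, without the laptop's secrets ever appearing in the manifest.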

5. Agents reason against causal structure

The knowledge graph ships with typed edges that carry temporal context. The schema includes causedBy for inferred causal relationships, derivedFrom for provenance chains, and a closed enum of edge types that the engine validates on write so the graph can't drift into an undefined state. A new omega_graph MCP tool exposes the graph directly to agents, with as_of queries that ask the graph what it knew at a given point in time. The result is the difference between a memory store that records facts and one that records how those facts relate over time, which is what an agent needs to answer a follow-up question without re-asking the user.
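A minimal sketch of a closed edge enum with point-in-time queries, assuming an in-memory list of edges. Only the edge names causedBy and derivedFrom and the as_of concept come from the text; the real enum presumably has more members, and the `Graph` class, its methods, and the `recorded_at` field are hypothetical and do not describe the omega_graph tool's interface.

```python
from dataclasses import dataclass
from datetime import datetime
from enum import Enum


class EdgeType(Enum):
    CAUSED_BY = "causedBy"
    DERIVED_FROM = "derivedFrom"


@dataclass(frozen=True)
class Edge:
    source: str
    target: str
    type: EdgeType
    recorded_at: datetime


class Graph:
    def __init__(self):
        self.edges = []

    def add_edge(self, source: str, target: str, edge_type: str, recorded_at: datetime) -> Edge:
        # EdgeType(...) raises ValueError for anything outside the closed enum,
        # so an undefined edge type is rejected at write time.
        edge = Edge(source, target, EdgeType(edge_type), recorded_at)
        self.edges.append(edge)
        return edge

    def as_of(self, when: datetime) -> list:
        """Edges the graph already knew about at `when`."""
        return [e for e in self.edges if e.recorded_at <= when]
```

Validating on write is what keeps the query side simple: a reader of the graph never has to handle an edge type it has not seen before.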

6. Multi-agent workflows are observable in real time

The admin coordination view became an interactive canvas this quarter. Subagents stack visually under their parent. A mode pill on each node shows the current OMEGA_MODE profile. Communication between agents (claims, messages, deadlock chains) renders as live edges that update as the topology changes. Nodes can be dragged, and the layout persists across data refreshes so the operator's mental map of who is doing what doesn't reset every time the underlying coordination state changes. For any team running more than one agent in parallel, this is the operator-side answer to “what is each agent actually doing right now.”

7. Memory quality compounds while the engine is idle

Scheduled analyzers walk the store during idle time, propose mutations (deduplication, supersession, link inference, contradiction surfacing), and either queue them for review or auto-apply them according to per-store policy. Recall quality drifts up, rather than down, the longer the engine has been running. Operations are now revertible through a new MCP tool. Hard-deleted nodes can be restored. Dreams that produce zero operations auto-discard so the review queue stays signal-only. The whole pipeline runs on an hourly cadence with an explicit auto-apply tier for high-confidence mutations. The store maintains itself between sessions, and any auto-applied change can be reverted to the prior state.
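The triage logic of a dream cycle can be sketched against a plain dictionary store. This is an illustration under assumptions: the 0.9 confidence threshold, the status strings, and every class and function name are hypothetical. It shows the three outcomes the text describes (zero-op discard, auto-apply, queue for review) and how recording prior values makes an applied cycle revertible, including restoring deleted entries.

```python
from dataclasses import dataclass, field

_MISSING = object()  # marks keys that did not exist before the cycle


@dataclass
class Operation:
    kind: str          # e.g. "dedupe", "supersede", "link", "contradiction"
    key: str
    new_value: object  # None means "delete this key"
    confidence: float


@dataclass
class DreamCycle:
    operations: list
    status: str = "pending"
    undo: dict = field(default_factory=dict)  # key -> value before the cycle


def apply_cycle(store: dict, cycle: DreamCycle) -> None:
    for op in cycle.operations:
        # Remember the first-seen prior value so revert restores the original.
        cycle.undo.setdefault(op.key, store.get(op.key, _MISSING))
        if op.new_value is None:
            store.pop(op.key, None)
        else:
            store[op.key] = op.new_value
    cycle.status = "applied"


def revert_cycle(store: dict, cycle: DreamCycle) -> None:
    for key, old in cycle.undo.items():
        if old is _MISSING:
            store.pop(key, None)
        else:
            store[key] = old
    cycle.status = "reverted"


def run_cycle(store: dict, cycle: DreamCycle, auto_apply_at: float = 0.9) -> DreamCycle:
    if not cycle.operations:
        cycle.status = "discarded"            # zero-op dreams never reach review
    elif all(op.confidence >= auto_apply_at for op in cycle.operations):
        apply_cycle(store, cycle)             # high-confidence tier
    else:
        cycle.status = "for_review"
    return cycle
```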

OMEGA Dreams admin view: 29 dream cycles total across status filters (For Review 6, Pending 9, Applied 6, Discarded 14, Reverted 0). Central Dream Orb visualization renders the memory neighborhood of one pending cycle proposing 10 operations on 40 touched memories. Right sidebar shows the selected memory, proposed impact, and dream metadata including analyzer and store.

Live Dreams admin: 29 cycles total · 6 applied · 14 auto-discarded as zero-op · 9 pending review

8. The agent knows which project it's in

OMEGA Pro now binds memory to project scope. A new omega project init CLI seeds project-scoped entity tags and status at the repo root. A code-index event with a post-commit hook keeps the agent's view of the codebase up to date without explicit instruction. The cwd of the running agent scopes which memories surface first, so a coding agent in one repo doesn't get answers polluted by context from a different repo on the same machine. For anyone who runs multiple projects from the same workstation, cross-project bleed-through stops without losing the shared memory graph underneath.
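One way to picture cwd-based scoping, as a sketch only: resolve the working directory to the deepest registered project root, then sort memories so those tagged with that project come first while the rest of the shared graph stays available. The `project` tag, the function names, and the use of POSIX-style paths are assumptions.

```python
from pathlib import PurePosixPath
from typing import Optional


def project_for(cwd: str, project_roots: list) -> Optional[str]:
    """Return the deepest registered root that contains cwd, if any."""
    path = PurePosixPath(cwd)
    matches = [
        root for root in project_roots
        if path == PurePosixPath(root) or PurePosixPath(root) in path.parents
    ]
    return max(matches, key=len, default=None)


def rank(memories: list, cwd: str, project_roots: list) -> list:
    """Stable sort: memories tagged with the current project surface first."""
    current = project_for(cwd, project_roots)
    # False sorts before True, so matching memories move to the front
    # and everything else keeps its original relative order.
    return sorted(memories, key=lambda m: m.get("project") != current)
```

Ranking rather than filtering is the design choice that matters here: memories from other projects drop down the list instead of disappearing, so the shared graph underneath is not lost.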

9. The geometry of memory is visible

The admin now ships a 3D visualization of the memory store with a UMAP semantic layout, so the graph clusters by meaning rather than by insertion order. Pro stores render unbounded. The open core caps at 2,000 nodes for performance. Cameras, edge-type filters, an as_of scrubber, and a tier-aware bloom pass ship in the same view. Seeing what the agent knows, and the shape of how those memories cluster, surfaces bias, contradictions, and gaps a query-only interface hides.

OMEGA Memory Graph admin view: 4,087 memories rendered in 3D with UMAP semantic layout, color-coded by type (Decision, Lesson, Constraint, Checkpoint, Preference, User Fact, Observation, Behavioral, Error/Anti-Pattern, Advisor Insight, Research Report, Narrative Beat, Handoff, Session Summary)

Live admin view, 4,087 memories, UMAP semantic layout

The numbers behind the work

Two waves of shipping moved accuracy and held footprint. Footprint is the one we obsess over most, because a memory system that gets smarter only by getting larger eventually stops being useful on a constrained device.

Metric	Today	Source
LongMemEval accuracy	95.4% (#1)	Public leaderboard, ICLR 2025
Local package, ARM64 INT8	100 to 120 MB	Rust core + ONNX embeddings
Recall latency, NPU	< 5 ms	bge-small-en-v1.5, 384-dim
Recall latency, CPU fallback	< 20 ms	No NPU required to ship
Scale with no degradation	Tens of thousands validated	1M memories projected under 2 GB
Encryption at rest	AES-256-GCM, default on	Per-store keys, user-controlled
Model coverage	Claude, GPT, Gemini, Llama, local	Vendor-neutral by architecture
Open core downloads	29.6K	PyPI, Apache-2.0 (omega-memory)
Open source	Apache-2.0	github.com/omega-memory/omega-memory
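As a back-of-envelope check on the 1M-memory projection, the raw size of the embedding vectors alone can be computed from the 384-dim figure in the table. The table does not say what precision stored vectors use or what else the 2 GB projection includes, so the float32 and int8 cases below are assumptions, and text, metadata, and index overhead are ignored.

```python
DIM = 384          # from the table: bge-small-en-v1.5, 384-dim
N = 1_000_000      # the projected store size


def embedding_bytes(n: int, dim: int, bytes_per_value: int) -> int:
    """Raw vector storage only; no text, metadata, or index overhead."""
    return n * dim * bytes_per_value


fp32_bytes = embedding_bytes(N, DIM, 4)   # 1,536,000,000 bytes, about 1.54 GB
int8_bytes = embedding_bytes(N, DIM, 1)   # 384,000,000 bytes, about 0.38 GB
```

Under the float32 assumption the vectors take roughly 1.54 GB, which would leave a little under 0.5 GB of a 2 GB budget for everything else. With int8 vectors the same store would need about 0.38 GB for embeddings.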

How we know it holds up

The accuracy number is a public result on a third-party benchmark (LongMemEval, ICLR 2025), run by an independent group with a leaderboard anyone can reproduce or contest. The footprint and latency numbers come from running OMEGA against the same query shapes other systems publish, on hardware any reviewer can buy. The open-source core is Apache-2.0 on GitHub with a deep test suite (2,500+ passing tests, zero ruff warnings), so the engine that produced those results is the engine a reader can audit.

We also run on our own engine. As of this writing the store backing day-to-day OMEGA engineering work holds 4,087 memories across 26 types, with 2,017 of them written in the last 30 days alone, and shows no measurable recall degradation as it scales. We exercise the multi-agent coordination primitives daily, every time two or more agents on the team write to the store at once. The engine has been integrated with Claude Code, Cursor, Codex, and Obsidian, so the API surface has been pressure-tested against four distinct agent shapes before anyone else touches it.

Where this runs

All nine capabilities ship in OMEGA Pro v1.5.1 today on macOS, Linux, and Windows. The open-source core stays on PyPI under Apache-2.0. The port to constrained environments is active: feasibility study published, Rust core and ONNX path ARM64-ready, NPU acceleration path (QNN / SNPE) documented. If you're building agents and want to talk about the layer underneath them, reach out.
