QMD vs OMEGA:
Why Search Is Not Memory.
QMD is the best local search engine for your markdown files. It can cut token usage by 95% compared with injecting entire files into context. But finding the right document is only the first step.

Tobi Lütke built QMD to solve a real problem: AI agents waste thousands of tokens injecting entire files into context when they only need a few relevant paragraphs. His solution is a local hybrid search engine that combines BM25 keyword matching, vector semantic search, and LLM reranking. It runs entirely on your machine, costs nothing after the initial model download, and returns results in under 100ms.
I built OMEGA, so I have a stake in this comparison. I will be transparent about where QMD excels and where the two tools solve fundamentally different problems. The short version: QMD is a search engine. OMEGA is a memory system. They overlap on retrieval, but memory requires capabilities that search cannot provide.
The Memory Ladder
Most conversations about "AI memory" conflate three distinct capabilities. I think of them as rungs on a ladder. Each one builds on the previous, and each one solves a different category of problem.
Rung 1: Context Injection
Dump the entire file into the prompt. No infrastructure needed. Works until you hit context limits or your token bill explodes.
Rung 2: Smart Retrieval
Hybrid search finds the right chunks instead of injecting everything. BM25 for keywords, vectors for meaning, LLM reranking for precision. QMD does this and does it well.
Rung 3: True Memory
Goes beyond finding documents. Stores knowledge, tracks how it changes over time, detects contradictions, maps relationships between entities, and surfaces relevant context you did not ask for.
QMD is a genuine leap from Rung 1 to Rung 2. Its seven-stage retrieval pipeline is well-engineered: query expansion produces two LLM-generated variations of your search, FTS and vector indexes run in parallel, Reciprocal Rank Fusion merges the results, and a local Qwen3 reranker assigns final relevance scores. The markdown-aware chunking preserves semantic units like code blocks and headings. All of this runs locally with no API costs.
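The fusion step is worth seeing concretely. The sketch below is a generic Reciprocal Rank Fusion implementation, not QMD's actual code; the damping constant k = 60 is the conventional value from the original RRF paper, and the file names are made up:

```python
def rrf_merge(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Merge ranked result lists (e.g. one from BM25, one from the vector
    index). Each document earns 1/(k + rank) per list it appears in; k
    damps the advantage of a single top rank, so documents that place
    well in BOTH lists float to the top."""
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

bm25_hits = ["notes.md", "api.md", "todo.md"]        # keyword ranking
vector_hits = ["api.md", "design.md", "notes.md"]    # semantic ranking
fused = rrf_merge([bm25_hits, vector_hits])
# "api.md" wins: it ranks high in both lists, even though neither
# ranking put it first on its own.
```

The appeal of RRF is that it needs no score normalization: it fuses on ranks alone, so BM25 scores and cosine similarities never have to be made comparable.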
But Rung 2 is the ceiling, not the destination. And most developers building with AI agents will hit that ceiling faster than they expect.
Five Things Search Cannot Do
These are not edge cases. They are routine situations that any long-running AI agent encounters within its first week of use.
1. Detect contradictions. You told your agent in January that you prefer TypeScript for new projects. In March, you switched to Rust. A search engine returns both documents. It has no way to know that the March preference supersedes the January one. OMEGA detects contradictions at store time using cross-encoder models. When a new memory conflicts with an existing one, the old memory is automatically superseded and the new one takes priority.
2. Track how decisions evolve. Your team decided to use Redis for caching in Week 1. In Week 3, you switched to Valkey. In Week 5, you dropped the cache entirely. Search returns whichever document scores highest for the query "caching strategy." It might be the Week 1 decision. It might be Week 5. There is no way to know. OMEGA maintains a temporal chain where each decision links to its predecessor, so queries return the most recent state by default.
3. Map relationships between entities. Your agent knows you work on Project X. It also knows Project X uses PostgreSQL. But it cannot infer that you likely have PostgreSQL experience unless you explicitly told it. OMEGA extracts entities automatically and builds a relationship graph. When you mention a project, OMEGA can surface related technologies, people, and decisions without a direct query.
4. Reason about time. "What did we decide about the auth system last week?" is a temporal query. It requires knowing when memories were created, not just what they contain. QMD indexes document content but not temporal metadata. OMEGA uses a bi-temporal model that tracks both when something happened and when it was recorded, enabling queries scoped to any time range.
5. Surface memories you did not search for. This is the one that surprises people most. You open a file to fix a bug. Before you type a query, OMEGA notices that three weeks ago your agent stored a debugging insight about that exact file. The insight appears in your context automatically. OMEGA hooks into session start, file edit, and planning events to inject relevant memories proactively. Search only answers questions you think to ask. Memory anticipates what you need.
The core distinction
QMD answers "which document matches this query?"
OMEGA answers a different question entirely: "what does my agent know, how has that knowledge changed over the past month, and what context should it have right now even though nobody asked?" That gap is the difference between a search engine and a memory system.
Head-to-Head: 15 Dimensions
QMD data is sourced from the QMD README. OMEGA data is from our documentation and published benchmarks.
Where QMD Excels
QMD's retrieval pipeline is more sophisticated than OMEGA's search layer in several ways. The seven-stage hybrid pipeline with query expansion, parallel dual-index retrieval, Reciprocal Rank Fusion, and position-aware LLM reranking is a production-grade retrieval system. OMEGA's search combines FTS5 and vector similarity with cross-encoder reranking, but it does not do query expansion or multi-query fusion.
QMD's markdown-aware chunking is smart. It scores potential split locations (headings, code block boundaries, list items) and picks the highest-scoring break within a window. Code blocks are kept whole. This preserves semantic coherence in ways that naive character-count splitting cannot.
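A simplified version of that scoring idea looks like this. The weights and window are illustrative, not QMD's actual values; the point is that split candidates are scored by structural type and fenced code is never a candidate:

```python
def best_split(lines: list[str], lo: int, hi: int) -> int:
    """Pick the best line index to split at within [lo, hi): headings
    beat code-fence ends, which beat blank lines. Positions inside a
    fenced code block are never considered, so code blocks stay whole."""
    candidates: list[tuple[int, int]] = []
    in_code = False
    for i, line in enumerate(lines):
        if line.startswith("```"):
            if in_code:                      # closing fence: allow a split
                in_code = False              # right after the block
                if lo <= i + 1 < hi:
                    candidates.append((2, i + 1))
            else:
                in_code = True
            continue
        if in_code or not (lo <= i < hi):
            continue
        if line.startswith("#"):
            candidates.append((3, i))        # split just before a heading
        elif line.strip() == "":
            candidates.append((1, i))        # blank line: last resort
    return max(candidates)[1] if candidates else hi

doc = ["Intro text.", "", "# Setup", "steps", "```", "code", "```", "after"]
cut = best_split(doc, 1, 7)   # picks the heading over the blank line
```

Compare this with naive character-count splitting, which would happily cut mid-sentence or mid-function; scoring structural boundaries is what keeps each chunk a coherent semantic unit.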
The model stack is fully local and the total download is about 1.9GB: EmbeddingGemma-300M for embeddings, Qwen3-Reranker-0.6B for reranking, and a fine-tuned 1.7B model for query expansion. After the first download, there are zero ongoing costs. If your primary need is finding relevant sections of markdown documentation quickly and cheaply, QMD is hard to beat.
When to Use What
QMD if...
- ✓ You need fast local search over markdown docs, meeting notes, or knowledge bases
- ✓ Your primary problem is token cost from injecting full files into context
- ✓ You want zero ongoing costs after the initial 1.9GB model download
- ✓ You use OpenClaw and want a drop-in memory backend upgrade
OMEGA if...
- ✓ Your agent needs to learn from interactions and track evolving decisions across sessions
- ✓ You need contradiction detection so old facts do not poison new responses
- ✓ You work across multiple projects and want cross-project recall
- ✓ You run multi-agent workflows that need coordination (file claims, task queues, messaging)
- ✓ You want benchmark-proven accuracy (95.4% on LongMemEval, #1 overall)
- ✓ You use Claude Code, Cursor, Windsurf, or any MCP client
The two tools are not competing for the same job. QMD is a retrieval layer. OMEGA is a memory layer. You could use QMD to search your project documentation and OMEGA to store what your agent learns from working on that project. They solve different problems at different rungs of the ladder.
Get started
Two commands. Zero cloud. Full memory.
Works with Claude Code, Cursor, Windsurf, Zed, and any MCP client. Local-first. No API keys. No cloud. Full quickstart guide.