Building Karpathy's LLM Knowledge Base with OMEGA and Obsidian
Andrej Karpathy recently proposed an LLM-maintained personal knowledge base with Obsidian as the frontend. The idea is compelling: an LLM reads your raw notes, compiles them into a structured wiki, and maintains it over time. But the pattern has a scaling problem. Retrieval depends on the LLM reading an index.md file, which breaks once your knowledge base grows past a few hundred pages. OMEGA fills this gap with semantic search, persistent memory, and contradiction detection, all running locally inside Obsidian.
What Karpathy Proposed
On April 3, 2026, Karpathy published a gist describing what he called an "idea file" for LLM-maintained personal knowledge bases. The architecture has three layers: raw sources, LLM-compiled wiki pages, and a schema (including index.md) that defines the wiki's structure.
The frontend is Obsidian, which makes the wiki browsable and editable by humans. The LLM acts as a compiler: it takes unstructured input and produces structured output. Karpathy envisions Claude Code or a similar agent as that compiler, running periodically to update the wiki as new raw material arrives.
It is a clean design. The separation between raw material, compiled knowledge, and schema gives both humans and LLMs clear roles. But there is a bottleneck hiding in the query step.
The Scaling Gap
When you ask the knowledge base a question, the LLM needs to find the relevant wiki pages. In Karpathy's pattern, it does this by reading index.md, which lists every page and a short description. The LLM scans the index, identifies which pages are relevant, then reads those pages.
At 50 pages, this works well. At 100 pages, the index is getting long but still fits in context. At 1,000 pages, the index alone might be 50,000+ tokens. At 10,000 pages, you have blown past every model's context window just on the table of contents.
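The context-budget arithmetic is easy to sketch. Assuming a rough figure of ~50 tokens per index entry (title plus a one-line description; the real number depends on how verbose the descriptions are), the index alone grows linearly with page count:

```python
# Rough context-budget arithmetic for index.md-based retrieval.
# Assumption: ~50 tokens per index entry (title + one-line description).
TOKENS_PER_ENTRY = 50

def index_tokens(pages: int) -> int:
    """Estimated token cost of reading the full index."""
    return pages * TOKENS_PER_ENTRY

for pages in (50, 100, 1_000, 10_000):
    print(f"{pages:>6} pages -> ~{index_tokens(pages):,} index tokens")
```

At 10,000 pages the index costs roughly half a million tokens before the LLM has read a single wiki page, which is the scaling wall described above.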
Karpathy acknowledges this. He mentions bolting on qmd (a semantic search tool) as a separate step. But that introduces a second system with its own index, its own query interface, and no integration with the wiki's structure. The LLM has to coordinate between the index, the search tool, and the wiki pages manually.
There are two deeper issues beyond raw scale. First, the LLM starts from zero every session. It has no memory of which pages it compiled last time, what questions were asked before, or what contradictions it found. Second, there is no mechanism for detecting when two wiki pages contain conflicting information. As the wiki grows, inconsistencies accumulate silently.
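To make the contradiction problem concrete, here is one naive heuristic for surfacing candidate conflicts. This is an illustrative sketch only, not OMEGA's actual detection algorithm: it flags page pairs whose embeddings are near-duplicates, on the theory that pages covering the same ground are where conflicting statements tend to hide. The toy embeddings stand in for real model output.

```python
# Illustrative heuristic only -- NOT OMEGA's actual contradiction detector.
# Flags page pairs whose embeddings are highly similar; such near-duplicate
# pages are candidates for a human (or LLM) consistency review.
from itertools import combinations
import math

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def conflict_candidates(pages: dict[str, list[float]], threshold: float = 0.9):
    """Return page pairs similar enough to warrant a consistency check."""
    return [
        (a, b)
        for (a, ea), (b, eb) in combinations(pages.items(), 2)
        if cosine(ea, eb) >= threshold
    ]

# Toy embeddings; real ones come from an embedding model.
pages = {
    "transformers.md":         [0.90, 0.10, 0.00],
    "attention-mechanisms.md": [0.88, 0.15, 0.02],
    "gardening.md":            [0.00, 0.20, 0.95],
}
print(conflict_candidates(pages))  # only the transformers/attention pair
```

A real detector would go further and compare the claims inside each flagged pair, but even this crude pass turns silent drift into a reviewable list.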
How OMEGA Fills the Gap
OMEGA is a persistent memory engine that runs locally. It replaces the "read index.md" step with semantic search, and it gives the LLM session memory so it does not start from zero each time. Setting it up takes four steps.
Quick Setup: OMEGA + Obsidian + Claude Code
Step 1: Install OMEGA
$ pip install omega-memory
$ omega setup
This installs the OMEGA package from PyPI and initializes the local SQLite database at ~/.omega/. The setup includes the ONNX embedding model (~33MB), the semantic search engine, and all MCP tools. Python 3.11+ is required.
Step 2: Install the Obsidian Plugin
Install the OMEGA Memory Obsidian plugin via BRAT. Open Obsidian Settings, go to Community Plugins, install BRAT, then add the repository:
omega-memory/omega-obsidian-plugin
Once installed, the plugin gives you a semantic search command palette action and a sidebar panel for browsing OMEGA memories. Every page in your vault becomes searchable by meaning, not just filename or tag.
Step 3: Create the Vault Structure
Follow Karpathy's three-layer pattern in your Obsidian vault:
my-knowledge-base/
├── raw/ # Bookmarks, highlights, PDFs, voice notes
│ ├── articles/
│ ├── highlights/
│ └── notes/
├── wiki/ # LLM-compiled pages (structured knowledge)
│ ├── transformers.md
│ ├── attention-mechanisms.md
│ └── ...
└── schema/
    └── index.md # Optional: OMEGA replaces this for retrieval

The raw/ folder holds everything you collect. The wiki/ folder holds LLM-compiled pages. You can still keep index.md for human navigation, but OMEGA handles retrieval independently via semantic search.
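The tree above can be scaffolded in a few lines. The directory names follow the three-layer pattern described earlier; the vault root name is just an example.

```python
# Scaffold the three-layer vault structure (example root name).
from pathlib import Path

def scaffold_vault(root: str) -> None:
    base = Path(root)
    for sub in (
        "raw/articles", "raw/highlights", "raw/notes",  # layer 1: raw sources
        "wiki",                                         # layer 2: compiled pages
        "schema",                                       # layer 3: structure
    ):
        (base / sub).mkdir(parents=True, exist_ok=True)
    # index.md stays optional: OMEGA handles retrieval without it.
    (base / "schema" / "index.md").touch()

scaffold_vault("my-knowledge-base")
```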
Step 4: Use Claude Code as the Compiler
Add OMEGA as an MCP server in your Claude Code configuration:
{
"mcpServers": {
"omega": {
"command": "python3",
"args": ["-m", "omega", "serve"]
}
}
}

Now point Claude Code at your vault. It reads raw sources, compiles wiki pages, and OMEGA indexes everything automatically. When you ask a question, OMEGA retrieves the relevant pages by semantic similarity instead of scanning an index file. The LLM reads only what it needs.
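The retrieval step itself amounts to ranking pages by embedding similarity. The sketch below shows the idea with toy two-dimensional vectors; in the real system, OMEGA computes embeddings locally with bge-small-en-v1.5 and stores them in SQLite, and this is an illustration of the ranking logic, not OMEGA's API.

```python
# Sketch of semantic retrieval: rank pages by cosine similarity to a query.
# Toy 2-D embeddings; a real embedding model produces high-dimensional ones.
import math

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def top_k(query_vec: list[float], page_vecs: dict, k: int = 3) -> list[str]:
    """Return the k pages most similar to the query, best match first."""
    ranked = sorted(page_vecs, key=lambda p: cosine(query_vec, page_vecs[p]),
                    reverse=True)
    return ranked[:k]

pages = {
    "transformers.md":         [1.0, 0.1],
    "attention-mechanisms.md": [0.6, 0.5],
    "gardening.md":            [0.1, 0.9],
}
print(top_k([0.9, 0.2], pages, k=2))
# -> ['transformers.md', 'attention-mechanisms.md']
```

Because ranking works on meaning rather than on an index file, the cost of finding relevant pages no longer grows with the size of the table of contents.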
Why This Is Not RAG
Standard RAG (retrieval-augmented generation) is stateless. You chunk documents, embed them, and retrieve relevant chunks at query time. Every query starts fresh. Nothing accumulates.
Karpathy's pattern is fundamentally different because it has a compile step. The LLM does not just retrieve raw text. It synthesizes new artifacts: wiki pages that represent the LLM's understanding of a topic. These artifacts are permanent. They get updated, refined, and linked over time.
OMEGA aligns with this philosophy. It is not a per-query retrieval system. It is persistent memory. Knowledge accumulates across sessions. The LLM remembers which pages it compiled, what contradictions it found, what the user asked about most. Each session builds on the last.
The combination is powerful: Karpathy's compile step creates permanent artifacts in the wiki. OMEGA's persistent memory creates permanent context for the compiler. RAG gives you search. This gives you a knowledge base that grows smarter over time.
| Capability | Karpathy (index.md) | Karpathy + OMEGA |
|---|---|---|
| Retrieval method | LLM reads index.md | Semantic search (95.4% accuracy) |
| Scale limit | ~100-200 pages | 10,000+ pages |
| Session memory | None (starts fresh) | Persistent across sessions |
| Contradiction detection | Manual | Automatic |
| Obsidian integration | File browsing only | Semantic search + browsing |
| Dependencies | LLM + Obsidian | LLM + Obsidian + OMEGA (local) |
Frequently Asked Questions
What is Karpathy's LLM knowledge base pattern?
Andrej Karpathy proposed a system where an LLM maintains a personal knowledge base in Obsidian. Raw sources go into a folder, the LLM compiles them into structured wiki pages, and an index.md maps the structure. Humans browse in Obsidian, the LLM handles compilation and retrieval.
Why does the pattern break at scale?
Retrieval depends on the LLM reading index.md to find relevant pages. At around 100 pages, this works. At 1,000+ pages, the index overflows the context window. There is no semantic search built in, so the LLM cannot find relevant pages without reading the entire index.
How does OMEGA solve the retrieval problem?
OMEGA replaces the index.md lookup with semantic search. Every page is embedded locally using bge-small-en-v1.5 (ONNX). When you ask a question, OMEGA returns the most relevant pages by meaning, not by scanning a file. This scales to tens of thousands of pages.
Do I need an API key or cloud account?
No. OMEGA runs entirely locally using SQLite for storage and ONNX for embeddings. The Obsidian plugin connects to the local OMEGA instance. Everything stays on your machine.
Can I use this with Claude Code as the compiler?
Yes. Add OMEGA as an MCP server in your Claude Code configuration. Claude Code reads your raw sources, compiles wiki pages, and OMEGA handles indexing and retrieval. The same memory persists across Claude Code sessions, so the compiler remembers its prior work.
OMEGA is free, local-first, and Apache 2.0 licensed.