
Building Karpathy's LLM Knowledge Base with OMEGA and Obsidian

OMEGA team · 8 min read

Andrej Karpathy recently proposed an LLM-maintained personal knowledge base with Obsidian as the frontend. The idea is compelling: an LLM reads your raw notes, compiles them into a structured wiki, and maintains it over time. But the pattern has a scaling problem. Retrieval depends on the LLM reading an index.md file, which breaks once your knowledge base grows past a few hundred pages. OMEGA fills this gap with semantic search, persistent memory, and contradiction detection, all running locally inside Obsidian.

What Karpathy Proposed

On April 3, 2026, Karpathy published a gist describing what he called an "idea file" for LLM-maintained personal knowledge bases. The architecture has three layers:

Raw Sources
Bookmarks, highlights, voice memos, notes, PDFs. Everything you collect, dumped into a raw/ folder. No organization required.
Compiled Wiki
The LLM reads raw material and synthesizes it into structured wiki pages. Each page covers one topic, written in the LLM's own words, with citations back to sources.
Schema
An index.md file that maps the structure. The LLM reads this to understand what pages exist, then navigates to relevant ones when you ask a question.

The frontend is Obsidian, which makes the wiki browsable and editable by humans. The LLM acts as a compiler: it takes unstructured input and produces structured output. Karpathy envisions using Claude Code or a similar agent as that compiler, running periodically to update the wiki as new raw material arrives.

It is a clean design. The separation between raw material, compiled knowledge, and schema gives both humans and LLMs clear roles. But there is a bottleneck hiding in the query step.

The Scaling Gap

When you ask the knowledge base a question, the LLM needs to find the relevant wiki pages. In Karpathy's pattern, it does this by reading index.md, which lists every page and a short description. The LLM scans the index, identifies which pages are relevant, then reads those pages.

At 50 pages, this works well. At 100 pages, the index is getting long but still fits in context. At 1,000 pages, the index alone might be 50,000+ tokens. At 10,000 pages, you have blown past every model's context window just on the table of contents.
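The arithmetic behind those numbers is worth making explicit. A minimal sketch, assuming roughly 50 tokens per index entry (title plus a one-line description — the exact figure varies, but the trend does not) and a 200K-token context window:

```python
# Back-of-envelope: how fast does index.md outgrow a context window?
# TOKENS_PER_ENTRY is an assumption (~50 tokens for a title plus a
# one-line description); the growth is linear either way.
TOKENS_PER_ENTRY = 50
CONTEXT_WINDOW = 200_000  # e.g. a 200K-token model

def index_tokens(pages: int) -> int:
    """Estimated token cost of an index.md listing every page."""
    return pages * TOKENS_PER_ENTRY

for pages in (50, 100, 1_000, 10_000):
    used = index_tokens(pages)
    print(f"{pages:>6} pages -> ~{used:>7,} tokens "
          f"({used / CONTEXT_WINDOW:.0%} of context)")
```

At 10,000 pages the index alone would cost around 500,000 tokens — more than double the window, before the LLM has read a single wiki page.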

Karpathy acknowledges this. He mentions bolting on qmd (a semantic search tool) as a separate step. But that introduces a second system with its own index, its own query interface, and no integration with the wiki's structure. The LLM has to coordinate between the index, the search tool, and the wiki pages manually.

There are two deeper issues beyond raw scale. First, the LLM starts from zero every session. It has no memory of which pages it compiled last time, what questions were asked before, or what contradictions it found. Second, there is no mechanism for detecting when two wiki pages contain conflicting information. As the wiki grows, inconsistencies accumulate silently.

How OMEGA Fills the Gap

OMEGA is a persistent memory engine that runs locally. It replaces the "read index.md" step with semantic search, and it gives the LLM session memory so it does not start from zero each time. Here is what changes:

Semantic Search at Scale
OMEGA embeds every memory with bge-small-en-v1.5 (ONNX, local). Instead of scanning an index file, the LLM queries by meaning. "What do I know about transformer architectures?" finds relevant pages even if they never use that exact phrase. It scores 95.4% on the LongMemEval benchmark.
Obsidian Plugin
The OMEGA Obsidian plugin brings semantic search directly into your vault. Search your wiki by meaning, not just keywords. Browse results in Obsidian's native interface, with links back to the original pages.
Persistent Session Memory
OMEGA remembers what the LLM did last session: which pages were compiled, what questions were asked, what decisions were made. The compiler picks up where it left off instead of re-reading everything.
Contradiction Detection
When the LLM compiles a new wiki page that conflicts with an existing one, OMEGA flags the contradiction. No more silently accumulating inconsistencies across hundreds of pages.
Works with Claude Code
Claude Code is the natural choice for Karpathy's "compiler" role. OMEGA runs as its MCP server, giving it persistent memory, semantic search, and coordination tools across sessions.
Knowledge Graph
Memories link to each other with typed relationships: related, supersedes, contradicts. The LLM can trace how a topic evolved over time, not just retrieve the latest version.
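The mechanics of "query by meaning" are simple to illustrate. A minimal sketch: rank pages by cosine similarity between a query embedding and stored page embeddings. OMEGA uses bge-small-en-v1.5 for real embeddings; the toy 3-dimensional vectors and page names below are stand-ins so the ranking logic is visible without the ONNX model.

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Hypothetical page embeddings (a real system stores 384-dim vectors).
pages = {
    "transformers.md": [0.9, 0.1, 0.0],
    "attention-mechanisms.md": [0.6, 0.5, 0.2],
    "sourdough-starters.md": [0.0, 0.1, 0.9],
}
query = [0.85, 0.2, 0.05]  # embedding of "transformer architectures"

ranked = sorted(pages, key=lambda p: cosine(query, pages[p]), reverse=True)
print(ranked)  # ML pages rank first; the baking page ranks last
```

Note that nothing in the query has to match a page's filename or keywords — similarity in embedding space is what drives the ranking.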

Quick Setup: OMEGA + Obsidian + Claude Code

Step 1: Install OMEGA

terminal
$ pip install omega-memory
$ omega setup

This installs the OMEGA package from PyPI and initializes the local SQLite database at ~/.omega/. The setup bundles the ONNX embedding model (~33MB), the semantic search engine, and all MCP tools. Python 3.11+ is required.

Step 2: Install the Obsidian Plugin

Install the OMEGA Memory Obsidian plugin via BRAT. Open Obsidian Settings, go to Community Plugins, install BRAT, then add the repository:

BRAT plugin URL
omega-memory/omega-obsidian-plugin

Once installed, the plugin gives you a semantic search command palette action and a sidebar panel for browsing OMEGA memories. Every page in your vault becomes searchable by meaning, not just filename or tag.

Step 3: Create the Vault Structure

Follow Karpathy's three-layer pattern in your Obsidian vault:

vault structure
my-knowledge-base/
├── raw/              # Bookmarks, highlights, PDFs, voice notes
│   ├── articles/
│   ├── highlights/
│   └── notes/
├── wiki/             # LLM-compiled pages (structured knowledge)
│   ├── transformers.md
│   ├── attention-mechanisms.md
│   └── ...
└── schema/
    └── index.md      # Optional — OMEGA replaces this for retrieval

The raw/ folder holds everything you collect. The wiki/ folder holds LLM-compiled pages. You can still keep index.md for human navigation, but OMEGA handles retrieval independently via semantic search.
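The layout above can be scaffolded with a few lines of Python. A sketch, assuming the folder names shown in the tree (adjust to taste):

```python
from pathlib import Path

def scaffold(vault: Path) -> None:
    """Create Karpathy's three-layer layout inside an Obsidian vault."""
    for folder in ("raw/articles", "raw/highlights", "raw/notes",
                   "wiki", "schema"):
        (vault / folder).mkdir(parents=True, exist_ok=True)
    # index.md stays optional: useful for human browsing, not retrieval.
    index = vault / "schema" / "index.md"
    if not index.exists():
        index.write_text("# Index\n\nHuman-browsable map of the wiki.\n")

scaffold(Path("my-knowledge-base"))
```

Running this is idempotent: `exist_ok=True` and the existence check mean it is safe to re-run on a vault that already has content.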

Step 4: Use Claude Code as the Compiler

Add OMEGA as an MCP server in your Claude Code configuration:

settings.json
{
  "mcpServers": {
    "omega": {
      "command": "python3",
      "args": ["-m", "omega", "serve"]
    }
  }
}

Now point Claude Code at your vault. It reads raw sources, compiles wiki pages, and OMEGA indexes everything automatically. When you ask a question, OMEGA retrieves the relevant pages by semantic similarity instead of scanning an index file. The LLM reads only what it needs.
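The compile loop itself is conceptually small. A hypothetical sketch — `compile_page` stands in for the LLM compiler, and none of the names below are OMEGA's real API:

```python
from pathlib import Path

def compile_page(raw_text: str) -> str:
    """Stand-in for the LLM 'compiler' that synthesizes a wiki page."""
    return f"# Summary\n\n{raw_text[:200]}\n"

def run_compiler(vault: Path) -> list[Path]:
    """Compile every raw note into a wiki page; return pages written."""
    compiled = []
    for src in sorted((vault / "raw").rglob("*.md")):
        page = vault / "wiki" / src.name
        page.write_text(compile_page(src.read_text()))
        compiled.append(page)
        # In the real setup, OMEGA would embed and index each page here,
        # and record the compilation in session memory for the next run.
    return compiled
```

The point of the sketch is the division of labor: the LLM produces the wiki artifacts, while indexing and memory live alongside the loop rather than inside the prompt.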

Why This Is Not RAG

Standard RAG (retrieval-augmented generation) is stateless. You chunk documents, embed them, and retrieve relevant chunks at query time. Every query starts fresh. Nothing accumulates.

Karpathy's pattern is fundamentally different because it has a compile step. The LLM does not just retrieve raw text. It synthesizes new artifacts: wiki pages that represent the LLM's understanding of a topic. These artifacts are permanent. They get updated, refined, and linked over time.

OMEGA aligns with this philosophy. It is not a per-query retrieval system. It is persistent memory. Knowledge accumulates across sessions. The LLM remembers which pages it compiled, what contradictions it found, what the user asked about most. Each session builds on the last.

The combination is powerful: Karpathy's compile step creates permanent artifacts in the wiki. OMEGA's persistent memory creates permanent context for the compiler. RAG gives you search. This gives you a knowledge base that grows smarter over time.

| Capability | Karpathy (index.md) | Karpathy + OMEGA |
| --- | --- | --- |
| Retrieval method | LLM reads index.md | Semantic search (95.4% accuracy) |
| Scale limit | ~100-200 pages | 10,000+ pages |
| Session memory | None (starts fresh) | Persistent across sessions |
| Contradiction detection | Manual | Automatic |
| Obsidian integration | File browsing only | Semantic search + browsing |
| Dependencies | LLM + Obsidian | LLM + Obsidian + OMEGA (local) |
95.4% LongMemEval score · 10K+ pages supported · 0 API keys required · Local: all data stays on device

Frequently Asked Questions

What is Karpathy's LLM knowledge base pattern?

Andrej Karpathy proposed a system where an LLM maintains a personal knowledge base in Obsidian. Raw sources go into a folder, the LLM compiles them into structured wiki pages, and an index.md maps the structure. Humans browse in Obsidian, the LLM handles compilation and retrieval.

Why does the pattern break at scale?

Retrieval depends on the LLM reading index.md to find relevant pages. At around 100 pages, this works. At 1,000+ pages, the index overflows the context window. There is no semantic search built in, so the LLM cannot find relevant pages without reading the entire index.

How does OMEGA solve the retrieval problem?

OMEGA replaces the index.md lookup with semantic search. Every page is embedded locally using bge-small-en-v1.5 (ONNX). When you ask a question, OMEGA returns the most relevant pages by meaning, not by scanning a file. This scales to tens of thousands of pages.

Do I need an API key or cloud account?

No. OMEGA runs entirely locally using SQLite for storage and ONNX for embeddings. The Obsidian plugin connects to the local OMEGA instance. Everything stays on your machine.

Can I use this with Claude Code as the compiler?

Yes. Add OMEGA as an MCP server in your Claude Code configuration. Claude Code reads your raw sources, compiles wiki pages, and OMEGA handles indexing and retrieval. The same memory persists across Claude Code sessions, so the compiler remembers its prior work.

Make Karpathy's pattern scale.
Semantic search, persistent memory, contradiction detection. All local, all inside Obsidian.
pip install omega-memory

OMEGA is free, local-first, and Apache 2.0 licensed.