Andrej Karpathy's LLM Wiki pattern
Karpathy Wiki
A focused guide to the idea of an LLM-maintained wiki: a persistent knowledge base that grows as the model ingests sources, rewrites pages, and compounds useful answers.
Persistent artifact
The wiki is compiled once, then revised as new sources arrive.
Human-directed
People source the material and steer the analysis; the model does the maintenance.
Source-first
Raw documents stay untouched while the wiki becomes the working layer.

The key distinction
RAG retrieves. Karpathy Wiki accumulates.
The important shift is not just better retrieval. It is moving the synthesis work into a maintained artifact that the model keeps improving over time.
| Dimension | Classic RAG | Karpathy Wiki pattern |
|---|---|---|
| Knowledge state | Mostly rediscovered at query time from raw chunks. | Persisted as linked markdown pages that keep evolving. |
| Cross-source synthesis | Repeated on every question. | Written into the wiki once, then refined over time. |
| Contradictions | Easy to miss unless the current prompt surfaces them. | Tracked directly on the relevant pages and revisited during maintenance. |
| Answer reuse | Good chats often disappear into history. | Useful answers can be filed back into the wiki as new pages. |
| Maintenance burden | Humans still need to organize and reconcile the material manually. | The model handles the bookkeeping across many pages in one pass. |

Operating model
Three layers. Three recurring operations.
Karpathy's pattern stays practical by keeping the stack small: immutable sources, a maintained wiki, and a schema file that teaches the agent how to behave.
Raw sources
Articles, notes, images, papers, transcripts, and other source-of-truth material the model reads but does not edit.
The wiki
Linked markdown pages that the model creates, revises, and cross-references as new material arrives.
The schema
An instruction file that defines structure, workflows, naming rules, and maintenance habits for the agent.
Ingest
Read a new source, summarize it, update the relevant pages, and record the change in the log.
Query
Answer from the wiki first, then file valuable answers back into the knowledge base.
Lint
Look for stale claims, weak links, orphan pages, and open research questions that deserve another pass.
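The lint pass above is the most mechanical of the three operations, so it is easy to sketch. Here is a minimal, hypothetical implementation that scans a folder of markdown pages for broken `[[wikilinks]]` and orphan pages; the folder name and link syntax are assumptions, not part of the pattern itself:

```python
import re
from pathlib import Path

# Hypothetical defaults: a flat folder of .md pages using [[Page]] links.
WIKI_DIR = Path("wiki")
LINK_RE = re.compile(r"\[\[([^\]|#]+)")  # capture the page name inside [[...]]

def lint(wiki_dir: Path = WIKI_DIR) -> dict:
    """Report broken links and orphan pages in a folder of markdown files."""
    pages = {p.stem: p for p in wiki_dir.glob("*.md")}
    linked_to: set[str] = set()
    broken: list[tuple[str, str]] = []
    for name, path in pages.items():
        for target in LINK_RE.findall(path.read_text(encoding="utf-8")):
            target = target.strip()
            if target in pages:
                linked_to.add(target)       # someone points at this page
            else:
                broken.append((name, target))  # link to a page that does not exist
    orphans = sorted(set(pages) - linked_to)   # pages nothing links to
    return {"broken_links": broken, "orphans": orphans}
```

Stale claims and open research questions still need the model's judgment, but a structural check like this gives the maintenance pass a concrete worklist.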
Related pages
Read the pattern from four useful angles
Each page reframes the gist for a different search intent so the site works as both a guide and a compact wiki.
Core idea
LLM Wiki Guide
A clear guide to the Karpathy Wiki pattern: what an LLM wiki is, what it owns, and why it compounds better than ad hoc note piles.
Comparison
Persistent Wiki vs RAG
Compare classic RAG with the Karpathy Wiki pattern and see why persistent synthesis changes the economics of long-running knowledge work.
Workflow
Obsidian LLM Wiki Workflow
A practical workflow for running an LLM wiki in Obsidian: raw sources, maintained wiki pages, schema rules, ingest passes, query loops, and lint cycles.
Tooling
LLM Wiki Tooling
The tools that make an LLM wiki durable in practice: markdown editors, schema prompts, link maps, search, and maintenance loops.
FAQ
Common questions about the pattern
What is Karpathy Wiki in practice?
It is a pattern where an LLM maintains a markdown wiki that sits between you and your raw source files. The human curates sources and asks questions; the model writes summaries, updates links, reconciles claims, and keeps the wiki coherent.
How is this different from a normal RAG workflow?
RAG usually retrieves fragments from the raw corpus each time you ask a question. An LLM wiki keeps a persistent synthesis layer, so the cross-links, summaries, and contradictions already exist before the next question arrives.
Why does Obsidian show up so often in LLM wiki discussions?
Obsidian gives you a live markdown workspace, backlinks, graph view, and plugin ecosystem. That makes it a practical place for humans to browse the wiki while the LLM edits the files.
Do I need complex infrastructure to start?
No. Karpathy's pattern starts with folders of markdown files, a schema document that teaches the agent how to maintain them, and lightweight search or indexing only when scale demands it.
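The minimal starting stack really is just folders and one instruction file. A small scaffold sketch makes that concrete; every folder name and schema rule below is illustrative, not prescribed by the pattern:

```python
from pathlib import Path

# Example schema content; real schemas grow out of the maintenance habits you want.
SCHEMA = """\
# Wiki schema
- Pages live in wiki/ as markdown; sources/ is read-only.
- Link pages with [[Page Name]]; every page should link to at least one other.
- On ingest: summarize the new source, update affected pages, append to wiki/LOG.md.
- On lint: flag stale claims, broken links, and orphan pages.
"""

def scaffold(root: Path) -> None:
    """Create the three-layer layout: raw sources, the wiki, and a schema file."""
    (root / "sources").mkdir(parents=True, exist_ok=True)  # immutable inputs
    (root / "wiki").mkdir(exist_ok=True)                   # model-maintained pages
    (root / "wiki" / "LOG.md").touch()                     # running change log
    (root / "SCHEMA.md").write_text(SCHEMA, encoding="utf-8")
```

From here, search or indexing can be layered on later; the agent only needs the schema file and write access to `wiki/`.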
When does this pattern work best?
It works best for domains where you keep accumulating source material over time and want synthesis, cross-references, and reusable answers rather than one-off retrieval.