karpathy/llm-wiki.md

Created April 4, 2026 16:25

LLM Wiki

A pattern for building personal knowledge bases using LLMs.

This is an idea file, designed to be copy-pasted into your own LLM agent (e.g. OpenAI Codex, Claude Code, OpenCode / Pi, etc.). Its goal is to communicate the high-level idea; your agent will build out the specifics in collaboration with you.

The core idea

Most people's experience with LLMs and documents looks like RAG: you upload a collection of files, the LLM retrieves relevant chunks at query time, and generates an answer. This works, but the LLM is rediscovering knowledge from scratch on every question. There's no accumulation. Ask a subtle question that requires synthesizing five documents, and the LLM has to find and piece together the relevant fragments every time. Nothing is built up. NotebookLM, ChatGPT file uploads, and most RAG systems work this way.

The idea here is different. Instead of just retrieving from raw documents at query time, the LLM incrementally builds and maintains a persistent wiki — a structured, interlinked collection of markdown files that sits between you and the raw sources. When you add a new source, the LLM doesn't just index it for later retrieval. It reads it, extracts the key information, and integrates it into the existing wiki — updating entity pages, revising topic summaries, noting where new data contradicts old claims, strengthening or challenging the evolving synthesis. The knowledge is compiled once and then kept current, not re-derived on every query.

This is the key difference: the wiki is a persistent, compounding artifact. The cross-references are already there. The contradictions have already been flagged. The synthesis already reflects everything you've read. The wiki keeps getting richer with every source you add and every question you ask.

You never (or rarely) write the wiki yourself — the LLM writes and maintains all of it. You're in charge of sourcing, exploration, and asking the right questions. The LLM does all the grunt work — the summarizing, cross-referencing, filing, and bookkeeping that makes a knowledge base actually useful over time. In practice, I have the LLM agent open on one side and Obsidian open on the other. The LLM makes edits based on our conversation, and I browse the results in real time — following links, checking the graph view, reading the updated pages. Obsidian is the IDE; the LLM is the programmer; the wiki is the codebase.

This can apply to a lot of different contexts. A few examples:

Personal: tracking your own goals, health, psychology, self-improvement — filing journal entries, articles, podcast notes, and building up a structured picture of yourself over time.
Research: going deep on a topic over weeks or months — reading papers, articles, reports, and incrementally building a comprehensive wiki with an evolving thesis.
Reading a book: filing each chapter as you go, building out pages for characters, themes, plot threads, and how they connect. By the end you have a rich companion wiki. Think of fan wikis like Tolkien Gateway — thousands of interlinked pages covering characters, places, events, languages, built by a community of volunteers over years. You could build something like that personally as you read, with the LLM doing all the cross-referencing and maintenance.
Business/team: an internal wiki maintained by LLMs, fed by Slack threads, meeting transcripts, project documents, customer calls. Possibly with humans in the loop reviewing updates. The wiki stays current because the LLM does the maintenance that no one on the team wants to do.
Competitive analysis, due diligence, trip planning, course notes, hobby deep-dives — anything where you're accumulating knowledge over time and want it organized rather than scattered.

Architecture

There are three layers:

Raw sources — your curated collection of source documents. Articles, papers, images, data files. These are immutable — the LLM reads from them but never modifies them. This is your source of truth.

The wiki — a directory of LLM-generated markdown files. Summaries, entity pages, concept pages, comparisons, an overview, a synthesis. The LLM owns this layer entirely. It creates pages, updates them when new sources arrive, maintains cross-references, and keeps everything consistent. You read it; the LLM writes it.

The schema — a document (e.g. CLAUDE.md for Claude Code or AGENTS.md for Codex) that tells the LLM how the wiki is structured, what the conventions are, and what workflows to follow when ingesting sources, answering questions, or maintaining the wiki. This is the key configuration file — it's what makes the LLM a disciplined wiki maintainer rather than a generic chatbot. You and the LLM co-evolve this over time as you figure out what works for your domain.
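As a rough illustration of the three layers, here is a minimal Python sketch that scaffolds them on disk. The directory names (raw/, raw/assets/, wiki/) and the placeholder file contents are assumptions chosen for the example; the pattern itself does not prescribe a layout.

```python
from pathlib import Path

# Hypothetical layout: these names are assumptions, not part of the pattern.
LAYOUT = {
    "dirs": ["raw", "raw/assets", "wiki"],
    "files": {
        "CLAUDE.md": "# Wiki schema\n\nConventions and workflows go here.\n",
        "wiki/index.md": "# Index\n",
        "wiki/log.md": "# Log\n",
    },
}


def scaffold(root: Path) -> list[Path]:
    """Create the raw, wiki, and schema layers under root.

    Existing files are left untouched, so running this twice is safe.
    Returns the list of files that were newly created.
    """
    created = []
    for rel_dir in LAYOUT["dirs"]:
        (root / rel_dir).mkdir(parents=True, exist_ok=True)
    for rel_path, text in LAYOUT["files"].items():
        path = root / rel_path
        if not path.exists():
            path.write_text(text, encoding="utf-8")
            created.append(path)
    return created
```

Calling `scaffold(Path("my-wiki"))` once gives the agent a place to read sources from, a place to write pages to, and a schema file to start co-evolving.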

Operations

Ingest. You drop a new source into the raw collection and tell the LLM to process it. An example flow: the LLM reads the source, discusses key takeaways with you, writes a summary page in the wiki, updates the index, updates relevant entity and concept pages across the wiki, and appends an entry to the log. A single source might touch 10-15 wiki pages. Personally I prefer to ingest sources one at a time and stay involved — I read the summaries, check the updates, and guide the LLM on what to emphasize. But you could also batch-ingest many sources at once with less supervision. It's up to you to develop the workflow that fits your style and document it in the schema for future sessions.

Query. You ask questions against the wiki. The LLM searches for relevant pages, reads them, and synthesizes an answer with citations. Answers can take different forms depending on the question — a markdown page, a comparison table, a slide deck (Marp), a chart (matplotlib), a canvas. The important insight: good answers can be filed back into the wiki as new pages. A comparison you asked for, an analysis, a connection you discovered — these are valuable and shouldn't disappear into chat history. This way your explorations compound in the knowledge base just like ingested sources do.

Lint. Periodically, ask the LLM to health-check the wiki. Look for: contradictions between pages, stale claims that newer sources have superseded, orphan pages with no inbound links, important concepts mentioned but lacking their own page, missing cross-references, data gaps that could be filled with a web search. The LLM is good at suggesting new questions to investigate and new sources to look for. This keeps the wiki healthy as it grows.
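Two of the lint checks above, orphan pages and concepts that are linked but have no page, are mechanical enough to script. The sketch below assumes Obsidian-style [[wikilinks]] and takes pages as an in-memory mapping from page name to text; both choices are assumptions for illustration. Contradictions and stale claims still need the LLM.

```python
import re

# Captures the target of [[Target]], [[Target|alias]] and [[Target#heading]].
WIKILINK = re.compile(r"\[\[([^\]|#]+)")


def lint_links(pages: dict[str, str]) -> dict[str, list[str]]:
    """Report pages with no inbound links and link targets with no page.

    pages maps a page name to its markdown text.
    """
    inbound = {name: 0 for name in pages}
    missing = set()
    for name, text in pages.items():
        # Deduplicate targets so one page counts at most once per target.
        for target in {match.strip() for match in WIKILINK.findall(text)}:
            if target == name:
                continue  # a self-link does not make a page reachable
            if target in inbound:
                inbound[target] += 1
            else:
                missing.add(target)
    orphans = sorted(name for name, count in inbound.items() if count == 0)
    return {"orphans": orphans, "missing": sorted(missing)}
```

A top-level page such as the index will usually show up as an orphan, since nothing links to it; you would likely want to exclude it from the report.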

Indexing and logging

Two special files help the LLM (and you) navigate the wiki as it grows. They serve different purposes:

index.md is content-oriented. It's a catalog of everything in the wiki: each page is listed with a link, a one-line summary, and optionally metadata like date or source count, and the entries are organized by category (entities, concepts, sources, etc.). The LLM updates it on every ingest. When answering a query, the LLM reads the index first to find relevant pages, then drills into them. This works surprisingly well at moderate scale (roughly 100 sources and hundreds of pages) and avoids the need for embedding-based RAG infrastructure.
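To make the shape of such an index concrete, here is a small Python sketch that renders one from a list of page records. The field names (title, path, category, summary) and the exact line format are hypothetical; in practice the LLM would maintain the file directly according to whatever the schema says.

```python
from collections import defaultdict


def render_index(pages: list[dict]) -> str:
    """Render index.md text: one section per category, one line per page.

    Each page dict is assumed to carry 'title', 'path', 'category'
    and 'summary' keys (hypothetical field names).
    """
    by_category = defaultdict(list)
    for page in pages:
        by_category[page["category"]].append(page)

    lines = ["# Index", ""]
    for category in sorted(by_category):
        lines.append(f"## {category}")
        for page in sorted(by_category[category], key=lambda p: p["title"]):
            lines.append(f"- [{page['title']}]({page['path']}): {page['summary']}")
        lines.append("")
    return "\n".join(lines)
```

Sorting categories and titles keeps the output stable, so regenerating the index after an ingest produces a small, readable diff.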

log.md is chronological. It's an append-only record of what happened and when — ingests, queries, lint passes. A useful tip: if each entry starts with a consistent prefix (e.g. ## [2026-04-02] ingest | Article Title), the log becomes parseable with simple unix tools — grep "^## \[" log.md | tail -5 gives you the last 5 entries. The log gives you a timeline of the wiki's evolution and helps the LLM understand what's been done recently.
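The same prefix convention is easy to handle from a script as well as from grep. The following Python sketch formats an entry in the style shown above and pulls the last few entries out of the log text; the function names are made up for the example.

```python
import re
from datetime import date

# Matches headings like: ## [2026-04-02] ingest | Article Title
ENTRY = re.compile(r"^## \[(\d{4}-\d{2}-\d{2})\] (\w+) \| (.+)$")


def format_entry(day: date, kind: str, title: str) -> str:
    """Build a log heading with the consistent, greppable prefix."""
    return f"## [{day.isoformat()}] {kind} | {title}"


def last_entries(log_text: str, n: int = 5) -> list[tuple[str, str, str]]:
    """Return (date, kind, title) for the last n entry headings.

    Lines that are not entry headings (body text, the file title)
    are ignored, much like the grep-and-tail one-liner.
    """
    matches = (ENTRY.match(line) for line in log_text.splitlines())
    return [m.groups() for m in matches if m][-n:]
```

Because the log is append-only, the last n matches are also the n most recent operations.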

Optional: CLI tools

At some point you may want to build small tools that help the LLM operate on the wiki more efficiently. A search engine over the wiki pages is the most obvious one — at small scale the index file is enough, but as the wiki grows you want proper search. qmd is a good option: it's a local search engine for markdown files with hybrid BM25/vector search and LLM re-ranking, all on-device. It has both a CLI (so the LLM can shell out to it) and an MCP server (so the LLM can use it as a native tool). You could also build something simpler yourself — the LLM can help you vibe-code a naive search script as the need arises.
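For a sense of what a naive search script might look like, here is a minimal Python sketch. It is plain term counting, not BM25 and not what qmd does; pages are passed in as a mapping from path to text, which is an assumption made to keep the example self-contained.

```python
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Lowercase and split into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def search(pages: dict[str, str], query: str, limit: int = 5) -> list[tuple[str, int]]:
    """Rank pages that contain every query term.

    The score is the total number of occurrences of the query terms;
    ties are broken by path so the output is deterministic.
    """
    terms = tokenize(query)
    results = []
    for path, text in pages.items():
        counts = Counter(tokenize(text))
        if terms and all(counts[term] for term in terms):
            results.append((path, sum(counts[term] for term in terms)))
    return sorted(results, key=lambda item: (-item[1], item[0]))[:limit]
```

Something this simple is often enough until the wiki outgrows the index file, at which point a proper search tool is the better investment.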

Tips and tricks

Obsidian Web Clipper is a browser extension that converts web articles to markdown. Very useful for quickly getting sources into your raw collection.
Download images locally. In Obsidian Settings → Files and links, set "Attachment folder path" to a fixed directory (e.g. raw/assets/). Then in Settings → Hotkeys, search for "Download" to find "Download attachments for current file" and bind it to a hotkey (e.g. Ctrl+Shift+D). After clipping an article, hit the hotkey and all images get downloaded to local disk. This is optional but useful — it lets the LLM view and reference images directly instead of relying on URLs that may break. Note that LLMs can't natively read markdown with inline images in one pass — the workaround is to have the LLM read the text first, then view some or all of the referenced images separately to gain additional context. It's a bit clunky but works well enough.
Obsidian's graph view is the best way to see the shape of your wiki — what's connected to what, which pages are hubs, which are orphans.
Marp is a markdown-based slide deck format. Obsidian has a plugin for it. Useful for generating presentations directly from wiki content.
Dataview is an Obsidian plugin that runs queries over page frontmatter. If your LLM adds YAML frontmatter to wiki pages (tags, dates, source counts), Dataview can generate dynamic tables and lists.
The wiki is just a git repo of markdown files. You get version history, branching, and collaboration for free.

Why this works

The tedious part of maintaining a knowledge base is not the reading or the thinking — it's the bookkeeping. Updating cross-references, keeping summaries current, noting when new data contradicts old claims, maintaining consistency across dozens of pages. Humans abandon wikis because the maintenance burden grows faster than the value. LLMs don't get bored, don't forget to update a cross-reference, and can touch 15 files in one pass. The wiki stays maintained because the cost of maintenance is near zero.

The human's job is to curate sources, direct the analysis, ask good questions, and think about what it all means. The LLM's job is everything else.

The idea is related in spirit to Vannevar Bush's Memex (1945) — a personal, curated knowledge store with associative trails between documents. Bush's vision was closer to this than to what the web became: private, actively curated, with the connections between documents as valuable as the documents themselves. The part he couldn't solve was who does the maintenance. The LLM handles that.

Note

This document is intentionally abstract. It describes the idea, not a specific implementation. The exact directory structure, the schema conventions, the page formats, the tooling — all of that will depend on your domain, your preferences, and your LLM of choice. Everything mentioned above is optional and modular — pick what's useful, ignore what isn't. For example: your sources might be text-only, so you don't need image handling at all. Your wiki might be small enough that the index file is all you need, no search engine required. You might not care about slide decks and just want markdown pages. You might want a completely different set of output formats. The right way to use this is to share it with your LLM agent and work together to instantiate a version that fits your needs. The document's only job is to communicate the pattern. Your LLM can figure out the rest.

watsonrm commented Jun 5, 2026

People keep reaching for branch-and-merge here, and it's the right instinct — but it solves a different layer than the one that actually corrupts an LLM wiki, so I want to be precise about where it helps and where it doesn't.

Git merge resolves textual conflicts: same line, two edits. The failure mode that actually wrecks a wiki is semantic: two agents independently write the same fact in different words. One appends "decided to ship path B," another appends "going with option B for the trigger." Git sees two non-overlapping additions, merges both cleanly, and now the wiki holds a duplicate. A clean merge isn't a correct merge. So branch-and-merge doesn't remove the need for dedup — you still have to grep for the fact before you write it. That dedup step is exactly what a single-writer discipline was buying you for free, which is why people who start with merge end up rebuilding it anyway.

The deeper point is that single-writer isn't a ceiling, it's just the cheapest way to get zero merges. The scalable version of multi-writer isn't "add a merge step," it's modeling the data so concurrent writes commute in the first place. Append-only logs, one entry per line, partitioned by file or section. If every agent only ever appends its own line and nothing reorders, git auto-merges every time, no conflict, no resolver. That's effectively single-writer-per-file with many writers per wiki — which is what you actually want, and it's a data-modeling choice, not a coordination one.

What's left after that is small. Worktree isolation per writer (cheap, and most agent runtimes already give it to you). A deterministic merge driver for the rare genuine same-region collision — sorted append, last-writer-wins on structured fields — and emphatically not an LLM resolving conflicts, because that just reintroduces the nondeterminism you were trying to bound. And grep-dedup stays, permanently, as the semantic layer git can't see.
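A minimal sketch of the "sorted append" idea, in Python: a three-way merge for a file where writers only ever append whole lines. This is only the merge logic, not a git merge driver integration, and the function name is hypothetical. It also removes exact duplicate lines only, so the semantic dedup described above is still needed.

```python
def merge_append_only(base: list[str], ours: list[str], theirs: list[str]) -> list[str]:
    """Merge two append-only versions of a line-oriented file.

    Keeps the base lines in their original order, then adds every line
    that either side appended, sorted so the result does not depend on
    which side is 'ours'. Byte-identical additions collapse to one line.
    """
    base_lines = set(base)
    added = {line for line in ours + theirs if line not in base_lines}
    return base + sorted(added)
```

Swapping the two sides gives the same output, which is the property that lets concurrent writers merge without a resolver. If entries start with an ISO date, the sort also keeps them in chronological order.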

So: branch-and-merge is the right transport. It sits under idempotent, commutative writes; it doesn't replace them. The hard part was never the merge mechanics — it's making the writes dedup-safe and order-independent before they ever hit the tree.

watsonrm commented Jun 5, 2026

@brtrx the randomised-sequence trick is smart — it's the right patch for an ordering problem. Worth pushing one layer down though: the reason order biases the wiki in the first place is that ingest isn't idempotent. If every claim carries its source citation and you grep for that citation before writing, re-ingesting in a different order converges to the same state — the structural bias just stops being reachable. Ingest becomes commutative, and the lint only has to handle genuine semantic conflicts, not order artifacts.

Two things that have held up at the same scale: contradiction detection should key on a stable claim identity (the citation), not textual proximity, so a batched pass and a single pass produce the same result regardless of sequence. And when the lint finds a real conflict, route it to a review queue rather than auto-resolving — a high-confidence merge can auto-apply, but a decision reversal is exactly the case you want a human to see. The persistent scratchpad is the right primitive; I'd just make the thing it dedups against an identity, not a position.
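The convergence claim can be sketched in a few lines of Python. Here the wiki is modeled as a mapping from citation to claim text, and the membership check stands in for grepping for the citation before writing. The citation format is made up, and the sketch assumes each citation identifies exactly one claim; if two different texts share a citation, the first write wins and order matters again.

```python
def ingest(wiki: dict[str, str], claims: list[tuple[str, str]]) -> dict[str, str]:
    """Write each (citation, text) claim unless its citation is already present.

    The membership test plays the role of the grep-before-write step,
    which makes re-ingesting the same source a no-op.
    """
    for citation, text in claims:
        if citation not in wiki:
            wiki[citation] = text
    return wiki
```

Ingesting the same claims in a different order, or twice, ends in the same state, so any conflict that lint still finds is a real disagreement between sources rather than an ordering artifact.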

watsonrm commented Jun 5, 2026

@beckfexx closest setup to mine I've seen in the thread — 6 specialized agents, contradiction detection, nightly self-heal. One thing I'd challenge: the advisory locks and handover protocols are probably the part you can delete. Locks are a symptom of multiple writers contending for the same file. Partition writers by file/section and keep writes append-only, and the contention disappears — many agents per wiki, but one writer per target. Biggest reliability win I've had.

On contradiction detection, I'd shift it left. Catching conflicts at 2,000+ memories with five strategies is detection after the fact; if each write greps for the claim's citation first, you reject the duplicate at write time and the nightly pass shrinks to genuine semantic reversals. And on self-heal — autonomous overwrite of "stale" knowledge is the one place silent corruption creeps in. I gate that with confidence tiers: high-confidence updates auto-apply, anything ambiguous flags for review instead of overwriting. Cheap insurance against the 2:30 AM job quietly rewriting something that was correct.
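A minimal Python sketch of that gating, under stated assumptions: the 0.9 threshold, the field names, and the idea that a confidence score is available at all are placeholders for illustration, not a description of any particular setup in this thread.

```python
from dataclasses import dataclass, field

AUTO_APPLY_THRESHOLD = 0.9  # hypothetical cutoff; tune for your own wiki


@dataclass
class Update:
    page: str
    new_text: str
    confidence: float
    reverses_decision: bool = False


@dataclass
class GatedWiki:
    pages: dict[str, str] = field(default_factory=dict)
    review_queue: list[Update] = field(default_factory=list)

    def propose(self, update: Update) -> str:
        """Apply high-confidence updates; queue everything else for a human.

        A decision reversal is always queued, whatever its confidence.
        """
        if update.confidence >= AUTO_APPLY_THRESHOLD and not update.reverses_decision:
            self.pages[update.page] = update.new_text
            return "applied"
        self.review_queue.append(update)
        return "queued"
```

The important property is that a queued update never touches the page, so an unattended job can at worst fill the queue, not overwrite something that was correct.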

nishchay7pixels commented Jun 5, 2026

ctxslice (https://github.com/llcortex/ContextSlice.git)

A semantic context pre-processor that gives coding agents only the code they need.

Prune your codebase to only what's relevant before the LLM sees it.

Inspired by MiniMax M3's MSA architecture — sparse attention with pre-filtering.
Instead of attending to everything, identify which blocks matter and drop the rest.

ContextSlice is a sparse context compiler for AI coding agents.

paulmchen commented Jun 5, 2026

Synthadoc Community Edition v0.7.0 is released.

Three new features focused on interactivity, speed, and visibility into the wiki's health:

Local Web Chat UI: the "synthadoc web" command starts a local chat interface in your browser. It is local-only; nothing leaves your machine. Use it the same way you use the CLI: ask questions about your wiki's domain and get streamed, cited answers.

Beyond content queries, the UI also auto-detects the health of your wiki (Explorer, Health Check, or Power User mode) and surfaces context-appropriate prompts: if your wiki has orphan pages or contradictions, those show up as suggested actions rather than generic hints. You can also give operational commands directly in the chat, such as "run lint", "show wiki status", or "schedule ingest every night at 9 PM". The Action Agent parses those and executes them live against your wiki, with results shown inline.

→ Quick-start demo: Step 22 (https://github.com/axoviq-ai/synthadoc/blob/main/docs/user-quick-start-guide.md#step-22--use-the-web-chat-ui)

Streaming Query: responses now stream token-by-token instead of waiting for the full answer. Works in the CLI, in Obsidian, and in the new web UI. The first token appears almost immediately after you ask; citations and confidence scores appear at the end of the stream once the full answer is assembled. If you need a cached static result, --no-stream still gives you the old behaviour.

→ Quick-start demo: Step 5 (https://github.com/axoviq-ai/synthadoc/blob/main/docs/user-quick-start-guide.md#step-5--query-the-pre-built-wiki-cli--obsidian)

Query Result Caching: repeated queries return instantly from a local cache instead of hitting the LLM again. The cache is shared across the CLI, the web UI, and the Obsidian plugin, keyed on the question text, the current wiki epoch, and the model. The wiki epoch increments automatically on every ingest and every lifecycle state change, so the moment your wiki changes, all prior cached answers are bypassed immediately.

→ Quick-start demo: Step 23 (https://github.com/axoviq-ai/synthadoc/blob/main/docs/user-quick-start-guide.md#step-23--query-caching)
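The epoch-keyed caching idea can be illustrated generically. The Python sketch below is not Synthadoc's implementation; the class and method names are invented, and it only shows why including an epoch in the cache key makes older answers unreachable the moment the wiki changes.

```python
import hashlib


class EpochCache:
    """In-memory answer cache keyed on question text, wiki epoch, and model."""

    def __init__(self) -> None:
        self.epoch = 0
        self._store: dict[str, str] = {}

    def _key(self, question: str, model: str) -> str:
        # Join with a separator that will not occur in normal text.
        raw = "\x1f".join([question.strip(), str(self.epoch), model])
        return hashlib.sha256(raw.encode("utf-8")).hexdigest()

    def get(self, question: str, model: str):
        """Return the cached answer for the current epoch, or None."""
        return self._store.get(self._key(question, model))

    def put(self, question: str, model: str, answer: str) -> None:
        self._store[self._key(question, model)] = answer

    def bump_epoch(self) -> None:
        """Call after every change to the wiki; old keys stop matching."""
        self.epoch += 1
```

Old entries are bypassed rather than deleted, so a real cache would also need some eviction policy.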
Also in v0.7.0: expanded built-in hints covering orphan pages, adversarial warnings, and lifecycle state queries; live lint-report and wiki-status summaries available directly from the chat UI.

If any of this is useful or you have thoughts on the direction, feedback is very welcome, and a ⭐ always helps the project reach more people.

👉 https://github.com/axoviq-ai/synthadoc

skyllwt commented Jun 6, 2026

Love this LLM-Wiki idea? We built a full open-source system on it → AutoSci

Come and enjoy: https://github.com/skyllwt/AutoSci

Karpathy's pattern (immutable raw/ → an LLM-compiled wiki/ → a CLAUDE.md schema) is the exact foundation of AutoSci — an agent that turns the wiki into a research memory and then does autonomous science on top of it. All on Claude Code:

📚 Ingest papers into a cross-linked wiki of concept/method/idea pages ([[wikilinks]], contradiction edges — the LLM-Wiki, fully realized)
💡 Ideate → experiment → write: it reads its own memory to generate ideas, design + run experiments, draft the paper, and handle rebuttals
🧬 Self-evolving memory: between projects it consolidates, re-weights, and re-links the wiki (a "sleep" phase)
🕸️ Multi-agent DAGs for the hard reasoning steps

We've already used it to write 3 papers end-to-end. Fully open (MIT).

⭐ If this is where you want LLM-Wikis to go, a star genuinely helps us! We also welcome issues, PRs, and contributors.
👉 https://github.com/skyllwt/AutoSci
📄 Paper: https://arxiv.org/abs/2605.31468

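Demo: [Peking University built a self-evolving research agent: AutoSci] https://www.bilibili.com/video/BV19gVg6pEk6/?share_source=copy_web&vd_source=338de971cb27f42aaaf5d8bfdeed04b3

RED: Peking University built an AI scientist that "gets smarter the more research it does" Peking University team... http://xhslink.com/o/2clEkgugEPw
Copy this and open it in Xiaohongshu (RED) to view the note!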

vvvvvivekkk commented Jun 6, 2026

Interesting to see so many LLM-Wiki implementations emerge from this gist.

I took a slightly different direction with LLM-Wiki-v3.

Most systems stop at "LLM-maintained markdown + retrieval." V3 tries to make the wiki a reliable long-term knowledge substrate rather than just a generated documentation layer.

Key ideas:

• Markdown + Git are the only source of truth. Every vector index, BM25 index, graph, and retrieval artifact is disposable and fully rebuildable.

• Provenance is non-negotiable. Instead of confidence scores, every claim traces back to sources, spans, extractors, and timestamps. Trust is reconstructed, not guessed.

• Knowledge doesn't decay. Facts are superseded, not deleted. The system preserves what was believed, when it was believed, and what replaced it.

• Retrieval is hybrid by design (BM25 + dense + graph + RRF + reranking), but retrieval is treated as infrastructure—not memory itself.

• Autonomous writes are gated. Agents cannot directly mutate knowledge. Changes flow through evaluation, attribution checks, and audit trails before becoming part of the wiki.

• Every operation is auditable and reversible. The goal is knowledge evolution with accountability.

• The wiki is domain-agnostic. The real product is the schema. Change the schema and the same engine becomes a research wiki, company brain, scientific memory system, or personal knowledge substrate.
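For readers unfamiliar with the RRF step mentioned in the retrieval bullet, here is a minimal Python sketch of reciprocal rank fusion. It is a generic illustration, not code from LLM-Wiki-v3. Each document scores the sum of 1 / (k + rank) over the rankings it appears in; k = 60 is the commonly used default, and the input lists stand in for BM25 and dense retrieval results.

```python
def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[tuple[str, float]]:
    """Fuse several ranked lists of document ids into one ranking.

    Ranks start at 1. A document missing from a ranking simply
    contributes nothing for it. Ties are broken by document id.
    """
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))
```

Because only ranks are used, the retrievers' raw scores never need to be put on a common scale, which is the main reason this fusion method is popular for hybrid search.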

My belief is that the next generation of AI memory systems won't be vector databases or chat histories. They'll be versioned knowledge systems with provenance, evaluation gates, and explicit evolution paths.

LLM-Wiki-v3 is my attempt at exploring that direction.

https://github.com/vvvvvivekkk/LLM-Wiki-v3