“An LLM Wiki incrementally builds and maintains a persistent wiki – a structured, interlinked collection of markdown files that sits between you and the raw sources instead of just retrieving from raw documents at query time… the wiki is a persistent, compounding artifact.” – Andrej Karpathy, LLM Wiki
An LLM Wiki is a system in which a large language model (LLM) incrementally builds and maintains a persistent wiki: a structured, interlinked collection of markdown files that serves as an intermediary knowledge layer between raw source documents and user queries, rather than relying solely on real-time retrieval from unstructured data.1 This approach turns ad-hoc document processing into a compounding artifact, where new information integrates into an evolving graph of summaries, concept pages, entity profiles, comparisons, and syntheses, enabling deeper reasoning over accumulated knowledge.1,3,6
Core Mechanics of Compilation and Persistence
The process begins with raw sources (articles, papers, repositories, datasets, or images) dropped into a designated directory.3,6 The LLM acts as a compiler, processing only modified files incrementally to update the wiki without full recompilation.1,6 It generates markdown files including an index with summaries, dedicated pages for concepts and entities, cross-references via wiki-links, and derived outputs like charts or slides.1,3,12 This persistence addresses LLM limitations such as hallucinations and outdated knowledge by creating a grounded, human-readable knowledge substrate that grows with each addition.2,11
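The incremental step described above can be sketched as a content-hash check: only raw files whose hash has changed since the last compile are handed to the model. This is a minimal sketch, not Karpathy's actual implementation; the state-file name and layout are assumptions for illustration.

```python
import hashlib
import json
from pathlib import Path

def file_hash(path: Path) -> str:
    """Content hash used to detect modified raw sources."""
    return hashlib.sha256(path.read_bytes()).hexdigest()

def changed_sources(raw_dir: Path, state_file: Path) -> list[Path]:
    """Return raw files that are new or modified since the last compile,
    updating the persisted hash state as a side effect."""
    seen = json.loads(state_file.read_text()) if state_file.exists() else {}
    changed = []
    for src in sorted(raw_dir.glob("*")):
        h = file_hash(src)
        if seen.get(src.name) != h:
            changed.append(src)
            seen[src.name] = h
    state_file.write_text(json.dumps(seen, indent=2))
    return changed
```

Only the paths returned here would be sent to the LLM, so unchanged sources cost no tokens on recompilation.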
- Raw ingestion: unstructured inputs land in a `raw/` folder, triggering selective LLM processing.3
- Wiki generation: the LLM produces a `wiki/` structure with `INDEX.md` for navigation, `concepts/` subfolders for topical articles (~100 pages at scale, totaling around 400,000 words), and backlinks.3,6
- Query interface: users query the wiki via agents; responses generate new markdown, slides (e.g., Marp format), or visuals (e.g., matplotlib), filed back to enhance the base.6,9
- Tools integration: Obsidian serves as the frontend for browsing; naive search or CLI tools handle retrieval without vector databases.6,15
At moderate scale, this eliminates the need for embeddings or complex retrieval pipelines, as the LLM reads the index (a few thousand tokens) and relevant pages directly.6,12
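The retrieval step just described can be sketched as plain-text search: always load the index, then pull in any concept pages that mention a query term. A sketch under assumed file names (`INDEX.md`, `concepts/`); real setups might shell out to grep or an agent instead.

```python
from pathlib import Path

def build_context(wiki_dir: Path, query: str, max_pages: int = 3) -> str:
    """Naive retrieval: include INDEX.md, then the concept pages whose
    text mentions a query term most often. No embeddings, no vector DB."""
    parts = [(wiki_dir / "INDEX.md").read_text()]
    terms = [t.lower() for t in query.split()]
    hits = []
    for page in sorted((wiki_dir / "concepts").glob("*.md")):
        text = page.read_text()
        score = sum(text.lower().count(t) for t in terms)
        if score:
            hits.append((score, page.name, text))
    for _, name, text in sorted(hits, reverse=True)[:max_pages]:
        parts.append(f"<!-- {name} -->\n{text}")
    return "\n\n".join(parts)
```

The concatenated string is simply placed in the LLM's context window; at ~100 pages this stays well within modern context limits.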
Practical Implications in Knowledge Workflows
In practice, an LLM Wiki shifts token usage from code manipulation to knowledge manipulation, supporting research on topics like AI scaling laws or specific domains.3 For a researcher, it means asking multi-step questions (e.g., comparing 50 papers on a topic) that would take hours manually, now answered in the context of ~400,000 words of synthesized content.3,6 Outputs persist and compound in value: contradictions between sources are flagged during ingestion, not discovered ad hoc at query time.12
The directory structure exemplifies the simplicity:

```
my-research/
  raw/              # Sources
  wiki/             # LLM-owned
    INDEX.md
    concepts/
      concept-a.md
  output/           # Query artifacts
  _meta/            # State
```
This yields a durable, editable artifact versus ephemeral chat responses, with the LLM owning maintenance for consistency.1,12
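The layout above can be scaffolded in a few lines. A minimal sketch; the folder names come from the tree shown earlier, and the placeholder index text is an assumption.

```python
from pathlib import Path

def scaffold(root: Path) -> None:
    """Create the wiki skeleton; safe to run repeatedly (idempotent)."""
    for sub in ("raw", "wiki/concepts", "output", "_meta"):
        (root / sub).mkdir(parents=True, exist_ok=True)
    index = root / "wiki" / "INDEX.md"
    if not index.exists():
        index.write_text("# Index\n\nNo pages compiled yet.\n")
```

Because every directory and the index live on disk, the whole artifact versions cleanly under git and opens directly in Obsidian.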
Contrast with Retrieval-Augmented Generation (RAG)
Traditional RAG retrieves relevant documents at query time via vector search, augmenting prompts to mitigate LLM gaps in domain knowledge, factuality, or recency.2,5,11 While effective for dynamic, knowledge-intensive tasks, RAG processes knowledge per query, lacking persistence or pre-built connections.12
| Dimension | Traditional RAG | LLM Wiki |
|---|---|---|
| When knowledge is processed | At query time (every question) | At ingest time (once per source) |
| Cross-references | Discovered ad-hoc | Pre-built and maintained |
| Contradictions | May not be noticed | Flagged during ingestion |
| Knowledge accumulation | None; starts fresh each query | Compounds with sources/queries |
| Output format | Ephemeral chat | Persistent markdown |
| Maintenance | Black box system | Transparent LLM-owned |
RAG excels at scale (e.g., enterprise databases) with techniques like hybrid search, recursive retrieval, or post-retrieval reranking.2 The LLM Wiki suits personal or moderate-scale bases (~100 articles), prioritizing structure over speed.6,12 Advanced RAG variants (e.g., RETRO, Self-RAG) introduce iteration or adaptation, converging toward wiki-like persistence.2
Major Schools of Thought in LLM Knowledge Management
Two paradigms dominate: dynamic retrieval (RAG lineage) and static compilation (wiki-style).11,12 RAG, originating from 2020 research, emphasizes external augmentation without retraining, popular in chatbots and domain apps.5,14 Compilation approaches treat LLMs as builders of structured artifacts, echoing knowledge graphs or personal wikis like Roam/Obsidian, but automated.6
- Dynamic retrieval school: Prioritizes real-time access; variants include naive RAG (basic fetch-generate), advanced (query expansion, reranking), and modular (self-improvement).2,11
- Persistent compilation school: Builds durable structures upfront; the LLM Wiki exemplifies this, with incremental updates and link graphs.1,6
- Hybrid evolution: Emerging methods blend, e.g., RAG with memory or iterative retrieval.2,5
Leading Theorists and Proponents
Andrej Karpathy, former OpenAI/Tesla AI director, formalized LLM Wiki in April 2026 via GitHub Gist, describing it as his primary workflow for research knowledge bases.1,3 His insight: at personal scale, structured markdown suffices over vector search, with LLMs handling interlinking.1,12 Karpathy’s ~100-article wikis on topics demonstrate scale, using tools like Obsidian and agentic Q&A.3,6,15
Broader RAG theorists include Patrick Lewis (lead author of the original RAG paper) and teams at Google/DeepMind, advancing retrieval paradigms.2,5 Implementers like Databricks and Google Cloud promote RAG for enterprise.8,14 Community breakdowns (e.g., antigravity.codes, DAIR.AI) dissect Karpathy’s system, providing diagrams and minimum viable setups.3,6
Tensions and Debates
Scale limits are one tension: the LLM Wiki thrives at ~400,000 words but may falter beyond that without indexing aids, whereas RAG scales via vector indexes.6,12 Transparency versus efficiency pits editable markdown against black-box retrieval.1,5 Incremental compilation risks drift if LLM generations are inconsistent, though human oversight (reading, not editing) mitigates this.3,9
Debate swirls on obsolescence: as LLMs grow (e.g., models trained with ~10^24 FLOPs13), the need for external bases diminishes, yet domain and recency gaps persist.10,11 Cost is another axis: wiki compilation consumes tokens upfront (e.g., processing 100 documents at ingest), while RAG defers the cost to query time.2 Evaluation lacks standards: RAG benchmarks exist, but wiki efficacy remains anecdotal.11
- Hallucination: Wiki flags via structure; RAG via grounding.2,12
- Update latency: Wiki incremental; RAG instant.5
- Editability: Wiki human-readable; RAG opaque.1
Strategic Relevance Today
The LLM Wiki matters as knowledge work surges: researchers and analysts face information overload amid 2T-token training corpora.10,13 It enables compounding intelligence: each query enriches the base, yielding multi-hop reasoning impossible in raw RAG.3,6 In 2026, with models like Phi-2 (2.7B params, 1.4T tokens13), personal bases bridge proprietary gaps.
For teams, it prototypes team wikis; enterprises adapt for compliance via auditable markdown.12 As LLMs evolve, wiki patterns influence agentic workflows, where compilation precedes action.6 The persistent artifact endures, outlasting query sessions, positioning it as a foundational tool in AI-augmented cognition.
Implementation Considerations
Start minimal: script LLM prompts for index and concept generation, and use git for versioning.3 Scale up with agents for auto-filing.6 Challenges include prompt engineering for consistency (e.g., “maintain backlinks”) and storage (400,000 words is only a few MB).1 A future direction is integration with tool-using LLMs for native compilation.
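The "start minimal" advice can be sketched as a single compile call: a prompt asks the model to emit a concept page, and the result is filed into the wiki. The `ask_llm` callable and the system-prompt wording are hypothetical stand-ins for whatever chat-completion client and prompt the pipeline actually uses.

```python
from pathlib import Path

# Hypothetical prompt; the real instruction set would be longer and tuned.
SYSTEM_PROMPT = (
    "You maintain a markdown wiki. Given a raw source, write or update a "
    "concept page. Keep [[wiki-links]] and backlinks consistent."
)

def compile_source(src: Path, wiki_dir: Path, ask_llm) -> Path:
    """Compile one raw source into a concept page.
    `ask_llm(system, user) -> str` is injected so any provider works."""
    page_md = ask_llm(SYSTEM_PROMPT, src.read_text())
    out = wiki_dir / "concepts" / (src.stem + ".md")
    out.parent.mkdir(parents=True, exist_ok=True)
    out.write_text(page_md)
    return out
```

Injecting the model call keeps the compile loop testable offline and provider-agnostic; looping this over the changed files and committing the result to git is the whole minimal pipeline.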
This system redefines RAG by front-loading structure, offering a practical path to persistent, queryable knowledge in an era of exploding data.
References
1. LLM Wiki – GitHub Gist – https://gist.github.com/karpathy/442a6bf555914893e9891c11519de94f
2. LLM – Extensions Wiki (XWiki.org) – 2026-03-18 – https://extensions.xwiki.org/xwiki/bin/view/Extension/LLM/
3. Retrieval Augmented Generation (RAG) for LLMs – 2026-02-01 – https://www.promptingguide.ai/research/rag
4. Karpathy’s LLM Knowledge Bases: The Post-Code AI Workflow – 2026-04-03 – https://antigravity.codes/blog/karpathy-llm-knowledge-bases
5. Large language models – Wikiversity – 2025-12-24 – https://en.wikiversity.org/wiki/Large_language_models
6. Retrieval-augmented generation – Wikipedia – 2023-11-05 – https://en.wikipedia.org/wiki/Retrieval-augmented_generation
7. LLM Knowledge Bases – DAIR.AI Academy – 2026-04-03 – https://academy.dair.ai/blog/llm-knowledge-bases-karpathy
8. llm-wiki – GitHub Gist – 2026-04-04 – https://gist.github.com/karpathy/442a6bf555914893e9891c11519de94f?permalink_comment_id=6079056
9. What is Retrieval-Augmented Generation (RAG)? – Google Cloud – https://cloud.google.com/use-cases/retrieval-augmented-generation
10. LLM Knowledge Bases post by Andrej Karpathy – DeepakNess – 2026-04-03 – https://deepakness.com/raw/llm-knowledge-bases/
11. Large language model – Wikipedia – 2023-03-09 – https://en.wikipedia.org/wiki/Large_language_model
12. Retrieval-Augmented Generation for Large Language Models – arXiv – 2023-12-18 – https://arxiv.org/abs/2312.10997
13. Karpathy’s LLM Wiki: The Complete Guide to His Idea File – 2026-04-04 – https://antigravity.codes/blog/karpathy-llm-wiki-idea-file
14. List of large language models – Wikipedia – 2023-03-09 – https://en.wikipedia.org/wiki/List_of_large_language_models
15. What is Retrieval Augmented Generation (RAG)? – Databricks – 2023-10-18 – https://www.databricks.com/blog/what-is-retrieval-augmented-generation
16. Andrej Karpathy’s LLM-powered personal knowledge base workflow … – 2026-04-03 – https://www.youtube.com/watch?v=VLd0K0bkOIE

