diff --git a/README.md b/README.md
new file mode 100644
index 0000000..ffbc793
--- /dev/null
+++ b/README.md
@@ -0,0 +1,109 @@
+# transcript browser
+
+This project started as a **transcript browser** for subtitle/text files and later branched into an attempt to index and search PDF content. The transcript side is the core and works best right now; PDF support exists but is still rough.
+
+## Current status
+
+- Personal/experimental codebase, now being published as part of a project archive.
+- Primary value: fast local search across subtitle/text-like files with quick jump-to-source actions.
+- PDF indexing was added later and is incomplete.
+- The build setup works but is currently clunky and hard to follow.
+
+## What the app does
+
+- Loads files from a folder (`.srt`, `.txt`, `.html`, `.pdf`).
+- Indexes file content into memory.
+- Lets you search from a single query box.
+- Shows matching snippets.
+- Opens results in external tools:
+  - `.srt` -> media player at subtitle timestamp
+  - `.txt`/`.html` -> text editor
+  - `.pdf` -> PDF viewer at page
+
+## What is not great yet
+
+- **PDF parsing quality:** extraction is token-based and does not robustly handle Unicode/text layout.
+- **Build system readability:** custom two-stage build flow with many hardcoded source/library entries.
+- **Platform assumptions:** strongly Windows-oriented defaults (paths, commands, Win32 backend).
+- **Some UX/engineering TODOs remain:** error handling and configuration polish are still in progress.
+
+## Repository map (excluding external modules)
+
+- `build.bat` - bootstrap script for the custom build tool.
+- `build_file.cpp` - project-specific build recipe (compiles app and dependency objects).
+- `src/transcript_browser/main.cpp` - UI/event loop and app entry point.
+- `src/transcript_browser/loading_thread.cpp` - folder scanning + parsing jobs.
+- `src/transcript_browser/searching_thread.cpp` - asynchronous query matching.
+- `src/transcript_browser/read_srt.cpp` - SRT parsing.
+- `src/transcript_browser/read_pdf.cpp` - PDF text extraction attempt.
+- `src/transcript_browser/config.cpp` - config parsing/serialization and launch commands.
+- `src/basic/` - shared utilities (arena, arrays, filesystem/process/thread helpers).
+- `src/build_tool/` - custom build tool sources.
+
+## Build and run (current flow)
+
+This project currently expects a Windows + MSVC environment.
+
+1. Open a Developer Command Prompt (so `cl.exe` is available).
+2. From repo root, run:
+
+```bat
+build.bat
+```
+
+3. Run the built executable from `build/`:
+
+```bat
+build\transcript_browser.exe
+```
+
+Notes:
+
+- `build.bat` first builds `build/build_tool.exe` (if missing), then executes it.
+- The build tool compiles `build_file.cpp` and runs the resulting program to produce `transcript_browser.exe`.
+- Build outputs and object files are placed in `build/`.
+
+## Runtime usage
+
+- Start the app.
+- In the input field, load a folder with:
+
+```text
+read=C:/path/to/folder
+```
+
+- Press Enter to enqueue parsing.
+- Type any query to search loaded content.
+- Use:
+  - `F1` to toggle the loaded-files view
+  - `F2` to edit the config commands
+
+## Configuration
+
+The app stores config next to the executable as `transcript_browser.config`.
+
+Keys:
+
+- `SRTCommand`
+- `PDFCommand`
+- `TXTCommand`
+- `ReadOnStart`
+
+Placeholders supported in commands include:
+
+- `{video}`
+- `{time_in_seconds}`
+- `{file}`
+- `{page}`
+- `{line}`
+
+If a path contains spaces, wrap it in quotes.
+
+## Build-system cleanup ideas
+
+If this project gets another iteration, the highest-impact cleanups would be:
+
+1. Replace or simplify the custom build chain (e.g., CMake/Meson or a smaller single-step script).
+2. Separate third-party dependency build concerns from app build logic.
+3. Remove hardcoded absolute defaults and make platform-specific commands explicit in config/docs.
+4. Add a minimal regression test suite for parsing and search behavior.