# wasm_transcript_browser

This project generates a searchable transcript browser from local `.srt` files and ships it as a WebAssembly web app.

Type a word or phrase, click a match, and the app opens the matching YouTube video at that exact timestamp. You can also copy a direct timestamped link.

## What it does

- Parses subtitle files (`.srt`) from a local folder.
- Generates a packed transcript index (`build/entries.inc`) at build time.
- Compiles a C codebase to `main.wasm` and serves it with a small HTML/JS shell.
- Provides instant text search over all transcript text.
- Creates timestamped YouTube links from search hits.
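
The search step can be pictured as a linear scan over the generated entries. This README does not describe the actual search code, so the following is only a minimal sketch: the `Entry` layout and the `find_matches` function are hypothetical names, and the real format in `build/entries.inc` may differ.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical entry layout (an assumption, not the real generated format). */
typedef struct {
    const char *video_id;   /* 11-character YouTube ID */
    int start_seconds;      /* subtitle start time in seconds */
    const char *text;       /* subtitle text, already normalized */
} Entry;

/* Writes the indices of entries whose text contains `query` into `out`
   (at most `max_out` of them) and returns how many were written.
   An empty query matches nothing. */
static size_t find_matches(const Entry *entries, size_t count,
                           const char *query, size_t *out, size_t max_out)
{
    assert(entries != NULL || count == 0);
    assert(query != NULL && out != NULL);

    size_t found = 0;
    if (query[0] == '\0') {
        return 0;
    }
    for (size_t i = 0; i < count && found < max_out; i++) {
        if (strstr(entries[i].text, query) != NULL) {
            out[found++] = i;
        }
    }
    return found;
}
```

Because the entries are normalized at build time, a query would need the same normalization before it is passed to a function like this one.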

## How the pipeline works

1. `build_file.c` defines a hard-coded source folder:

   - `folder_to_create_transcript_for`

2. During the build, `src/prototype/prototype.meta.c`:

   - scans `.srt` files in that folder,
   - parses subtitle entries,
   - normalizes text (lowercases it, removes punctuation, turns `-` into spaces),
   - writes generated data to `build/entries.inc`.

3. `src/prototype/main.c` includes that generated file and compiles to WASM.

4. The browser app (`package/index.html` + `package/main.wasm`) renders a custom UI and handles link open/copy actions.
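
The normalization rules in step 2 can be sketched as follows. This is an illustration rather than the code in `prototype.meta.c`: `normalize_text` is a hypothetical name, and the sketch assumes ASCII punctuation and the default C locale.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Normalizes `src` into `dst` (capacity `dst_size`, including the terminator):
   letters are lowercased, '-' becomes a space, and all other ASCII
   punctuation is dropped. Output is truncated if `dst` is too small.
   Returns the length of the result. */
static size_t normalize_text(const char *src, char *dst, size_t dst_size)
{
    assert(src != NULL && dst != NULL && dst_size > 0);

    size_t n = 0;
    for (size_t i = 0; src[i] != '\0' && n + 1 < dst_size; i++) {
        unsigned char c = (unsigned char)src[i];
        if (c == '-') {
            dst[n++] = ' ';              /* hyphen turns into a space */
        } else if (ispunct(c)) {
            continue;                    /* other punctuation is removed */
        } else {
            dst[n++] = (char)tolower(c); /* everything else is lowercased */
        }
    }
    dst[n] = '\0';
    return n;
}
```

With these rules, `Well-known: Hello, World!` becomes `well known hello world`, so a search for `well known` matches regardless of the original hyphen and capitalization.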

## Transcript filename format

To build correct YouTube links, each filename is expected to end (before the extension) with the 11-character YouTube video ID, wrapped in one delimiter character on each side (commonly square brackets).

Example:

- `My Video Title [dQw4w9WgXcQ].en.srt`
- `My Video Title [dQw4w9WgXcQ].srt`

The app extracts the video ID from the ending token and creates links like:

- `https://youtu.be/<id>?feature=shared&t=<seconds>`
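
The extraction and link format above can be sketched like this. The function names `extract_video_id` and `build_link` are hypothetical, and the sketch assumes square brackets as the delimiters, whereas the project accepts any single wrapping character.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define YOUTUBE_ID_LEN 11

/* Copies the video ID out of a name such as
   "My Video Title [dQw4w9WgXcQ].en.srt" into `id`, which must hold
   YOUTUBE_ID_LEN + 1 bytes. Assumes the ID is wrapped in '[' and ']'.
   Returns 1 on success, 0 if no bracketed 11-character token is found. */
static int extract_video_id(const char *filename, char *id)
{
    assert(filename != NULL && id != NULL);

    const char *close = strrchr(filename, ']');
    if (close == NULL || close - filename < YOUTUBE_ID_LEN + 1) {
        return 0;
    }
    const char *open = close - (YOUTUBE_ID_LEN + 1);
    if (*open != '[') {
        return 0;
    }
    memcpy(id, open + 1, YOUTUBE_ID_LEN);
    id[YOUTUBE_ID_LEN] = '\0';
    return 1;
}

/* Formats a timestamped link into `out`.
   Returns 1 if the whole link fit into `out_size` bytes, 0 otherwise. */
static int build_link(const char *id, int seconds, char *out, size_t out_size)
{
    assert(id != NULL && out != NULL);

    int n = snprintf(out, out_size,
                     "https://youtu.be/%s?feature=shared&t=%d", id, seconds);
    return n >= 0 && (size_t)n < out_size;
}
```

For the first example filename and a subtitle starting 83 seconds in, this produces `https://youtu.be/dQw4w9WgXcQ?feature=shared&t=83`.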

## Build

### Prerequisites

- `clang` (or `gcc`/MSVC, depending on your platform)
- Python 3 (for the local static server)

### Linux

```bash
./build.sh
```

### Windows

```bat
build.bat
```

Build output of interest:

- `build/entries.inc` (generated transcript index)
- `package/index.html`
- `package/main.wasm`

## Run locally

From `package/`:

```bash
python3 -m http.server 8080
```

Then open:

- `http://localhost:8080`

Windows helper script:

- `package/run_server.bat`

## Configuration notes

- Update the transcript source folder (`folder_to_create_transcript_for`) in `build_file.c` before building.
- The current build file also contains hard-coded deploy commands (`ssh`/`scp`) in `build_prototype_wasm_target`; remove or update them for your own environment.

## Project status

This is an older personal project with a custom C build/codegen stack. Rough edges are expected, but the core idea works: local transcript ingestion, fast search, and one-click timestamped YouTube links.