Krzosa Karol 3d91a1f924 couple commits
2026-03-20 00:09:50 +01:00

wasm_transcript_browser

This project generates a searchable transcript browser from local .srt files and ships it as a WebAssembly web app.

Type a word or phrase, click a match, and it opens the exact timestamp in the matching YouTube video. You can also copy a direct timestamped link.

What it does

  • Parses subtitle files (.srt) from a local folder.
  • Generates a packed transcript index (build/entries.inc) at build time.
  • Compiles a C codebase to main.wasm and serves it with a small HTML/JS shell.
  • Provides instant text search over all transcript text.
  • Creates timestamped YouTube links from search hits.
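As a rough illustration of the search step, the sketch below does a linear substring scan over normalized entries. The `transcript_entry` layout and the `search_entries` function are hypothetical names made up for this example; the actual format of build/entries.inc and the search code in the project may differ.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical entry layout: one normalized subtitle line and its start time. */
typedef struct {
    const char *video_id;   /* 11-character YouTube ID */
    int start_seconds;      /* where the subtitle line starts in the video */
    const char *text;       /* normalized text (lowercase, no punctuation) */
} transcript_entry;

/* Writes the indices of entries whose text contains `query` into `hits`,
   up to `max_hits` of them, and returns how many were written.
   The query is assumed to be normalized the same way as the entry text. */
size_t search_entries(const transcript_entry *entries, size_t count,
                      const char *query, size_t *hits, size_t max_hits)
{
    size_t found = 0;
    if (query[0] == '\0') {
        return 0; /* an empty query matches nothing in this sketch */
    }
    for (size_t i = 0; i < count && found < max_hits; i++) {
        if (strstr(entries[i].text, query) != NULL) {
            hits[found++] = i;
        }
    }
    return found;
}
```

Each hit index leads back to a `video_id` and `start_seconds` pair, which is all that is needed to build a timestamped link.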

How the pipeline works

  1. build_file.c defines a hard-coded source folder:

    • folder_to_create_transcript_for
  2. During build, src/prototype/prototype.meta.c:

    • scans .srt files in that folder,
    • parses subtitle entries,
    • normalizes text (lowercases it, removes punctuation, and turns hyphens into spaces),
    • writes generated data to build/entries.inc.
  3. src/prototype/main.c includes that generated file and compiles to WASM.

  4. The browser app (package/index.html + package/main.wasm) renders a custom UI and handles link open/copy actions.
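The parse and normalize steps in step 2 could look roughly like the sketch below. It is a minimal illustration, not the code in src/prototype/prototype.meta.c: `srt_start_seconds` and `normalize_text` are hypothetical names, and the handling of apostrophes and non-ASCII bytes is an assumption.

```c
#include <ctype.h>
#include <stdio.h>

/* Parses the start time of an SRT timing line such as
   "00:01:02,500 --> 00:01:05,000" into whole seconds.
   Returns -1 if the line does not start with a timestamp. */
int srt_start_seconds(const char *line)
{
    int h, m, s, ms;
    if (sscanf(line, "%d:%d:%d,%d", &h, &m, &s, &ms) != 4) {
        return -1;
    }
    return h * 3600 + m * 60 + s; /* milliseconds are dropped */
}

/* Normalizes text in place: letters become lowercase, '-' becomes a space,
   and all other ASCII punctuation is removed. */
void normalize_text(char *text)
{
    char *out = text;
    for (const char *in = text; *in != '\0'; in++) {
        unsigned char c = (unsigned char)*in;
        if (c == '-') {
            *out++ = ' ';
        } else if (ispunct(c)) {
            continue; /* drop punctuation */
        } else {
            *out++ = (char)tolower(c);
        }
    }
    *out = '\0';
}
```

Normalizing at build time means the search query only has to be normalized the same way once, and matching can then be a plain byte comparison.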

Transcript filename format

To build correct YouTube links, each filename is expected to end with the 11-character YouTube video ID, enclosed by one delimiter character on each side (commonly square brackets) and placed just before the file extension.

Examples:

  • My Video Title [dQw4w9WgXcQ].en.srt
  • My Video Title [dQw4w9WgXcQ].srt

The app extracts the video ID from the ending token and creates links like:

  • https://youtu.be/<id>?feature=shared&t=<seconds>
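One way to extract the ID and build the link is sketched below. The function names (`extract_video_id`, `make_youtube_link`) are hypothetical, and the matching rule is an assumption: the sketch tries each '.' from right to left and accepts the first one preceded by a delimiter, 11 ID characters (letters, digits, '-' or '_'), and another delimiter. The project's own extraction logic may be simpler or stricter.

```c
#include <ctype.h>
#include <stdio.h>
#include <string.h>

#define YOUTUBE_ID_LEN 11

static int is_id_char(char c)
{
    return isalnum((unsigned char)c) || c == '-' || c == '_';
}

/* Extracts the video ID from a name like "Title [dQw4w9WgXcQ].en.srt".
   `out` must hold at least YOUTUBE_ID_LEN + 1 bytes.
   Returns 1 on success, 0 if no wrapped ID is found. */
int extract_video_id(const char *filename, char *out)
{
    size_t len = strlen(filename);
    for (size_t dot = len; dot-- > 0;) {
        if (filename[dot] != '.' || dot < YOUTUBE_ID_LEN + 2) {
            continue;
        }
        /* Candidate token: delimiter + 11 ID characters + delimiter. */
        const char *token = filename + dot - (YOUTUBE_ID_LEN + 2);
        if (is_id_char(token[0]) || is_id_char(token[YOUTUBE_ID_LEN + 1])) {
            continue; /* the wrapping characters must not look like ID characters */
        }
        int ok = 1;
        for (int i = 1; i <= YOUTUBE_ID_LEN; i++) {
            if (!is_id_char(token[i])) { ok = 0; break; }
        }
        if (!ok) {
            continue;
        }
        memcpy(out, token + 1, YOUTUBE_ID_LEN);
        out[YOUTUBE_ID_LEN] = '\0';
        return 1;
    }
    return 0;
}

/* Formats a timestamped link; returns the snprintf result. */
int make_youtube_link(char *buf, size_t size, const char *id, int seconds)
{
    return snprintf(buf, size, "https://youtu.be/%s?feature=shared&t=%d",
                    id, seconds);
}
```

Scanning from the right lets the same code handle both the `.srt` and `.en.srt` naming variants without hard-coding a list of language suffixes.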

Build

Prerequisites

  • clang (or gcc/MSVC depending on your platform)
  • Python 3 (for local static server)

Linux

./build.sh

Windows

build.bat

Build output of interest:

  • build/entries.inc (generated transcript index)
  • package/index.html
  • package/main.wasm

Run locally

From package/:

python3 -m http.server 8080

Then open:

  • http://localhost:8080

Windows helper script:

  • package/run_server.bat

Configuration notes

  • Update transcript source folder in build_file.c before building.
  • The current build file also contains hard-coded deploy commands (ssh/scp) in build_prototype_wasm_target; remove or update those for your own environment.

Project status

This is an older personal project with a custom C build/codegen stack. Expect rough edges, but the core idea works: local transcript ingestion, fast search, and one-click timestamped YouTube links.
