GitHub Repo
A speech-to-text system for Vietnamese, built by fine-tuning OpenAI’s Whisper model on a custom speech corpus.
This project builds on part 1, which created a custom Vietnamese corpus using the Montreal Forced Aligner (view part 1 of the project here).
For a full description of the project, please see the report.