

Whisper is a general-purpose speech recognition model. It is trained on a large dataset of diverse audio and is also a multi-task model that can perform multilingual speech recognition as well as speech translation and language identification.

Live Captions is an application that provides live captioning for the Linux desktop.

To download only the English subtitles with yt-dlp and convert them to SRT, in one line and simplified:

    yt-dlp --write-auto-sub --write-sub --sub-lang en --convert-subs srt --skip-download vidURLorID

If the conversion didn't work, convert with FFmpeg:

    ffmpeg -i myTitle.en.vtt output.srt
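For the fully local route, here is a minimal sketch of running Whisper from the shell so that nothing has to be uploaded. It assumes the openai-whisper command-line tool is installed and that FFmpeg is on the PATH. The function name whisper_srt, the DRY_RUN switch and the choice of the "small" model are illustrative assumptions, not something taken from the posts here.

```shell
#!/bin/sh
# Sketch: transcribe a video or audio file to .srt with the openai-whisper CLI.
# whisper_srt, DRY_RUN and the model "small" are hypothetical choices.
whisper_srt() {
    input=$1
    lang=${2:-en}                  # spoken language of the recording
    outdir=$(dirname "$input")     # write the .srt next to the input file

    # Build the command line as the function's positional parameters.
    set -- whisper "$input" --model small --language "$lang" \
        --task transcribe --output_format srt --output_dir "$outdir"

    if [ "${DRY_RUN:-0}" = 1 ]; then
        printf '%s\n' "$*"         # only print the command that would run
    else
        "$@"                       # run Whisper (slow without a GPU)
    fi
}
```

Calling whisper_srt ~/Videos/talk.mp4 en should leave a timed subtitle file named after the input in the same directory. A larger model generally gives better text at the cost of speed, so the model name is the first thing to adjust.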
The quest continues to find a program that uses CMU Sphinx for rudimentary speech-to-text (which would set the correct timings as well), as YouTube already does.

For those who accept having to temporarily upload the video to YouTube (selecting the video language is mandatory) to get its subtitles (closed captions, lyrics), it is possible to extract/download them with youtube-dl or yt-dlp:

    yt-dlp --write-auto-sub --sub-lang en,de,es --convert-subs srt -o "~/%(uploader)s/%(playlist)s/%(playlist_index)s - %(title)s.%(ext)s" --skip-download --ignore-errors vidURLorID

The options used:

    --write-auto-sub       Write automatically generated subtitle file (YouTube only)
    --sub-lang en,de,es    Languages of the subtitles to download (optional), separated by commas; use --list-subs for available language tags
    --convert-subs srt     Convert the subtitles to another format (currently supported: srt|ass|vtt|lrc)
    -o "..."               Output template
    --skip-download        Do not download the video
    --ignore-errors        Continue on download errors, for example to skip unavailable videos in a playlist
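The download step can be wrapped in a small shell function, sketched here under some assumptions: yt-dlp and FFmpeg are installed, subtitles land in the current directory, and the names fetch_subs and DRY_RUN are hypothetical helpers introduced only for illustration. It reuses the flags from the posts and adds the FFmpeg fallback for any .vtt file that was not converted.

```shell
#!/bin/sh
# Sketch: fetch subtitles only (no video) and make sure they end up as .srt.
# fetch_subs and DRY_RUN are hypothetical names used for illustration.
fetch_subs() {
    url=$1
    langs=${2:-en}    # comma-separated language tags, see --list-subs

    # Build the command line as the function's positional parameters.
    set -- yt-dlp --write-auto-sub --write-sub --sub-lang "$langs" \
        --convert-subs srt --skip-download --ignore-errors "$url"

    if [ "${DRY_RUN:-0}" = 1 ]; then
        printf '%s\n' "$*"    # only print the command that would run
        return 0
    fi
    "$@"

    # Fallback: convert any leftover .vtt files with FFmpeg.
    for vtt in ./*.vtt; do
        [ -e "$vtt" ] || continue               # the glob matched nothing
        [ -e "${vtt%.vtt}.srt" ] && continue    # already converted
        ffmpeg -i "$vtt" "${vtt%.vtt}.srt"
    done
}
```

Running it with DRY_RUN=1 set first prints the yt-dlp command without contacting YouTube, which is a cheap way to check the flags before using it on a whole playlist.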
It is possible to use CMU Sphinx with a subtitle program according to this post. In addition, one web-based subtitle tool is aware of this CMU Sphinx feature; however, there is no reference in its latest source code showing that CMU Sphinx support was actually added.

Creating subtitles manually requires extensive effort: you need to select the start and stop for each sentence yourself. YouTube has the above features (it creates rudimentary text subtitles at the correct timings, using speech-to-text). However, I would rather not upload the videos to YouTube just to get my subtitles. Is it possible to do the subtitles efficiently on Ubuntu? I need .srt subtitles only, and do not need to hard-code them on the videos. My biggest requirement is to have the program automatically find the start/stop for each sentence, so that I can write the text in it.

Update #2: There is speech-to-text software for Linux, with the CMU Sphinx package.
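For reference, the .srt format asked for in the question is plain text: a running counter, a line with the start and stop time in HH:MM:SS,mmm form separated by -->, the caption text, and a blank line. The sketch below prints one such cue from start/stop times given in seconds. The helper names srt_time and srt_cue are hypothetical, and awk is assumed to be available.

```shell
#!/bin/sh
# Sketch: build one .srt cue from start/stop times in seconds.
# srt_time and srt_cue are hypothetical helper names.

# Convert seconds (may be fractional) to the SRT timestamp HH:MM:SS,mmm.
srt_time() {
    awk -v t="$1" 'BEGIN {
        ms = int(t * 1000 + 0.5)    # round to whole milliseconds
        printf "%02d:%02d:%02d,%03d\n",
            int(ms / 3600000), int(ms % 3600000 / 60000),
            int(ms % 60000 / 1000), ms % 1000
    }'
}

# Print one cue: index, "start --> stop", text, blank line.
srt_cue() {
    printf '%s\n%s --> %s\n%s\n\n' \
        "$1" "$(srt_time "$2")" "$(srt_time "$3")" "$4"
}
```

For example, srt_cue 1 2.5 4 "Hello there" prints a cue running from 00:00:02,500 to 00:00:04,000. A speech-to-text tool has to produce exactly these start/stop pairs for each sentence, which is the part that is tedious to do by hand.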
