With each speaker separately using new in-house programs!
How to create subtitles, transcripts and translations mostly automatically in a few simple steps:
- Get the link to the files: Whaletank_7-10-2017_Multichannel_Recordings
- Extract your favourite project
- Convert them to a video format by using the ffmpeg code below
- Find the mkv versions ready to be uploaded to YouTube.
- Upload the created video files (MKV format) you want to your YouTube channel by dragging and dropping them all at once if you wish to do so.
- Wait a while for the subtitles to generate automatically
- Edit your subtitles on YouTube
- Download the srt from YouTube
- Merge the subtitles and create a transcript in 1 step! With a new java tool called srt2vtt developed by @AlexPMorris just for our purpose!
Steps broken down:
Here are all the files (of Whaletank talk number 227) in one zipfile:
Whaletank_7-10-2017_Multichannel_Recordings
Record in multichannel
When you record, choose "Multichannel" as you see in the image.
Convert the Mumble recordings to a video format
Here's the code to convert the audio files to videos, which can be batch uploaded to YouTube by just drag and drop. You have to be in the directory with the audio files and set the img variable to an image you'd like to use for the video. It will look like this.
Set the img variable:
img=SomeImage.png
Best is if the image is small, the smaller the better, because the encoding will be much faster.
I've updated the code slightly so that it won't complain about images as much.
for file in *.ogg; do ffmpeg -loop 1 -r 2 -i "$img" -i "$file" -vf "scale=trunc(iw/2)*2:trunc(ih/2)*2" -c:v libx264 -preset slow -tune stillimage -crf 18 -c:a copy -shortest -pix_fmt yuv420p -threads 0 "$file".mkv; done
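Before uploading, it can help to check that every recording actually got a video. The following is only a minimal sketch, assuming the naming produced by the loop above (for example talk.ogg becomes talk.ogg.mkv); the function name list_unconverted is made up for illustration.

```shell
# List every .ogg in the current directory that has no usable .mkv yet.
# Assumption: videos are named as in the loop above ("talk.ogg" -> "talk.ogg.mkv").
list_unconverted() {
  for file in *.ogg; do
    # Skip the literal pattern when the directory holds no .ogg files.
    [ -e "$file" ] || continue
    # -s is true only if the video exists and is not empty.
    [ -s "$file.mkv" ] || printf '%s\n' "$file"
  done
}
```

Run list_unconverted in the directory with the audio files. If it prints nothing, every recording has a non-empty video next to it.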
Edit your subtitles
Edit them on YouTube by adding punctuation and capitalization, and by removing everything which doesn't make sense, like repeated words and filler words ("so", "you know", "like", etc.). Also correct the words YouTube interpreted wrongly, like "bit chairs" instead of "BitShares".
Download the subtitles when you are done.
Mix the audiotracks by using
ffmpeg -i input1.ogg -i input2.ogg -i input3.ogg -i input4.ogg -i input5.ogg -filter_complex "[0:a][1:a][2:a][3:a][4:a] amix=inputs=5:duration=longest[aout]" -map "[aout]" -ac 2 -c:a libvorbis -b:a 128k output.ogg
and adjust the code to the number of input audiotracks you have, like so:
[0:a][1:a][2:a][3:a][4:a][5:a]
for six (6) audio tracks instead of 5. Also set
amix=inputs=6
The rest stays the same.
Merge the subtitles
We can now use the java program from @AlexPMorris called srt2vtt. You can take the individual subtitle files and the program will insert the speaker names automatically. Isn't that neat? Thus we eliminate one step. The syntax is:
java sr2vtt merge -i input1 -i input2 merged-output-subtitle-file.vtt output=transcript.htm
You have to repeat it for all subtitle files, adding each speaker's subtitles to the merged ones. This means using the merged-output-subtitle-file.vtt as the input like this:
java sr2vtt merge -i merged-output-subtitle-file.vtt -i input3 merged-output-subtitle-file.vtt
You might want to add output=transcript.htm as the last part. The program will create hyperlinks from the speaker names which will take you to the speakers' Steemit pages.
Upload the merged subtitle to YouTube
Choose "Upload a file" and "Publish".
Now you're done with the subtitles and the transcriptions!
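With many speakers, typing the repeated merge step by hand gets tedious. Here is a rough sketch that only prints the merge commands in the order described above, so you can review them before running anything. It assumes the tool is invoked exactly as in the syntax shown earlier; the function name print_merge_commands is made up, and file names containing spaces are not handled.

```shell
# Print the chain of merge commands for two or more subtitle files.
# Assumption: the command syntax is copied from the text above, nothing is executed here.
print_merge_commands() {
  merged="merged-output-subtitle-file.vtt"
  if [ "$#" -lt 2 ]; then
    echo "need at least two subtitle files" >&2
    return 1
  fi
  first=$1
  second=$2
  shift 2
  # The first merge combines the first two speakers.
  printf 'java sr2vtt merge -i %s -i %s %s\n' "$first" "$second" "$merged"
  # Every further speaker is merged into the already merged file.
  for next in "$@"; do
    printf 'java sr2vtt merge -i %s -i %s %s\n' "$merged" "$next" "$merged"
  done
}
```

For example, print_merge_commands speaker1.srt speaker2.srt speaker3.srt prints two commands. If you want the transcript, append output=transcript.htm to the last one as described above.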
Subtitle Edit
This program http://www.nikse.dk/SubtitleEdit/ for Windows is amazing. I have a Mac, so I tried it on a Windows computer. When I loaded an anime movie to test it, it automatically fetched Japanese subtitles (which I never had on my hard drive because I have no use for them), and after a few seconds I saw them being replaced by English ones! I was very surprised. I knew it could use Google Translate, but I didn't know it would fetch other languages, which I presume it somehow deduced from the title. Amazing. And this program is open source and free.
This program could be handy if your PC is too slow with the YouTube webapp like @chuckyfucky mentioned. But he hasn't tested it yet.
There is an online version as well here: http://www.nikse.dk/SubtitleEdit/Online#
If it only could edit the audio as well...
Yes, if we could edit audio as we edit the subtitle text, that would be a great help and a great product for podcasters. Having subtitles or a transcript for your video or podcast is not only much more engaging for the listeners, it will also be indexed by search engines. This means people could search for certain terms and the search engine could find them in the transcript or subtitles!
Listeners will also be able to read along and thus absorb the information more effectively, as any learning pyramid image will show. The engagement has been measured to be 58% higher.
Other cool tricks
translation
Subtitle Edit can translate a subtitle by using Google translate or Multi Translator (only Swedish to Danish).
If you and a few friends want to translate the same subtitle at the same time over the internet, then do try the "Networking" feature
spell check
advanced replace
In Edit -> Multiple replace you can create your own rules for fixing a subtitle - even advanced rules using regular expressions!
When writing regular expressions grouping and backreferences are very useful. Parts of a regular expression inside parentheses are groups and can be referenced in the replace string where $1 is a reference to the first group and $2 is a reference to the second and so on.
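To get a feel for groups and backreferences outside Subtitle Edit, here is a small sketch using sed on a subtitle line. It removes a directly repeated word such as "the the", which is one of the cleanups mentioned earlier. Assumptions: GNU sed is installed (for -E and \b), the function name dedupe_words is made up, and sed writes \1 where the Subtitle Edit replace string writes $1.

```shell
# Collapse a word that is immediately repeated ("we we" -> "we").
# ([[:alpha:]]+) is group 1; the \1 in the pattern must match the same word again,
# and the \1 in the replacement keeps a single copy of it.
dedupe_words() {
  sed -E 's/\b([[:alpha:]]+) \1\b/\1/g'
}
```

Use it as a filter, for example: printf '%s\n' 'so we we can use the the wallet' | dedupe_words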
Useful and interesting links:
Download subtitles from YouTube, even automatic captions
Amara
https://amara.org/cs/about-amara/
Amara can help to add translations to videos. Check it out!
YouTube API
Maybe we can use it in the future
https://developers.google.com/youtube/v3/docs/captions/list#try-it
https://developers.google.com/youtube/v3/docs/captions
https://developers.google.com/youtube/v3/getting-started
Why are captions important?
Google I/O 2011: The YouTube Caption API, Speech Recognition, and WebVTT captions for HTML5
Closing comments
It's very cool to be able to read along with what's being said. It's much easier to read which website or speaker is being referred to than to have to listen for it. Also you get to know people by clicking on their nickname in the transcript and perhaps follow them on Steemit.
Furthermore, with interactive transcripts you could jump to any section which interests you. For now we can only add timestamps, which you can use to jump to a particular section. No need to watch the whole video or listen to a podcast which might not be so interesting at the end.
It might not be obvious to most of us, but the more you know, the more time you waste on things you already know or that are even inaccurate.
Let me know if you need subtitles for your project!
- Discord: nutela#1442 or chuckyfucky#4480
- Steem: @nutela or @chuckyfucky
- Or in the comments below!
If you can upvote our project I can maybe reach more people and get funding for people who help out and even translate this stuff. Thank you.
And if you want you can help us out and earn too! #steemjobs