Back in the day my wife was paid to transcribe audio. She did it well and was paid appropriately.
Now there are AI tools that can do it almost as well, and certainly quicker.
I would say my wife is better at understanding my accent though!
Later, she headed a team of ladies who would edit transcriptions created overseas. This sped things up and reduced costs. The outsourcing company used software that was cutting edge at the time and charged us per minute of audio. Their output was so-so even after their own team hand-edited it (at extra cost), so our North American team was needed to make the transcripts fit for paying customers.
What is amazing to me is that one of the best transcription platforms is absolutely free.
Free as in speech AND beer.
And it is better out of the box than the paid company.
Introducing Whisper
The tool is called Whisper and it comes from OpenAI. You can read more about it in their blog post, and the code is on GitHub.
It doesn't require fancy hardware, but in my experience it will make use of whatever you have. Performance was OK on an old Intel laptop and amazing on my M1 Mac mini.
Not only does it transcribe audio efficiently and accurately, it can also translate. I am not multilingual so I can't tell you how well that aspect compares, but it seems to do a good job.
For developers there are C++ and Python libraries, with wrappers for other languages such as Rust built around those.
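As a rough sketch of the Python route, assuming Whisper and ffmpeg are already installed (the install steps follow), transcribing a single file might look something like this. The "base" model name and the helper names transcript_path and transcribe_file are just my picks for illustration, not anything Whisper requires.

```python
from pathlib import Path


def transcript_path(audio_path):
    """Return where the transcript goes: same folder and name, .txt extension."""
    return Path(audio_path).with_suffix(".txt")


def transcribe_file(audio_path, model_name="base"):
    """Transcribe one English audio file and save the text next to it."""
    # Imported inside the function so this file loads even before Whisper is installed.
    import whisper

    # "base" is an assumed choice; the model is downloaded on first use.
    model = whisper.load_model(model_name)

    # fp16=False mirrors the --fp16 False flag and avoids a warning on CPU-only machines.
    result = model.transcribe(str(audio_path), language="en", fp16=False)

    out_path = transcript_path(audio_path)
    out_path.write_text(result["text"].strip() + "\n", encoding="utf-8")
    return out_path
```

Calling transcribe_file("audio.mp3") should leave an audio.txt beside the original file and return its path.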
Install whisper and its dependencies.
pip3 install git+https://github.com/openai/whisper.git
Update whisper.
pip3 install --upgrade --no-deps --force-reinstall git+https://github.com/openai/whisper.git
Install ffmpeg:
Mac
brew install ffmpeg
Linux (Debian/Ubuntu)
sudo apt install ffmpeg
Windows (using Scoop)
scoop install ffmpeg
Using Whisper
Simple English transcript:
whisper audio.mp3 --language English
There are many more parameters, so try --help. The main one is --output_format, because if you do not specify it Whisper writes every format (I mainly use txt).
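If you end up calling the command from a script, it can help to pin the output format in one place. Here is a small sketch of that idea. The function names are hypothetical, and the list of formats is an assumption based on what whisper --help lists at the time of writing, so check it against your installed version.

```python
import subprocess

# Assumed list of accepted formats; verify with `whisper --help` on your version.
OUTPUT_FORMATS = ("txt", "vtt", "srt", "tsv", "json", "all")


def whisper_command(audio_path, output_format="txt", language="English"):
    """Build the argument list for one whisper run with a single output format."""
    if output_format not in OUTPUT_FORMATS:
        raise ValueError(f"unknown output format: {output_format}")
    return [
        "whisper", str(audio_path),
        "--language", language,
        "--output_format", output_format,
    ]


def run_whisper(audio_path, output_format="txt"):
    """Run the command; needs whisper and ffmpeg on the PATH."""
    subprocess.run(whisper_command(audio_path, output_format), check=True)
```

Passing the arguments as a list rather than one long string means file names with spaces do not need any extra quoting.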
Transcribe every video in the folder (add --task translate to get an English translation instead):
for f in *.mp4; do whisper "$f" --language English --output_format txt --verbose False --fp16 False; done
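The same folder job can be sketched with the Python library, which has the side benefit of loading the model once instead of once per file. As far as I know, the task option is what switches between transcribing and translating into English (the command line equivalent is --task translate). The helper names, the "base" model and the .en.txt naming are assumptions made up for this example.

```python
from pathlib import Path

VIDEO_PATTERN = "*.mp4"


def output_path(video_path, task):
    """Name the text file after the video, marking translations with .en.txt."""
    video_path = Path(video_path)
    suffix = ".en.txt" if task == "translate" else ".txt"
    return video_path.with_name(video_path.stem + suffix)


def process_folder(folder=".", task="transcribe", model_name="base"):
    """Run Whisper over every .mp4 in the folder, loading the model only once."""
    # Imported here so the helper above works without Whisper installed.
    import whisper

    model = whisper.load_model(model_name)
    written = []
    for video in sorted(Path(folder).glob(VIDEO_PATTERN)):
        # task="translate" asks for English text whatever language is spoken;
        # task="transcribe" keeps the spoken language.
        result = model.transcribe(str(video), task=task, fp16=False)
        out = output_path(video, task)
        out.write_text(result["text"].strip() + "\n", encoding="utf-8")
        written.append(out)
    return written
```

Something like process_folder("videos", task="translate") would then write one .en.txt file per video in that folder.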
Not a developer?
Check out MacWhisper, which is a macOS GUI for the free tool. There are both free and paid options, with a lifetime license.