Using AI to transcribe audio and batch rename files using Python 3 on macOS

I have a ton of audio files on my computer from some old video games. The idea is to use some of them as samples in electronic music, but they all have generic names that give no idea of what's in the file. I could have gone through them one by one to find the ones I wanted, or even renamed them manually, but with 2000+ mp3 files to get through that would have been a long and arduous task. I found some free and paid apps that transcribe audio, but none of them really fit the bill: they weren't going to go through the files automatically for me, so they weren't worth my time. What's funny is that a lot of the apps and websites that charge you for AI are really free AI models dressed up fancy; the thing underneath that makes them work is free, open source, and fairly simple to get started with.

The place where I got these audio samples is The Sounds Resource, which is really good for all sorts of video game audio. Keep in mind that many of these sounds may involve copyright issues, since they are taken from commercial video games. Another great source of audio is Freesound, where you can search specifically for Creative Commons Zero ("CC0") licensed files, which need no attribution and are free to use in any way you please.

The script is fairly basic but needs a few dependencies installed first. To begin, I had to install Homebrew, a package manager for Mac and Linux.

Step 1: Install Homebrew (the official way)

Open Terminal and paste exactly this:

/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"

Press Return, then:

Enter your macOS password when prompted

Let it run. It takes a minute or two.

Step 2: Add Homebrew to your PATH (this part matters)

When the installer finishes, run the commands below that match your type of Mac:

On Apple Silicon (M1 / M2 / M3)

echo 'eval "$(/opt/homebrew/bin/brew shellenv)"' >> ~/.zprofile
eval "$(/opt/homebrew/bin/brew shellenv)"

On Intel Macs

echo 'eval "$(/usr/local/bin/brew shellenv)"' >> ~/.zprofile
eval "$(/usr/local/bin/brew shellenv)"

If you’re unsure which Mac you have:

uname -m

arm64 → Apple Silicon

x86_64 → Intel
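If you would rather check from Python, here is a small sketch of the same decision. The function name brew_shellenv_line is made up for illustration; the two prefixes are simply the ones from the commands above, and platform.machine() is assumed to report the same kind of string that uname -m prints.

```python
import platform

# Homebrew prefixes taken from the Step 2 commands above
BREW_PREFIX = {"arm64": "/opt/homebrew", "x86_64": "/usr/local"}

def brew_shellenv_line(machine=None):
    """Return the line to add to ~/.zprofile for the given architecture.

    Hypothetical helper: with no argument it asks Python which
    architecture it is running on.
    """
    machine = machine or platform.machine()
    prefix = BREW_PREFIX.get(machine)
    if prefix is None:
        raise ValueError(f"Unrecognised architecture: {machine}")
    return f'eval "$({prefix}/bin/brew shellenv)"'

# Show the line for each kind of Mac
print(brew_shellenv_line("arm64"))
print(brew_shellenv_line("x86_64"))
```

Calling brew_shellenv_line() with no argument on your own Mac should give the line that matches it.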

Step 3: Verify

brew --version

If you see a version number, the ritual succeeded. ✅

Step 4: Continue with audio renaming setup

Now you can run:

brew install ffmpeg
pip3 install openai-whisper

Verify Whisper exists:

whisper --help

If that prints options, you’re locked in.
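The same check can be done from Python, which is handy if you want the scripts later on to fail early with a clear message. This is only a sketch: missing_tools is a made-up helper name, and it only looks for the two command names used in this step on your PATH.

```python
import shutil

def missing_tools(tools=("ffmpeg", "whisper")):
    """Return the names in `tools` that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]

missing = missing_tools()
if missing:
    print("Not found on PATH:", ", ".join(missing))
else:
    print("ffmpeg and whisper are both on PATH")
```

If anything is reported missing, open a new Terminal window first (so the PATH change from Step 2 is picked up) before reinstalling.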

Step 5: Prep your audio folder

Create a folder anywhere, for example:

~/AudioToRename

Put copies of your audio files inside (.wav, .mp3, .m4a and .aiff are all fine; the batch script in Step 6 also picks up .flac, .ogg and .caf).

Step 6: Create the renaming script

From Terminal, use the cd command to navigate to the folder you created. Once you are there use nano to create the first script:

nano transcribe_one.py

This will open the nano text editor. Paste in the following script:

import whisper
import sys
import os
import re

# Ensure a file path is provided
if len(sys.argv) < 2:
    print("Usage: python3 transcribe_one.py <audio_file_path>")
    sys.exit(1)

audio_path = sys.argv[1]

if not os.path.exists(audio_path):
    print(f"File not found: {audio_path}")
    sys.exit(1)

print("Worker started")
print("Audio:", audio_path)

model = whisper.load_model("base")

def safe_name(text, max_len=60):
    text = text.lower()
    text = re.sub(r'[^a-z0-9\s-]', '', text)
    text = re.sub(r'\s+', '_', text).strip('_')
    return text[:max_len]

# Transcribe
result = model.transcribe(audio_path, fp16=False)
text = result["text"].strip()
print("Transcript:", text[:80])

if not text:
    print("No speech detected")
    sys.exit(0)

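first_sentence = text.split(".")[0]
new_base = safe_name(first_sentence)

# Skip the rename if cleaning left nothing usable (e.g. only punctuation or non-English text)
if not new_base:
    print("Transcript produced no usable filename, leaving file as is")
    sys.exit(0)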

folder, old_name = os.path.split(audio_path)
ext = os.path.splitext(old_name)[1]

# Handle duplicates
new_path = os.path.join(folder, new_base + ext)
counter = 1
while os.path.exists(new_path):
    new_path = os.path.join(folder, f"{new_base}-{counter}{ext}")
    counter += 1

os.rename(audio_path, new_path)
print("Renamed successfully to:", os.path.basename(new_path))

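For the model we can choose different options by changing the name passed to whisper.load_model:
⚡ Fastest (rough but usable)
model = whisper.load_model("tiny")

🎯 Best balance
model = whisper.load_model("base")

🧠 More accurate (slower, bigger download)
model = whisper.load_model("small")

The script uses base, which is a download of about 140 MB and seemed to be sufficient for me. If you want a different one, change the model = whisper.load_model("base") line to one of the other two options. Whisper also has larger "medium" and "large" models if you need more accuracy and have the time.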
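If you find yourself switching models a lot, one option is to read the model name from the command line instead of editing the script each time. This is a sketch, not part of the script above: pick_model and KNOWN_MODELS are names I made up, and the optional second argument is an assumption about how you would want to call it.

```python
# Model names accepted by this sketch: the three options above plus the larger ones
KNOWN_MODELS = ("tiny", "base", "small", "medium", "large")

def pick_model(argv, default="base"):
    """Return the model name given as an optional second argument.

    Example call: python3 transcribe_one.py clip.mp3 small
    """
    name = argv[2] if len(argv) > 2 else default
    if name not in KNOWN_MODELS:
        raise ValueError(f"Unknown model '{name}', expected one of {KNOWN_MODELS}")
    return name

# Stand-in argument lists; in the real script you would pass sys.argv
print(pick_model(["transcribe_one.py", "clip.mp3"]))           # base
print(pick_model(["transcribe_one.py", "clip.mp3", "small"]))  # small
```

In transcribe_one.py the load line would then become model = whisper.load_model(pick_model(sys.argv)). Note that batch_rename.py only passes the file path, so it would need to pass the model name along as well.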

Save:

Ctrl + O, Enter
Ctrl + X, to Quit Nano
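To get a feel for the filenames the script will produce without running Whisper at all, the naming logic can be tried on its own. The sketch below copies safe_name from the script and wraps the duplicate handling in a helper I have called unique_path (the script does this inline); the transcript is a made-up example.

```python
import os
import re
import tempfile

def safe_name(text, max_len=60):
    # Same cleaning rules as transcribe_one.py
    text = text.lower()
    text = re.sub(r'[^a-z0-9\s-]', '', text)
    text = re.sub(r'\s+', '_', text).strip('_')
    return text[:max_len]

def unique_path(folder, base, ext):
    # Same duplicate handling as transcribe_one.py: base.ext, base-1.ext, base-2.ext, ...
    path = os.path.join(folder, base + ext)
    counter = 1
    while os.path.exists(path):
        path = os.path.join(folder, f"{base}-{counter}{ext}")
        counter += 1
    return path

transcript = "Welcome back, agent. Your mission starts now."  # made-up example
base = safe_name(transcript.split(".")[0])
print(base)  # welcome_back_agent

# Pretend a file with that name already exists, in a throwaway folder
with tempfile.TemporaryDirectory() as folder:
    open(os.path.join(folder, base + ".mp3"), "w").close()
    print(os.path.basename(unique_path(folder, base, ".mp3")))  # welcome_back_agent-1.mp3
```

Only lowercase letters, digits, hyphens and underscores survive, so a transcript made up entirely of punctuation or non-English characters cleans down to an empty name.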

The second script:

nano batch_rename.py

Paste the following into nano:

import os
import subprocess
import sys

# Use the folder where the script is located
FOLDER = os.path.dirname(os.path.abspath(__file__))

# List audio files
audio_files = [
    f for f in os.listdir(FOLDER)
    if f.lower().endswith((".wav", ".mp3", ".m4a", ".aiff", ".flac", ".ogg", ".caf"))
]

print(f"Found {len(audio_files)} audio files in {FOLDER}")

# Process files one by one
for i, file in enumerate(audio_files, 1):
    path = os.path.join(FOLDER, file)
    print(f"\n[{i}/{len(audio_files)}] Processing {file}")

    result = subprocess.run(
        [sys.executable, os.path.join(FOLDER, "transcribe_one.py"), path],
        text=True,
        cwd=FOLDER
    )

    print("Exit code:", result.returncode)

Save:

Ctrl + O, Enter
Ctrl + X, to Quit Nano
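With thousands of files, the per-file "Exit code:" lines scroll past quickly. One possible tweak, sketched here under the assumption that you collect (filename, result.returncode) pairs inside the loop, is to print a short summary at the end. The summarize helper and the sample data are made up for illustration.

```python
def summarize(results):
    """Split (filename, exit_code) pairs into a success count and a list of failures."""
    failed = [name for name, code in results if code != 0]
    return len(results) - len(failed), failed

# Made-up exit codes standing in for the result.returncode values from the loop
results = [("sound_001.mp3", 0), ("sound_002.mp3", 1), ("sound_003.mp3", 0)]

ok, failed = summarize(results)
print(f"{ok} succeeded, {len(failed)} failed")
for name in failed:
    print("  failed:", name)
```

Keep in mind that transcribe_one.py exits with code 0 when it finds no speech, so those files count as successes here even though they were not renamed.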

Step 7: Run the script 📜

After saving and quitting nano a second time, you should have two Python scripts in the folder with your audio files: transcribe_one.py and batch_rename.py. To run everything, call batch_rename.py, which runs transcribe_one.py on every audio file in the folder. If two or more files end up with the same filename, transcribe_one.py adds a number to the end of the name. It only uses the first sentence of the transcript (capped at 60 characters), since filenames need to be relatively short.

python3 batch_rename.py

The first time you run the script, it will download the 'base' model (or whichever one you chose). I had an issue with this the first time and had to run the following command in the terminal:

/Applications/Python\ 3.12/Install\ Certificates.command

I think I also had to update something after the pip install, but the terminal had the right idea and told me what to do. After that I didn't have any issues and was able to get through thousands of files in the span of a few hours. The script and the transcription both run locally on your computer, so the time will vary depending on how fast your computer is.

The resulting transcription is pretty good. There are some minor errors, like Tracer Tong getting swapped to Tongue, but other than that I really can't complain. Worked like a charm.