Using Faster Whisper on Windows

0. Create a script file in a text editor and save it as “transcribe.py”. When saving, select "All Files" as the file type so that the file is not saved as .txt.

1.1. To transcribe an audio file, use the following version of “transcribe.py”. Adjust the model size (Line 7) and/or the audio file path and name (Line 4) as needed.

transcribe.py
from faster_whisper import WhisperModel

# Path to the audio file
audio_path = r"C:\Users\User\Music\Audio.ogg"

# Model size (you can change to "base", "medium", "large-v2", etc.)
model = WhisperModel("small")

# Transcribe the audio
segments, info = model.transcribe(audio_path)

# Display the detected language and the text
print("Detected language:", info.language)

for segment in segments:
    print(f"[{segment.start:.2f}s -> {segment.end:.2f}s] {segment.text}")
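
The audio script can be adapted in a few ways. The following is a minimal sketch, not part of the original script: the helper names format_timestamp, format_segment, and transcribe are hypothetical, and the device="cpu" and compute_type="int8" settings are an assumption that tends to suit machines without a GPU. It takes the audio path from the command line instead of editing the file, and prints timestamps as HH:MM:SS.mmm.

```python
import sys


def format_timestamp(seconds):
    """Convert a time in seconds to HH:MM:SS.mmm."""
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}.{ms:03d}"


def format_segment(start, end, text):
    """Build one output line for a transcribed segment."""
    return f"[{format_timestamp(start)} -> {format_timestamp(end)}] {text.strip()}"


def transcribe(audio_path, model_size="small"):
    """Return the detected language and a list of formatted lines."""
    # Imported here so the helpers above work even without faster-whisper.
    from faster_whisper import WhisperModel

    # Assumption: CPU with 8-bit quantization; adjust for your hardware.
    model = WhisperModel(model_size, device="cpu", compute_type="int8")
    segments, info = model.transcribe(audio_path)
    # segments is a generator: the transcription runs while we iterate.
    lines = [format_segment(s.start, s.end, s.text) for s in segments]
    return info.language, lines


if __name__ == "__main__" and len(sys.argv) == 2:
    language, lines = transcribe(sys.argv[1])
    print("Detected language:", language)
    for line in lines:
        print(line)
```

With this variant, the script would be run as: python transcribe.py "C:\Users\User\Music\Audio.ogg"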

1.2. To transcribe a video file, use the following version of “transcribe.py” instead. Adjust the model size (Line 37) and/or the video file path and name (Line 8) as needed. This version requires ffmpeg to extract the audio track.

transcribe.py
import os
import subprocess
from faster_whisper import WhisperModel
from pathlib import Path

# === Configuration ===
# Path to the video
video_path = r"C:\Users\User\Videos\video.mp4"

# Temporary file for the extracted audio
temp_audio_path = "temp_audio.wav"

# Current user's Desktop folder
desktop_folder = str(Path.home() / "Desktop")
output_text_file = os.path.join(desktop_folder, "transcription.txt")

# === Step 1: Extract audio from video using ffmpeg ===
print("Extracting audio from video...")
ffmpeg_command = [
    "ffmpeg", "-y",  # -y to overwrite without asking
    "-i", video_path,
    "-vn",           # No video
    "-acodec", "pcm_s16le",  # Uncompressed audio (WAV)
    "-ar", "16000",  # Sample rate
    "-ac", "1",      # Mono
    temp_audio_path
]

try:
    subprocess.run(ffmpeg_command, check=True)
except (subprocess.CalledProcessError, FileNotFoundError):
    print("Error extracting audio with ffmpeg. Check that ffmpeg is installed and on PATH.")
    raise SystemExit(1)

# === Step 2: Transcribe with Faster Whisper (you can change to "base", "medium", "large-v2", etc.) ===
print("Transcribing audio...")
model = WhisperModel("large-v2")

segments, info = model.transcribe(temp_audio_path)

# === Step 3: Save transcription ===
print("Saving transcription...")
with open(output_text_file, "w", encoding="utf-8") as f:
    f.write(f"Detected language: {info.language}\n\n")
    for segment in segments:
        f.write(f"[{segment.start:.2f}s -> {segment.end:.2f}s] {segment.text}\n")

print(f"Transcription saved to: {output_text_file}")

# === Step 4: Remove temporary audio file ===
os.remove(temp_audio_path)
print("Temporary audio file deleted.")
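
One limitation of the video script is that temp_audio.wav is left behind if the transcription step fails. The following is a minimal sketch of one way to handle this, assuming the same ffmpeg options as above; the function names build_ffmpeg_command, extract_audio, and transcribe_video are hypothetical. Because faster-whisper returns segments as a generator, the transcription only runs while the segments are being written, so the write loop sits inside the try block.

```python
import os
import shutil
import subprocess
import tempfile


def build_ffmpeg_command(video_path, audio_path):
    """Return the ffmpeg call that extracts 16 kHz mono WAV audio."""
    return [
        "ffmpeg", "-y",          # overwrite without asking
        "-i", video_path,
        "-vn",                   # no video
        "-acodec", "pcm_s16le",  # uncompressed audio (WAV)
        "-ar", "16000",          # sample rate
        "-ac", "1",              # mono
        audio_path,
    ]


def extract_audio(video_path):
    """Extract audio to a temporary WAV file and return its path."""
    if shutil.which("ffmpeg") is None:
        raise RuntimeError("ffmpeg was not found on PATH")
    fd, audio_path = tempfile.mkstemp(suffix=".wav")
    os.close(fd)  # ffmpeg writes the file itself
    try:
        subprocess.run(build_ffmpeg_command(video_path, audio_path), check=True)
    except Exception:
        os.remove(audio_path)
        raise
    return audio_path


def transcribe_video(video_path, output_path, model_size="large-v2"):
    """Transcribe a video and always delete the temporary audio."""
    from faster_whisper import WhisperModel

    audio_path = extract_audio(video_path)
    try:
        model = WhisperModel(model_size)
        segments, info = model.transcribe(audio_path)
        with open(output_path, "w", encoding="utf-8") as f:
            f.write(f"Detected language: {info.language}\n\n")
            for s in segments:  # transcription happens during this loop
                f.write(f"[{s.start:.2f}s -> {s.end:.2f}s] {s.text}\n")
    finally:
        os.remove(audio_path)
```

Using tempfile.mkstemp also avoids overwriting an unrelated temp_audio.wav that may already exist in the script folder.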

2. Open a terminal in the script folder

I. Open File Explorer and navigate to the folder where you saved transcribe.py.

II. Right-click on a blank space inside the folder (not on a file).

III. Select “Open in Terminal”. On older versions of Windows, hold Shift while right-clicking and select “Open PowerShell window here”.

3. Run the script

python transcribe.py
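
If the command fails immediately, a missing prerequisite is a likely cause. The following is a minimal sketch of a check that could be saved as a separate file and run the same way; the function name check_prerequisites is hypothetical. It assumes the faster-whisper package (typically installed with pip install faster-whisper) and, for the video script, ffmpeg on PATH.

```python
import importlib.util
import shutil


def check_prerequisites(module_name="faster_whisper", tools=("ffmpeg",)):
    """Return a list of human-readable problems; empty if all is well."""
    problems = []
    # find_spec returns None when a top-level package is not installed.
    if importlib.util.find_spec(module_name) is None:
        problems.append(f"Python package '{module_name}' is not installed")
    # shutil.which returns None when a program is not on PATH.
    for tool in tools:
        if shutil.which(tool) is None:
            problems.append(f"'{tool}' was not found on PATH")
    return problems


if __name__ == "__main__":
    found = check_prerequisites()
    if found:
        for problem in found:
            print("Problem:", problem)
    else:
        print("All prerequisites found.")
```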

4. Output

  • With the audio script (1.1), the transcription appears directly in the terminal.

  • With the video script (1.2), a file named transcription.txt is saved on your Desktop.
