Installation

As a CLI tool

Requires ffmpeg on your PATH.

pip install speech-mine

Or with pipx to install into an isolated environment:

pipx install speech-mine

Or with uv:

uv tool install speech-mine

As a library dependency

pip:

pip install speech-mine

uv:

uv add speech-mine

pyproject.toml:

dependencies = [
    "speech-mine",
]

HuggingFace Token

The extract module requires a HuggingFace token to download pyannote models:

  1. Create an account at huggingface.co
  2. Go to Settings → Access Tokens → New token (read permissions)
  3. Accept the user agreement on the pyannote/speaker-diarization-3.1 model page

Pass the token via --hf-token YOUR_TOKEN on every extract call.
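
To avoid typing the token on each call, one option is to keep it in an environment variable and build the flag from it. This is only a sketch: the variable name `HF_TOKEN` and the helper `hf_token_args` are assumptions for illustration and are not defined by speech-mine. Only the `--hf-token` flag comes from the text above.

```python
import os


def hf_token_args(env=None, var="HF_TOKEN"):
    """Build the --hf-token arguments from an environment variable.

    Hypothetical helper; HF_TOKEN is an assumed variable name.
    """
    if env is None:
        env = os.environ
    # Treat a missing or whitespace-only value as "no token set".
    token = env.get(var, "").strip()
    if not token:
        raise RuntimeError(f"Set {var} to your HuggingFace access token")
    return ["--hf-token", token]
```

The returned list can be appended to the arguments of an extract invocation, for example when launching the CLI with `subprocess.run`.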

Requirements

  • Python 3.11+
  • ffmpeg installed and on PATH
  • GPU recommended for faster processing (not required)
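
The first two requirements can be verified before installing with a short preflight check. The sketch below uses only the Python standard library. The function name `check_requirements` is hypothetical and not part of speech-mine.

```python
import shutil
import sys

MIN_PYTHON = (3, 11)  # minimum version listed in the requirements above


def check_requirements(version_info=None, which=shutil.which):
    """Return a list of human-readable problems; empty means all checks passed.

    Hypothetical helper, not part of speech-mine.
    """
    if version_info is None:
        version_info = sys.version_info
    problems = []
    # Compare only (major, minor) against the minimum version.
    if tuple(version_info[:2]) < MIN_PYTHON:
        problems.append(
            f"Python {MIN_PYTHON[0]}.{MIN_PYTHON[1]}+ required, "
            f"found {version_info[0]}.{version_info[1]}"
        )
    # shutil.which returns None when the executable is not on PATH.
    if which("ffmpeg") is None:
        problems.append("ffmpeg not found on PATH")
    return problems


if __name__ == "__main__":
    for problem in check_requirements():
        print(problem)
```

Running the script prints one line per unmet requirement and nothing when both checks pass. It does not check for a GPU, since that is optional.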