From 7d70dcafb684e969d67fbc89f6b3cfc8a29d7300 Mon Sep 17 00:00:00 2001
From: cjs
Date: Sat, 27 Dec 2025 03:59:25 +0000
Subject: [PATCH] Add ROCM.md with Debian Sid setup instructions

---
 ROCM.md | 126 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 126 insertions(+)
 create mode 100644 ROCM.md

diff --git a/ROCM.md b/ROCM.md
new file mode 100644
index 0000000..237f7fb
--- /dev/null
+++ b/ROCM.md
@@ -0,0 +1,126 @@
+# WhisperX ROCm Fork
+
+This is a fork of [m-bain/whisperX](https://github.com/m-bain/whisperX) configured for AMD ROCm GPUs.
+
+## Tested Configuration
+
+- **OS**: Debian Sid (trixie/forky)
+- **GPU**: AMD Radeon RX 7700 XT (gfx1101, 12GB VRAM)
+- **ROCm**: 7.1.1
+- **Python**: 3.10
+- **PyTorch**: 2.11.0+rocm7.0 (nightly)
+- **CTranslate2**: 4.6.2 (ROCm build from [paralin/ctranslate2-rocm](https://github.com/paralin/ctranslate2-rocm))
+
+## Prerequisites
+
+1. ROCm 7.1+ installed at `/opt/rocm`
+2. CTranslate2 built with ROCm support (see [paralin/ctranslate2-rocm](https://github.com/paralin/ctranslate2-rocm))
+
+## Installation (Debian Sid)
+
+### 1. Clone and set up
+
+```bash
+git clone https://github.com/paralin/whisperX-rocm.git ~/whisperx
+cd ~/whisperx
+git checkout rocm
+```
+
+### 2. Create venv with uv
+
+```bash
+uv venv
+uv pip install -e .
+```
+
+### 3. Install ROCm PyTorch
+
+```bash
+uv pip install --pre torch torchvision torchaudio --index-url https://download.pytorch.org/whl/nightly/rocm7.0
+```
+
+### 4. Install ROCm CTranslate2
+
+First build CTranslate2 with ROCm support (see [paralin/ctranslate2-rocm](https://github.com/paralin/ctranslate2-rocm)), then:
+
+```bash
+# Remove the PyPI ctranslate2 package (it ships CUDA binaries)
+rm -rf .venv/lib/python3.10/site-packages/ctranslate2*
+
+# Install the ROCm build
+export CTRANSLATE2_ROOT=/usr/local
+export LD_LIBRARY_PATH=/usr/local/lib:$LD_LIBRARY_PATH
+uv pip install --reinstall pybind11 ~/ctranslate2/python
+```
+
+## Environment Variables
+
+These must be set before running whisperx:
+
+```bash
+export HSA_OVERRIDE_GFX_VERSION=11.0.1  # for gfx1101
+export AMDGPU_TARGETS=gfx1101
+export ROCM_PATH=/opt/rocm
+export HIP_VISIBLE_DEVICES=0
+export ROCR_VISIBLE_DEVICES=0
+export LD_LIBRARY_PATH=/usr/local/lib:/opt/rocm/lib:/opt/rocm/lib/llvm/lib:$LD_LIBRARY_PATH
+```
+
+## Usage
+
+```bash
+# Set environment (add to ~/.bashrc for convenience)
+export HSA_OVERRIDE_GFX_VERSION=11.0.1
+export ROCM_PATH=/opt/rocm
+export HIP_VISIBLE_DEVICES=0
+export LD_LIBRARY_PATH=/usr/local/lib:/opt/rocm/lib:/opt/rocm/lib/llvm/lib:$LD_LIBRARY_PATH
+
+cd ~/whisperx
+uv run whisperx audio.wav \
+  --language en \
+  --model large-v3 \
+  --compute_type float16 \
+  --device cuda \
+  --batch_size 8 \
+  --vad_method silero \
+  --output_dir ./output \
+  --output_format all
+```
+
+Note: `--device cuda` is correct on AMD hardware. ROCm builds expose the GPU through the CUDA device API, and HIP translates those calls for the AMD GPU.
+
+## Verify Installation
+
+```bash
+# Check that PyTorch sees the GPU (run inside the venv)
+python -c "import torch; print('CUDA:', torch.cuda.is_available()); print('Device:', torch.cuda.get_device_name(0))"
+
+# Check CTranslate2
+python -c "import ctranslate2; print(ctranslate2.__version__); print(ctranslate2.get_supported_compute_types('cuda'))"
+```
+
+Expected output:
+```
+CUDA: True
+Device: AMD Radeon RX 7700 XT
+
+4.6.2
+{'int8_float16', 'int8_bfloat16', 'bfloat16', 'int8_float32', 'int8', 'float16', 'float32'}
+```
+
+## GPU Architecture
+
+Set `HSA_OVERRIDE_GFX_VERSION` based on your GPU:
+
+| GPU | Architecture | HSA_OVERRIDE_GFX_VERSION |
+|-----|--------------|--------------------------|
+| RX 7900 XTX/XT | gfx1100 | 11.0.0 |
+| RX 7800 XT | gfx1101 | 11.0.1 |
+| RX 7700 XT | gfx1101 | 11.0.1 |
+| RX 7600 | gfx1102 | 11.0.2 |
+| RX 6900/6800/6700 | gfx1030 (gfx1031 for RX 6700) | 10.3.0 |
+| RX 6600 | gfx1032 | 10.3.2 |
+
+## Upstream
+
+- Original: [m-bain/whisperX](https://github.com/m-bain/whisperX)
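
The version strings in the GPU Architecture table mirror the digits of the gfx target name (gfx1101 becomes 11.0.1). The sketch below derives that string from a target name. It is not part of the patch: the helper name `hsa_override_for` is hypothetical, and it assumes the target name is `gfx` followed by a major number, one minor character, and one stepping character, which is the pattern the table's rows follow. It only reformats the name and does not decide whether overriding to a different, officially supported target would work better for a given card.

```python
import re


def hsa_override_for(gfx_target: str) -> str:
    """Turn a gfx target name such as 'gfx1101' into a dotted version string.

    Assumption (not from the patch): the name is 'gfx' + major (1-2 digits)
    + one minor character + one stepping character.
    """
    match = re.fullmatch(
        r"gfx(\d{1,2})([0-9a-f])([0-9a-f])", gfx_target.strip().lower()
    )
    if match is None:
        raise ValueError(f"unrecognized gfx target: {gfx_target!r}")
    major, minor, stepping = match.groups()
    # Minor and stepping are single characters; parse them as hex digits.
    return f"{int(major)}.{int(minor, 16)}.{int(stepping, 16)}"


if __name__ == "__main__":
    # Print export lines for the architectures listed in the table.
    for target in ("gfx1100", "gfx1101", "gfx1102", "gfx1030", "gfx1032"):
        print(f"export HSA_OVERRIDE_GFX_VERSION={hsa_override_for(target)}  # {target}")
```

Running the script prints one `export` line per architecture, which can be compared against the table before copying a value into `~/.bashrc`.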