99 Commits

Author SHA1 Message Date
c1fcb3f57c Add OpenAI-compatible API and Docker deployment
- Add FastAPI-based API in whisperx/api/
- Implement transcription endpoint compatible with OpenAI
- Added Dockerfile and docker-compose.yml for easy deployment
- Updated README with Docker instructions
- Added new script whisperx-serve for running the API
2026-05-13 01:37:47 +03:00
Tomáš Hnyk
027ec57aee doc: update cpu only example (#1164) 2025-10-09 09:34:54 +02:00
Adrian Wan
2663f2edb5 doc: fix diarize import in example script (#1192) 2025-10-09 09:27:07 +02:00
Jim Chen
95fecb91c8 build: upgrade PyTorch to 2.7.1 with CUDA 12.8 and multi-platform support
- feat: upgrade PyTorch to 2.7.1 and CUDA 12.8
    * Update README setup to require CUDA toolkit 12.8 instead of 12.4 (Linux and Windows)
    * Bump torch dependency from 2.6.0 to 2.7.1
    * Switch the PyTorch CUDA wheel index from cu124 to cu128

- Revert "docs: add troubleshooting section for libcudnn dependencies in README"
    * The issue of relying on two different versions of CUDNN in this project has been resolved.

- build(pyproject): relax python version and constrain package deps
    * Only download torch from PyTorch; obtain all other packages from PyPI.
    * Restrict numpy, onnxruntime, pandas to be compatible with Python 3.9

- build(pyproject): require triton 3.3.0+ for arm64 support
    * Add triton version 3.3.0 or newer to the dependencies to support arm64 architecture.

- build: skip Triton on Windows since it isn't supported
    * Add a platform marker to the triton dependency to skip it on Windows, as triton does not support Windows.

- build: configure PyTorch sources for cross-platform compatibility
    * macOS uses CPU-only PyTorch from pytorch-cpu index
    * Linux and Windows use CUDA 12.8 PyTorch from pytorch index
    * triton only installs on Linux with CUDA 12.8 support
    * Update lockfile to support multi-platform builds

- fix: restrict av to <16.0.0 for Python 3.9 compatibility
    * Add av<16.0.0 to dependencies to maintain Python 3.9 support
    * Update comment to include av in the restriction list
    * Update uv.lock accordingly

    PyAV dropped Python 3.9 support in v16.0.0:
    106089447c

- fix: resolve PyTorch ARM64 platform compatibility issue
    * Update uv.lock to properly handle aarch64 platforms for PyTorch dependencies
    * Add resolution markers for ARM64 Linux systems to use CPU-only PyTorch builds
    * Ensure CUDA builds are only used on x86_64 platforms where supported

    Resolves ARM64 Docker build failures by preventing uv from attempting to install CUDA PyTorch on unsupported platforms

- chore: change .python-version to 3.10

---

Signed-off-by: CHEN, CHUN <jim60105@gmail.com>
Signed-off-by: Jim Chen <Jim@ChenJ.im>
Co-authored-by: GitHub Copilot <bot@ChenJ.im>
2025-10-08 11:21:28 +02:00
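
The platform rules listed in commit 95fecb91c8 can be read as a small decision table. The sketch below is one interpretation of those bullets and is not taken from the repository: the function name `select_torch_source` and the returned keys are hypothetical, and the real logic lives in the pyproject.toml markers and uv.lock, which may differ in detail. The index names `pytorch` and `pytorch-cpu` are the ones named in the commit message.

```python
import platform


def select_torch_source(system: str, machine: str) -> dict:
    """Hypothetical helper: map a platform to the install choices
    described in the commit message (an interpretation, not project code)."""
    system = system.lower()
    machine = machine.lower()

    is_linux = system == "linux"
    is_windows = system == "windows"
    # platform.machine() reports "x86_64" on Linux and "AMD64" on Windows.
    is_x86_64 = machine in {"x86_64", "amd64"}

    # CUDA 12.8 builds only on x86_64 Linux/Windows; macOS and
    # ARM64 Linux fall back to the CPU-only index.
    use_cuda = (is_linux or is_windows) and is_x86_64

    return {
        "torch_index": "pytorch" if use_cuda else "pytorch-cpu",
        # triton is skipped on Windows and installed only alongside CUDA on Linux.
        "install_triton": is_linux and use_cuda,
    }


if __name__ == "__main__":
    # Show the choice for the machine running this script.
    print(select_torch_source(platform.system(), platform.machine()))
```

Under this reading, an aarch64 Linux container gets the CPU-only index and no triton, which matches the stated goal of keeping uv from attempting a CUDA install on unsupported platforms.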
Max Bain
ed13dc8c6c recall.ai sponsor
Added information about Recall.ai's Meeting Transcription API.
2025-10-03 00:12:53 +01:00
Kirill
d700b56c9c docs: add missing torch import to Python usage example in README 2025-06-08 03:34:49 -06:00
Barabazs
6fe0a8784a docs: add troubleshooting section for libcudnn dependencies in README 2025-05-31 05:20:06 -06:00
Barabazs
36d552cad3 fix: remove DiarizationPipeline from public API 2025-05-03 09:25:59 +02:00
Barabazs
a7564c2ad6 docs: update installation instructions 2025-03-25 17:02:41 +01:00
Barabazs
f8d11df727 docs: Update README example commands with generic audio path 2025-02-19 08:24:04 +01:00
Barabazs
70c639cdb5 doc: refer to DEFAULT_ALIGN_MODELS_HF for other langs 2025-01-17 08:47:44 +01:00
Markus Jochim
235536e28d Update links to language models in README 2025-01-17 08:47:44 +01:00
3manifold
79eb8fa53d Accept alternative VAD methods. Extend to use Silero VAD. 2025-01-06 13:41:46 +01:00
Barabazs
b08ad67a72 docs: update installation instructions in README 2025-01-02 08:35:45 +01:00
Barabazs
c18f9f979b fix: update README image source and enhance setup.py for long description 2025-01-02 08:30:04 +01:00
Jim O’Regan
9809336db6 Fix link in README.md 2024-12-16 08:04:59 +01:00
Max Bain
f2da2f858e Update README.md 2024-03-20 15:47:18 +00:00
Victor Thuillier
d8c9196346 Add Replicate large-v3 demo 2024-02-18 12:17:11 +01:00
Tobi John
9f23739f90 Update README.md
Demonstrates use of argument to save model to local path.
2023-12-15 13:46:32 +00:00
Sean Gillen
2b7ab95ad6 Update README to Correct Speaker Diarization Version Link
Currently errors if user just accepts terms for README link version
3.0. Version 3.1 introduced in pull request #586
2023-12-07 12:48:21 -08:00
M0HID
8a8eeb33ee Update README.md 2023-11-27 17:15:28 +00:00
kaihe-stori
acf31b754f update readme 2023-10-11 22:56:38 -04:00
Max Bain
c1b821a08d fix list markdown 2023-10-05 15:14:29 -07:00
Max Bain
78e20a16a8 update links 2023-10-05 15:14:03 -07:00
Max Bain
be07c13f75 read does actually work... 2023-10-05 14:48:39 -07:00
Valent Turkovic
84423ca517 Update README.md
Added info that Hugging Face token has to be write token because read token doesn't work.
2023-10-05 19:14:28 +02:00
Max Bain
3c7b03935b Merge pull request #430 from dotgrid/dotgrid-docs-patch
Document --compute_type command line option
2023-08-29 10:02:51 -06:00
Florian Kowarsch
81b12af321 adds link to whisperX medium on replicate and updates replicate badges in README.md 2023-08-21 08:16:46 +08:00
Paul F
c1197c490e Document --compute_type command line option 2023-08-19 08:19:49 +01:00
Dudu Asulin
7eb9692cb9 more 2023-08-02 10:32:02 +03:00
Dudu Asulin
8de0e2af51 make diarization faster 2023-08-02 10:11:43 +03:00
Max Bain
aa37509362 Merge branch 'main' into cuda-11.8 2023-07-25 00:28:53 +01:00
Max Bain
15b4c558c2 Merge pull request #352 from daanelson/replicate-demo
adding link to Replicate demo
2023-07-24 10:48:24 +01:00
Eric Baer
8673064658 Remove torchvision from README 2023-07-20 17:02:34 -07:00
dan nelson
512ab1acf9 adding Replicate demo 2023-06-30 18:22:10 -07:00
Max Bain
93ed6cfa93 interspeech 2023-06-01 16:54:16 +01:00
Tijs Zwinkels
63fb5fc46f Suggest using pytorch-cuda 11.8 instead of 11.7
This prevents CuFFT errors on newer cards such as the RTX 4090 and RTX 6000 Ada.

fixes #254
2023-05-16 12:07:09 +02:00
Max Bain
d8a2b4ffc9 Merge pull request #246 from m-bain/v3
V3
2023-05-13 12:18:09 +01:00
Max Bain
9ffb7e7a23 Merge branch 'v3' of https://github.com/m-bain/whisperX into v3
Conflicts:
	setup.py
2023-05-13 12:16:33 +01:00
Max Bain
fd8f1003cf add translate, fix word_timestamp error 2023-05-13 12:14:06 +01:00
Max Bain
5421f1d7ca remove v3 tag on pip install 2023-05-09 13:42:50 +01:00
Max Bain
2efa136114 update python usage example 2023-05-08 17:20:38 +01:00
Max Bain
0b839f3f01 Update README.md 2023-05-07 20:36:08 +01:00
Max Bain
7ad554c64f Merge branch 'main' into v3 2023-05-07 20:30:57 +01:00
Max Bain
4603f010a5 update readme, setup, add option to return char_timestamps 2023-05-07 20:28:33 +01:00
Max Bain
4e2ac4e4e9 torch2.0, remove compile for now, round to times to 3 decimal 2023-05-04 20:38:13 +01:00
Max Bain
b666523004 add v3 pre-release comment, and v4 progress update 2023-05-02 15:10:40 +01:00
sohaibanwaar
a693a779fa feat: adding the docker file 2023-05-02 13:28:20 +05:00
m-bain
cc7e168d2b add checkout command 2023-04-25 12:14:23 +01:00
m-bain
db97f29678 update pip install 2023-04-25 11:19:23 +01:00