Conversation

fujimotos
Contributor

LibriSpeech is a widely-used benchmark dataset for training and
testing speech recognition models.

This adds a set of scripts to measure the recognition accuracy of
whisper.cpp models, following the common benchmark standards.

Signed-off-by: Fujimoto Seiji <[email protected]>
@@ -0,0 +1,25 @@
Code in this directory is adapted from OpenAI Whisper project
(https://github.com/openai/whisper) and carries the following
copyright and license.
Contributor Author

@fujimotos fujimotos Apr 3, 2025

As I mentioned in LICENSE, the normalizer implementation in the
tests/normalizer/ subfolder was ported from the upstream project.

  • We need this to get a comparable WER score. See this notebook
    for how OpenAI evaluates their speech recognition models.

  • The reason I committed these files to this repository is to minimize the
    dependencies we need to run the benchmark script.

`pip install openai-whisper` requires the full PyTorch library, so it's heavy.
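
For reference, here is a minimal sketch of how the normalizer feeds into the WER computation. It assumes the ported module exposes an `EnglishTextNormalizer` class like upstream openai/whisper does (the import path here is hypothetical), and uses the `jiwer` package only to illustrate the edit-distance step; the actual benchmark script may differ.

```python
# Minimal sketch of WER scoring with text normalization.
# The import path and class name are assumptions based on the upstream
# openai/whisper normalizers; jiwer is used here only to illustrate
# the edit-distance step.
import jiwer
from normalizers import EnglishTextNormalizer  # hypothetical import path

normalizer = EnglishTextNormalizer()

# LibriSpeech references are uppercase and unpunctuated, while Whisper
# emits cased, punctuated text; normalizing both sides keeps formatting
# differences from counting as recognition errors.
reference  = "MISTER QUILTER IS THE APOSTLE OF THE MIDDLE CLASSES"
hypothesis = "Mr. Quilter is the apostle of the middle classes."

# WER = (substitutions + deletions + insertions) / reference word count
wer = jiwer.wer(normalizer(reference), normalizer(hypothesis))
print(f"WER: {wer:.2%}")
```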

```
WHISPER_FLAGS = --no-prints --threads 8 --language en --output-txt
```

Check out `eval.mk` for more details.
Contributor Author

This README file describes how to perform the benchmark tests.

Confirmed to work on Ubuntu 24.04 and Amazon Linux 2023.

Member

@danbev danbev left a comment

I'll try this out locally and take a closer look as well soon.

Feedback from Daniel Bevenius.

This adds a short code example showing how to prepare the `whisper-cli`
command, to make the initial setup step a little clearer.

Signed-off-by: Fujimoto Seiji <[email protected]>
Based on feedback from Georgi Gerganov.

Instead of setting up a virtual environment in the Makefile, let users
set up the Python environment themselves. This is better since users may
have their own preferred workflow/toolkit.

Signed-off-by: Fujimoto Seiji <[email protected]>
Member

@ggerganov ggerganov left a comment

This is really great!

There are 2 things that we can improve on:

  • The dataset seems to contain only relatively short speech segments. I think it would be good to have a dataset with a bit longer samples (i.e. a few minutes) in order to exercise the rolling window transcription that Whisper does

  • The current implementation loads and unloads the entire model for each sample. This is very inefficient. Instead, it should utilize the whisper-server to start it once and send all the samples via HTTP request. This will make the benchmark much faster.

For now we can merge and improve on these later.
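
As a rough illustration of the second point, below is a hedged sketch of what a server-based benchmark loop could look like. The `/inference` endpoint, the `file` form field, the `response_format` parameter, and the default port are assumptions based on the whisper.cpp server example and would need to be verified against the actual server.

```python
# Hedged sketch: send each sample to a long-running whisper-server so the
# model is loaded only once. Endpoint, form field names, and port are
# assumptions based on the server example's defaults, not verified here.
import glob
import requests

SERVER_URL = "http://127.0.0.1:8080/inference"

def transcribe(wav_path: str) -> str:
    # Assumes the sample has already been converted to 16 kHz mono WAV,
    # which is what whisper.cpp expects.
    with open(wav_path, "rb") as f:
        resp = requests.post(
            SERVER_URL,
            files={"file": f},
            data={"response_format": "text"},
        )
    resp.raise_for_status()
    return resp.text.strip()

if __name__ == "__main__":
    for wav in sorted(glob.glob("LibriSpeech/test-clean/**/*.wav", recursive=True)):
        print(wav, transcribe(wav))
```

Keeping the client this thin also leaves it independent of how the server was built or which backend it runs on.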

@ggerganov ggerganov requested a review from danbev April 4, 2025 15:24
@ggerganov ggerganov merged commit 448f3d3 into ggml-org:master Apr 4, 2025
71 of 98 checks passed
@fujimotos fujimotos deleted the sf/librispeech branch April 5, 2025 00:39
@fujimotos
Contributor Author

@ggerganov @danbev Thank you! I'm glad that it helps this project.

fujimotos added a commit to fujimotos/whisper.cpp that referenced this pull request Apr 20, 2025
tests : add script to benchmark whisper.cpp on LibriSpeech corpus (ggml-org#2999)

* tests : add script to benchmark whisper.cpp on LibriSpeech corpus

LibriSpeech is a widely-used benchmark dataset for training and
testing speech recognition models.

This adds a set of scripts to measure the recognition accuracy of
whisper.cpp models, following the common benchmark standards.

Signed-off-by: Fujimoto Seiji <[email protected]>

* Document how to prepare `whisper-cli` and model files

Feedback from Daniel Bevenius.

This adds a short code example showing how to prepare the `whisper-cli`
command, to make the initial setup step a little clearer.

Signed-off-by: Fujimoto Seiji <[email protected]>

* tests : Simplify how to set up Python environment

Based on feedback from Georgi Gerganov.

Instead of setting up a virtual environment in the Makefile, let users
set up the Python environment themselves. This is better since users may
have their own preferred workflow/toolkit.

Signed-off-by: Fujimoto Seiji <[email protected]>

---------

Signed-off-by: Fujimoto Seiji <[email protected]>
bygreencn added a commit to bygreencn/whisper.cpp that referenced this pull request Jun 29, 2025
* ggerganov/master: (25 commits)
  examples : add HEAPU8 to exported runtime methods (ggml-org#3062)
  ruby : make Ruby bindings installed with build options (ggml-org#3056)
  whisper : add no_context parameter to whisper_params (ggml-org#3045)
  examples : add FFmpeg v7.0 support to ffmpeg-transcode.cpp (ggml-org#3038)
  ruby: use CMake in build process (ggml-org#3043)
  docs : update README.md to note newer nvidia gpus (ggml-org#3031)
  addon.node : support max_context api for addon.node (ggml-org#3025)
  whisper : reduce delta_min from 1000ms to 100ms (ggml-org#3028)
  docs : document how to use 'WHISPER_FFMPEG' build option (ggml-org#3029)
  docs : fix README.md (ggml-org#3024)
  xcf : use check for visionos build version (ggml-org#3021)
  ruby : fix types of arguments for rb_get_kwargs in ruby_whisper_params.c (ggml-org#3022)
  ruby : Update uri.rb (ggml-org#3016)
  models : fix dead link to models in readme (ggml-org#3006)
  ruby : change homepage URI in Ruby gemspec (ggml-org#3007)
  tests : add script to benchmark whisper.cpp on LibriSpeech corpus (ggml-org#2999)
  whisper : fix "bench-all outputs an invalid result on larger models" (ggml-org#3002)
  rename : ggerganov -> ggml-org (ggml-org#3005)
  examples : update server.py to match github pages app [no ci] (ggml-org#3004)
  whisper.wasm : fix unknown language issue (ggml-org#3000)
  ...