Description
I can't get the stdin pipe to work correctly on Windows. Running
type timit1.wav | main -
reads the first 3898 bytes, but then ends unexpectedly:
...
read_wav: read 3898 bytes from stdin
system_info: n_threads = 4 / 8 | AVX = 1 | AVX2 = 1 | AVX512 = 0 | FMA = 1 | NEON = 0 | ARM_FMA = 0 | METAL = 0 | F16C = 1 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 1 | SSE3 = 1 | SSSE3 = 0 | VSX = 0 | CUDA = 1 | COREML = 0 | OPENVINO = 0 |
main: processing '-' (1949 samples, 0.1 sec), 4 threads, 1 processors, 5 beams + best of 5, lang = en, task = transcribe, timestamps = 1 ...
...
The process tried to write to a nonexistent pipe.
There are also errors with ffmpeg (the same command line and WAV file work on Linux):
ffmpeg -i timit1.wav -f wav -ar 16000 - | main -
error: failed to open WAV file from stdin
error: failed to read WAV file '-'
...
av_interleaved_write_frame(): Broken pipe
[out#0/wav @ 0000026847b05f40] Error muxing a packet
[out#0/wav @ 0000026847b05f40] Error writing trailer: Broken pipe
size= 4kB time=00:00:01.28 bitrate= 26.1kbits/s speed=3.41x
video:0kB audio:8kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: unknown
Conversion failed!
I've also experienced the same problem when piping to stdin from a Node.js process. I can see that the pipe is closed too early, after only part of the data has been written.
It does succeed when given a 1-second silent WAV file:
type empty.wav | main -
read_wav: read 35890 bytes from stdin
system_info: n_threads = 4 / 8 | AVX = 1 | AVX2 = 1 | AVX512 = 0 | FMA = 1 | NEON = 0 | ARM_FMA = 0 | METAL = 0 | F16C = 1 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 1 | SSE3 = 1 | SSSE3 = 0 | VSX = 0 | CUDA = 1 | COREML = 0 | OPENVINO = 0 |
main: processing '-' (17945 samples, 1.1 sec), 4 threads, 1 processors, 5 beams + best of 5, lang = en, task = transcribe, timestamps = 1 ...
[00:00:00.000 --> 00:00:10.000] [BLANK_AUDIO]
Possibly related to having all zero samples (not sure).
Similar behavior seen when run from Node.js.
Possible cause
I've located this code that may be related:
In common.cpp, in read_wav():
if (fname == "-") {
    {
        uint8_t buf[1024];
        while (true)
        {
            const size_t n = fread(buf, 1, sizeof(buf), stdin);
            if (n == 0) {
                break;
            }
            wav_data.insert(wav_data.end(), buf, buf + n);
        }
    }
    if (drwav_init_memory(&wav, wav_data.data(), wav_data.size(), nullptr) == false) {
        fprintf(stderr, "error: failed to open WAV file from stdin\n");
        return false;
    }
    fprintf(stderr, "%s: read %zu bytes from stdin\n", __func__, wav_data.size());
}
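One thing this loop cannot tell apart is end of input and a read error: fread returns 0 in both cases. A minimal sketch of a reader that reports the difference is below. The helper name read_stream is hypothetical and not part of whisper.cpp.

```cpp
#include <cstdint>
#include <cstdio>
#include <vector>

// Hypothetical helper (not part of whisper.cpp): appends everything readable
// from `in` to `out`. Returns true if the stream ended normally, false if the
// C runtime flagged a read error on it.
bool read_stream(FILE * in, std::vector<uint8_t> & out) {
    uint8_t buf[1024];
    while (true) {
        const size_t n = fread(buf, 1, sizeof(buf), in);
        if (n == 0) {
            break; // either end of input or an error; ferror() decides below
        }
        out.insert(out.end(), buf, buf + n);
    }
    return ferror(in) == 0;
}
```

Calling read_stream(stdin, wav_data) and logging when it returns false would show whether the short read is a genuine error or something the runtime treats as a normal end of file.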
I discussed this with the Microsoft Copilot chatbot. It suggested that on Windows the pipe may be opened in text mode by default, and recommended switching stdin to binary mode.
Copilot's suggestion
Given the behavior you’re describing, it does seem like the issue could be related to the way stdin is being handled on Windows. The fact that whisper.cpp reads a certain amount of data and then stops without an error suggests that it might be interpreting a part of the binary data as an EOF marker due to Windows’ default text mode for stdin.
Switching stdin to binary mode is a common solution for this kind of problem on Windows because it prevents the system from misinterpreting binary data as control characters (like the Ctrl+Z EOF marker in text mode). Since you’ve confirmed that _setmode, O_BINARY, and fileno are not present in the codebase, adding the binary mode setting could potentially resolve the issue.
Here’s how you can modify the code to set stdin to binary mode:
#ifdef _WIN32
#include <fcntl.h>
#include <io.h>
#endif
// ...
if (fname == "-") {
#ifdef _WIN32
    _setmode(_fileno(stdin), _O_BINARY);
#endif
    // ... rest of the code ...
}
This may not be the actual cause of the issue, but it's still a possibility worth checking.
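One way to check this before changing whisper.cpp: in Windows text mode the C runtime treats the byte 0x1A (Ctrl+Z) as end of file on input. If timit1.wav has that byte at offset 3898, the truncated read would be explained. The sketch below uses two hypothetical helpers, neither of which exists in the codebase. first_ctrl_z_offset reports where the first 0x1A byte sits in a buffer, and set_stdin_binary wraps the _setmode call suggested above.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <vector>

#ifdef _WIN32
#include <fcntl.h>
#include <io.h>
#endif

// Hypothetical helper: offset of the first 0x1A (Ctrl+Z) byte, or -1 if the
// buffer contains none.
std::ptrdiff_t first_ctrl_z_offset(const std::vector<uint8_t> & data) {
    const auto it = std::find(data.begin(), data.end(), uint8_t{0x1A});
    return it == data.end() ? -1 : it - data.begin();
}

// Hypothetical helper: switches stdin to binary mode on Windows and does
// nothing on other platforms. Returns false if the mode change failed.
bool set_stdin_binary() {
#ifdef _WIN32
    return _setmode(_fileno(stdin), _O_BINARY) != -1;
#else
    return true;
#endif
}
```

If first_ctrl_z_offset returns 3898 for the bytes of timit1.wav (loaded from disk in binary mode), that would be consistent with the "read 3898 bytes from stdin" log line. In that case, calling set_stdin_binary() before the read loop should let the whole file through.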