Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
7 changes: 7 additions & 0 deletions ggml-metal.h
Original file line number Diff line number Diff line change
Expand Up @@ -52,6 +52,11 @@ void ggml_metal_free(struct ggml_metal_context * ctx);
void * ggml_metal_host_malloc(size_t n);
void ggml_metal_host_free (void * data);

// helper to check if the device supports a specific family
// ideally, the user code should be doing these checks
// ref: https://developer.apple.com/metal/Metal-Feature-Set-Tables.pdf
bool ggml_metal_supports_family(struct ggml_metal_context * ctx, int family);

// set the number of command buffers to use
void ggml_metal_set_n_cb(struct ggml_metal_context * ctx, int n_cb);

Expand Down Expand Up @@ -100,6 +105,8 @@ GGML_API bool ggml_backend_is_metal(ggml_backend_t backend);

GGML_API void ggml_backend_metal_set_n_cb(ggml_backend_t backend, int n_cb);

GGML_API bool ggml_backend_metal_supports_family(ggml_backend_t backend, int family);

#ifdef __cplusplus
}
#endif
Expand Down
10 changes: 10 additions & 0 deletions ggml-metal.m
Original file line number Diff line number Diff line change
Expand Up @@ -459,6 +459,10 @@ void ggml_metal_host_free(void * data) {
free(data);
}

bool ggml_metal_supports_family(struct ggml_metal_context * ctx, int family) {
return [ctx->device supportsFamily:(MTLGPUFamilyApple1 + family - 1)];
}

void ggml_metal_set_n_cb(struct ggml_metal_context * ctx, int n_cb) {
ctx->n_cb = MIN(n_cb, GGML_METAL_MAX_BUFFERS);
}
Expand Down Expand Up @@ -1751,3 +1755,9 @@ void ggml_backend_metal_set_n_cb(ggml_backend_t backend, int n_cb) {

ggml_metal_set_n_cb(ctx, n_cb);
}

bool ggml_backend_metal_supports_family(ggml_backend_t backend, int family) {
struct ggml_metal_context * ctx = (struct ggml_metal_context *)backend->context;

return ggml_metal_supports_family(ctx, family);
}
5 changes: 5 additions & 0 deletions whisper.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -1078,6 +1078,11 @@ static ggml_backend_t whisper_backend_init(const whisper_context_params & params
if (!backend_gpu) {
WHISPER_LOG_ERROR("%s: ggml_backend_metal_init() failed\n", __func__);
}
if (!ggml_backend_metal_supports_family(backend_gpu, 7)) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this is correct. It should disable it on family 7 and older.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you sure? M2 Ultra is Apple8 and it works fine

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Older would be Apple 6, 5, etc.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah sorry, yes.

I thought somebody said that Apple7 works?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, should be disabled on family 6 and older. I got confused since it's disabling 7 here.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yup, let me know if you still think this needs to be changed.

The way it is written now it should do the following:

  • Apple 1-6 -> fallback to CPU
  • Apple 7, 8, 9, .. -> use GPU

The assumption is that if Apple7 is supported, then due to backwards compatibility all previous families (1-6) are also supported, so no need to check for those individually.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the order seems to be wrong, we should check the compatibility first, then call ggml_backend_metal_init. otherwise it still crash on lower device.

Copy link
Contributor

@Josscii Josscii Nov 25, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should do something like this:

#ifdef GGML_USE_METAL
    if (!ggml_backend_metal_supports_family(backend_gpu, 7)) {
        WHISPER_LOG_ERROR("%s: Metal GPU does not support family 7 - falling back to CPU\n", __func__);
    } else if (params.use_gpu) {
        WHISPER_LOG_INFO("%s: using Metal backend\n", __func__);
        ggml_metal_log_set_callback(whisper_log_callback_default, nullptr);
        backend_gpu = ggml_backend_metal_init();
        if (!backend_gpu) {
            WHISPER_LOG_ERROR("%s: ggml_backend_metal_init() failed\n", __func__);
        }
    }
#endif

but right now, it's not possible, because we can't get the ggml_backend_t if it is not inited, I think there need some refactor

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Was this done?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems done, right now it do check in ggml_backend_metal_init to avoid crash

WHISPER_LOG_ERROR("%s: Metal GPU does not support family 7 - falling back to CPU\n", __func__);
ggml_backend_free(backend_gpu);
backend_gpu = NULL;
}
}
#endif

Expand Down