[Feature]: Fallback strategy for KV loading of KVConnectors.

### 🚀 The feature, motivation and pitch

When we use a KV cache connector such as LMCache, the required cache is loaded from disk, remote storage, or other servers. With LMCache, for example, the scheduler first looks up the cache pool and then starts loading. However, the loading process has no try/catch mechanism. Loading can fail unexpectedly (power outage, bugs), and there is no fallback plan, so a failure currently results in a silent error: if a disk load fails, vLLM simply continues its computation with random data.
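To make the request concrete, here is a minimal sketch of one possible fallback: catch the load failure and recompute the tokens whose KV could not be loaded. This is not vLLM or LMCache code. `KVLoadError`, `LoadResult`, `load_with_fallback`, and the `load_block` callback are all hypothetical names made up for illustration, and the block-by-block loading model is an assumption.

```python
from dataclasses import dataclass
from typing import Callable, List


class KVLoadError(Exception):
    """Hypothetical error a connector could raise when a block fails to load."""


@dataclass
class LoadResult:
    num_loaded_tokens: int     # tokens whose KV was loaded successfully
    num_recompute_tokens: int  # tokens the model has to recompute instead


def load_with_fallback(
    block_ids: List[int],
    block_size: int,
    load_block: Callable[[int], None],  # hypothetical per-block loader
) -> LoadResult:
    """Load prefix blocks in order and stop at the first failure.

    Cached KV for a prefix is only usable if every earlier block is valid,
    so everything from the first failed block onward is marked for recompute.
    """
    loaded_blocks = 0
    for block_id in block_ids:
        try:
            load_block(block_id)
        except (KVLoadError, OSError):
            # Do not continue with garbage data; fall back to recompute.
            break
        loaded_blocks += 1

    total_tokens = len(block_ids) * block_size
    loaded_tokens = loaded_blocks * block_size
    return LoadResult(loaded_tokens, total_tokens - loaded_tokens)
```

With a sketch like this, the scheduler would treat only `num_loaded_tokens` as a cache hit and schedule the remaining tokens as a normal prefill, instead of silently computing on unloaded memory.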

### Alternatives

_No response_

### Additional context

_No response_

### Before submitting a new issue...

- [x] Make sure you already searched for relevant issues, and asked the chatbot living at the bottom right corner of the [documentation page](https://docs.vllm.ai/en/latest/), which can answer lots of frequently asked questions.

[Feature]: Fallback strategy for KV loading of KVConnectors. #20063
