Requests could be bigger than limits #123

**What happened**:

When I create a Playground like this:
```
apiVersion: inference.llmaz.io/v1alpha1
kind: Playground
metadata:
  name: llamacpp-speculator
spec:
  replicas: 1
  multiModelsClaim:
    modelNames:
      - name: llama2-7b-q8-gguf # the target model, should be the first one
        role: main
      - name: llama2-7b-q2-k-gguf  # the draft model
        role: draft
  backendConfig:
    name: llamacpp
    args:
      - -fa # use flash attention
    resources:
      requests:
        cpu: 4
        memory: "8Gi"

```

I got a Service with:
```
spec:
  multiModelsClaim:
    inferenceMode: SpeculativeDecoding
    modelNames:
    - llama2-7b-q8-gguf
    - llama2-7b-q2-k-gguf
  workloadTemplate:
    leaderWorkerTemplate:
      restartPolicy: Default
      size: 1
      workerTemplate:
        metadata: {}
        spec:
          containers:
          - args:
            - -fa
            command:
            - ./llama-server
            image: ghcr.io/ggerganov/llama.cpp:server
            name: model-runner
            ports:
            - containerPort: 8080
              name: http
              protocol: TCP
            resources:
              limits:
                cpu: "2"
                memory: 4Gi
              requests:
                cpu: "4"
                memory: 8Gi
```

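The Playground only sets requests (`cpu: 4`, `memory: 8Gi`), but the generated Service also carries limits (`cpu: "2"`, `memory: 4Gi`) that are smaller than those requests. Requests greater than limits are not allowed: Kubernetes requires each container's resource request to be less than or equal to its limit.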
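As a possible workaround until this is fixed, the limits could be set explicitly so they are at least as large as the requests. This is only a sketch: it assumes that `backendConfig.resources` in the Playground accepts a `limits` block alongside `requests`, which this report does not confirm.

```yaml
  backendConfig:
    name: llamacpp
    args:
      - -fa # use flash attention
    resources:
      requests:
        cpu: 4
        memory: "8Gi"
      # Assumption: `limits` is accepted here. Values are chosen to be >= requests.
      limits:
        cpu: 4
        memory: "8Gi"
```

With limits equal to the requests, the generated container spec would satisfy the request <= limit rule regardless of where the smaller limit values came from.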

**What you expected to happen**:

**How to reproduce it (as minimally and precisely as possible)**:

**Anything else we need to know?**:

**Environment**:

- Kubernetes version (use `kubectl version`):
- LWS version:
- llmaz version (use `git describe --tags --dirty --always`):
- Cloud provider or hardware configuration:
- OS (e.g: `cat /etc/os-release`):
- Kernel (e.g. `uname -a`):
- Install tools:
- Others:
