Commit 23e4529

Remove ElasticConfig from Service
Signed-off-by: kerthcet <[email protected]>
1 parent 590d58d commit 23e4529

4 files changed: 7 additions and 27 deletions

api/core/v1alpha1/model_types.go

Lines changed: 7 additions & 8 deletions
@@ -98,14 +98,13 @@ type FlavorName string
 type Flavor struct {
 	// Name represents the flavor name, which will be used in model claim.
 	Name FlavorName `json:"name"`
-	// Requests defines the required accelerators to serve the model, like nvidia.com/gpu: 8.
-	// When GPU number is greater than 8, like 32, then multi-host inference is enabled and
-	// 32/8=4 hosts will be grouped as an unit, each host will have a resource request as
-	// nvidia.com/gpu: 8. The may change in the future if the GPU number limit is broken.
-	// Not recommended to set the cpu and memory usage here.
-	// If using playground, you can define the cpu/mem usage at backendConfig.
-	// If using service, you can define the cpu/mem at the container resources.
-	// Note: if you define the same accelerator requests at playground/service as well,
+	// Requests defines the required accelerators to serve the model for each replica,
+	// like <nvidia.com/gpu: 8>. For multi-hosts cases, the requests here indicates
+	// the resource requirements for each replica. This may change in the future.
+	// Not recommended to set the cpu and memory usage here:
+	// - if using playground, you can define the cpu/mem usage at backendConfig.
+	// - if using inference service, you can define the cpu/mem at the container resources.
+	// However, if you define the same accelerator requests at playground/service as well,
 	// the requests here will be covered.
 	// +optional
 	Requests v1.ResourceList `json:"requests,omitempty"`
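
The updated comment says the flavor's accelerator requests are "covered" when the same resource is also requested on the playground or service. Reading "covered" as "overridden" (an assumption, the commit does not show the merge logic), the precedence could be sketched as below. `ResourceList` here is a simplified stand-in for `v1.ResourceList`, and `effectiveRequests` is a hypothetical helper, not a function from this repository.

```go
package main

import "fmt"

// ResourceList is a simplified stand-in for v1.ResourceList.
// The real type maps v1.ResourceName to resource.Quantity; plain
// integers are used here only to keep the sketch dependency-free.
type ResourceList map[string]int64

// effectiveRequests is a hypothetical helper illustrating the assumed
// precedence: start from the flavor's per-replica requests, then let any
// same-named entry from the playground/service side replace it.
func effectiveRequests(flavor, override ResourceList) ResourceList {
	out := make(ResourceList, len(flavor)+len(override))
	for name, qty := range flavor {
		out[name] = qty
	}
	for name, qty := range override {
		out[name] = qty // the service-level value wins on conflict
	}
	return out
}

func main() {
	// Flavor declares accelerators only, as the comment recommends.
	flavor := ResourceList{"nvidia.com/gpu": 8}
	// The service container also requests GPUs, plus cpu.
	service := ResourceList{"nvidia.com/gpu": 4, "cpu": 16}

	merged := effectiveRequests(flavor, service)
	fmt.Println(merged["nvidia.com/gpu"], merged["cpu"])
}
```

Under this reading, a flavor that declares only accelerators leaves cpu and memory entirely to the playground's backendConfig or the service's container resources, which matches the recommendation in the comment.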

api/inference/v1alpha1/service_types.go

Lines changed: 0 additions & 5 deletions
@@ -35,11 +35,6 @@ type ServiceSpec struct {
 	// LWS supports both single-host and multi-host scenarios, for single host
 	// cases, only need to care about replicas, rolloutStrategy and workerTemplate.
 	WorkloadTemplate lws.LeaderWorkerSetSpec `json:"workloadTemplate"`
-	// ElasticConfig defines the configuration for elastic usage,
-	// e.g. the max/min replicas. Default to 0 ~ Inf+.
-	// This requires to install the HPA first or will not work.
-	// +optional
-	ElasticConfig *ElasticConfig `json:"elasticConfig,omitempty"`
 }
 
 const (

api/inference/v1alpha1/zz_generated.deepcopy.go

Lines changed: 0 additions & 5 deletions
Generated file; diff not rendered by default.

client-go/applyconfiguration/inference/v1alpha1/servicespec.go

Lines changed: 0 additions & 9 deletions
Generated file; diff not rendered by default.
