Multimodality Scenarios Sample Deployment #1584

happyandslow · 2025-09-19T20:46:31Z

Pull Request Description

This PR tracks deployment files and scripts for multi-modality use cases and demos.

Related Issues

Resolves: #1509

gemini-code-assist · 2025-09-19T20:46:55Z

Summary of Changes

Hello @happyandslow, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly expands the multimodality capabilities by adding comprehensive Kubernetes deployment configurations for various advanced AI models. It aims to streamline the deployment and management of these models, enabling easier experimentation and production use of large language, vision, audio, and diffusion models within the existing infrastructure. The changes also include minor updates to image references for consistency and reliability.

Highlights

New Multimodality Model Deployments: Introduced Kubernetes deployment and service configurations for a variety of multimodality models, including several vLLM-based models (DSE-Qwen2-2B, Llama-3.1-8B-Instruct, LLaVA-1.5-7B, Qwen2-Audio-7B-Instruct, Qwen2.5-VL-7B-Instruct) and xDiT-based diffusion models (HunyuanDiT-v1.2-Diffusers, CogVideoX-2B).
Kubernetes Configuration Updates: Updated image registry paths for core services in kustomization.yaml to use aibrix-container-registry-cn-beijing.cr.volces.com and adjusted image tags for consistency.
Model Serving Frameworks: The new deployments leverage both vLLM for large language and vision models, and xDiT for diffusion models, showcasing diverse AI serving capabilities within the Kubernetes environment.
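For readers unfamiliar with the shape of such a configuration, the following is a minimal sketch of a Deployment plus Service pair for one vLLM-based model. It is not taken from the PR's files: the resource names, labels, image tag, port, and GPU count are all assumptions chosen for illustration, and the actual samples may differ.

```yaml
# Hypothetical sketch only: names, labels, image tag, port and GPU count are assumptions.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: qwen25-vl-7b-instruct          # hypothetical name
  labels:
    app: qwen25-vl-7b-instruct
spec:
  replicas: 1
  selector:
    matchLabels:
      app: qwen25-vl-7b-instruct
  template:
    metadata:
      labels:
        app: qwen25-vl-7b-instruct
    spec:
      containers:
        - name: vllm-openai
          image: vllm/vllm-openai:latest   # assumed image; pin a specific tag in practice
          args:                            # passed to the image's OpenAI-compatible server
            - --model
            - Qwen/Qwen2.5-VL-7B-Instruct
            - --port
            - "8000"
          ports:
            - containerPort: 8000
          resources:
            limits:
              nvidia.com/gpu: 1            # assumes the NVIDIA device plugin is installed
---
apiVersion: v1
kind: Service
metadata:
  name: qwen25-vl-7b-instruct
spec:
  selector:
    app: qwen25-vl-7b-instruct           # must match the pod template labels above
  ports:
    - name: http
      port: 8000
      targetPort: 8000
```

The Service selector and the pod template labels have to agree, which is where the naming inconsistencies flagged later in the review would typically surface.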


¹ Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

gemini-code-assist

Code Review

This pull request introduces a collection of Kubernetes deployment files for various multi-modality models. While the additions are valuable for demonstration purposes, the review has identified several recurring issues across the new YAML files that should be addressed to improve correctness, security, and maintainability. Key concerns include hostPath volumes, which pose security risks; inconsistent naming conventions; inefficient container startup practices such as installing packages at runtime; and general YAML formatting issues. Addressing these points will make the samples more robust and better examples of production-ready configurations.
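To make the "installing packages at runtime" concern concrete, here is a hedged sketch of the pattern and one common alternative. The container names, the package name, and the custom image reference are hypothetical and not taken from the PR's files.

```yaml
# Hypothetical pod-spec fragment; container names, package, and images are placeholders.
containers:
  # Pattern the review warns about: every pod start pays the install cost and
  # depends on the package index being reachable at startup.
  - name: server-runtime-install
    image: vllm/vllm-openai:latest                 # assumed base image
    command: ["/bin/sh", "-c"]
    args:
      - pip install some-extra-package && exec vllm serve Qwen/Qwen2.5-VL-7B-Instruct

  # Alternative: bake the dependency into a custom image at build time,
  # so the container only starts the server.
  - name: server-prebuilt
    image: registry.example.com/vllm-with-extras:tag   # hypothetical prebuilt image
    command: ["vllm", "serve", "Qwen/Qwen2.5-VL-7B-Instruct"]
```

The two containers are shown side by side only for comparison; a real pod would use one or the other.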

samples/multimodality/xDiT/aibrix_vke_video.yaml

samples/multimodality/vllm/llama3-8b-test.yaml

samples/multimodality/xDiT/aibrix_kvcache_dev.yaml

gemini-code-assist · 2025-09-19T20:49:05Z

samples/multimodality/vllm/dse-qwen2-2b.yaml

+        - name: model-hostpath
+          hostPath:
+            path: /root/models
+            type: DirectoryOrCreate


Using hostPath volumes introduces security risks and makes the deployment less portable, as it depends on a specific path (/root/models) existing on the node. It's highly recommended to use a PersistentVolumeClaim (PVC) for storing models. This decouples storage from the node and is a more standard and secure practice in Kubernetes.

Example of using a PVC:
First, create a PersistentVolumeClaim:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: model-storage-pvc
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 50Gi # Adjust size as needed

Then, use it in your deployment:

volumes:
  - name: model-hostpath
    persistentVolumeClaim:
      claimName: model-storage-pvc
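The volume definition only takes effect once a container mounts it. As a sketch of how the two pieces fit together in a pod template (the container name, image, and mountPath below are assumptions, since the reviewed file's container spec is not shown here):

```yaml
# Fragment of a Deployment; container name, image and mountPath are assumptions.
spec:
  template:
    spec:
      containers:
        - name: model-server                      # hypothetical container name
          image: registry.example.com/model-server:tag   # placeholder image
          volumeMounts:
            - name: model-hostpath                # must match the volume name below
              mountPath: /models                  # assumed in-container path
      volumes:
        - name: model-hostpath
          persistentVolumeClaim:
            claimName: model-storage-pvc          # the claim created above
```

Note that a ReadWriteOnce claim can be mounted read-write by a single node only, so replicas scheduled on different nodes would need storage that supports ReadWriteMany or a separate claim per replica.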

samples/multimodality/xDiT/aibrix_vke.yaml

samples/multimodality/vllm/dse-qwen2-2b.yaml

samples/multimodality/xDiT/aibrix_vke_video.yaml

samples/multimodality/vllm/llama3-8b-test.yaml

Signed-off-by: Le Xu <[email protected]>

happyandslow changed the title ~~Multimodality Scenarios~~ Multimodality Scenarios Sample Deployment Sep 19, 2025

happyandslow marked this pull request as draft September 19, 2025 20:47

gemini-code-assist bot reviewed Sep 19, 2025


happyandslow force-pushed the lexu/test-multimodality-scenarios branch from c4eb796 to afbb91e September 19, 2025 20:51

googs1025 reviewed Sep 23, 2025


samples/multimodality/vllm/llama3-8b-test.yaml (outdated)

Le Xu added 10 commits September 25, 2025 13:11

update multi modality deployment files

b0d9c18

Signed-off-by: Le Xu <[email protected]>

update deployment files

8d7052f

Signed-off-by: Le Xu <[email protected]>

record local deployment files

8853482

Signed-off-by: Le Xu <[email protected]>

update deployment script

c2d910b

Signed-off-by: Le Xu <[email protected]>

update new recursive donwload image runtime

6c25852

Signed-off-by: Le Xu <[email protected]>

video deployment file

b0382ea

Signed-off-by: Le Xu <[email protected]>

add video deployment files

17f6361

Signed-off-by: Le Xu <[email protected]>

refactor multi modality files location

771d868

Signed-off-by: Le Xu <[email protected]>

remove redundant data

1de9b4d

Signed-off-by: Le Xu <[email protected]>

clean up

c34d3eb

Signed-off-by: Le Xu <[email protected]>

happyandslow force-pushed the lexu/test-multimodality-scenarios branch from 9cb845b to c34d3eb September 25, 2025 20:11

Le Xu added 12 commits September 25, 2025 15:30

update deployment file

a87f1ff

Signed-off-by: Le Xu <[email protected]>

update deployment file

ddd8e75

Signed-off-by: Le Xu <[email protected]>

update deployment file, xDiT patch

976f405

Signed-off-by: Le Xu <[email protected]>

remove irrelavent files

35cebd7

Signed-off-by: Le Xu <[email protected]>

update file structures

5ed1b0f

Signed-off-by: Le Xu <[email protected]>

remove files not in used

d84dfa1

Signed-off-by: Le Xu <[email protected]>

clean up

8c8c563

Signed-off-by: Le Xu <[email protected]>

clean up

de2918a

Signed-off-by: Le Xu <[email protected]>

update patch

c33c0a7

Signed-off-by: Le Xu <[email protected]>

update README with new patch

ad053f7

Signed-off-by: Le Xu <[email protected]>

update README

665c4db

Signed-off-by: Le Xu <[email protected]>

update video deployment instructions

7453eaf
