happyandslow
Collaborator

Pull Request Description

This PR tracks deployment files and scripts for multi-modality use cases and demos.

Related Issues

Resolves: #1509

@happyandslow happyandslow changed the title from "Multimodality Scenarios" to "Multimodality Scenarios Sample Deployment" Sep 19, 2025

Summary of Changes

Hello @happyandslow, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly expands the multimodality capabilities by adding comprehensive Kubernetes deployment configurations for various advanced AI models. It aims to streamline the deployment and management of these models, enabling easier experimentation and production use of large language, vision, audio, and diffusion models within the existing infrastructure. The changes also include minor updates to image references for consistency and reliability.
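
As a rough illustration of the image-reference change, a Kustomize `images` override can point a component at another registry without editing the base manifests. The registry host is the one named in this summary; the image name `aibrix/controller-manager`, the tag `v0.0.0`, and the `../../base` path are placeholders, not values taken from this PR.

```yaml
# kustomization.yaml (sketch; image name, tag, and base path are hypothetical)
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - ../../base            # hypothetical location of the base manifests
images:
  # Match images by their original name, then rewrite registry and tag.
  - name: aibrix/controller-manager
    newName: aibrix-container-registry-cn-beijing.cr.volces.com/aibrix/controller-manager
    newTag: v0.0.0        # placeholder; pin to the release actually deployed
```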

Highlights

  • New Multimodality Model Deployments: Introduced Kubernetes deployment and service configurations for a variety of multimodality models, including several vLLM-based models (DSE-Qwen2-2B, Llama-3.1-8B-Instruct, LLaVA-1.5-7B, Qwen2-Audio-7B-Instruct, Qwen2.5-VL-7B-Instruct) and xDiT-based diffusion models (HunyuanDiT-v1.2-Diffusers, CogVideoX-2B).
  • Kubernetes Configuration Updates: Updated image registry paths for core services in kustomization.yaml to use aibrix-container-registry-cn-beijing.cr.volces.com and adjusted image tags for consistency.
  • Model Serving Frameworks: The new deployments leverage both vLLM for large language and vision models, and xDiT for diffusion models, showcasing diverse AI serving capabilities within the Kubernetes environment.
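
For orientation, a vLLM-based deployment of the kind listed above might look roughly like the sketch below. It is not copied from the PR: the resource names, the `app` label, the `vllm/vllm-openai` image tag, port 8000, and the single-GPU limit are assumptions, and any AIBrix-specific labels or annotations the project may require are omitted.

```yaml
# Sketch only: a Deployment and Service for one vLLM-served model.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: qwen2-5-vl-7b-instruct          # hypothetical name
  labels:
    app: qwen2-5-vl-7b-instruct
spec:
  replicas: 1
  selector:
    matchLabels:
      app: qwen2-5-vl-7b-instruct
  template:
    metadata:
      labels:
        app: qwen2-5-vl-7b-instruct
    spec:
      containers:
        - name: vllm-openai
          image: vllm/vllm-openai:latest  # assumption; pin a specific tag in practice
          # Arguments are passed to the image's OpenAI-compatible server entrypoint.
          args:
            - --model
            - Qwen/Qwen2.5-VL-7B-Instruct
            - --served-model-name
            - qwen2-5-vl-7b-instruct
            - --port
            - "8000"
          ports:
            - containerPort: 8000
          resources:
            limits:
              nvidia.com/gpu: 1           # assumes the NVIDIA device plugin is installed
---
apiVersion: v1
kind: Service
metadata:
  name: qwen2-5-vl-7b-instruct
spec:
  selector:
    app: qwen2-5-vl-7b-instruct           # must match the pod template labels
  ports:
    - name: http
      port: 8000
      targetPort: 8000
```

The Service selector and the pod template labels have to agree, otherwise the Service has no endpoints.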

@happyandslow happyandslow marked this pull request as draft September 19, 2025 20:47

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request introduces a collection of Kubernetes deployment files for various multi-modality models. While the additions are valuable for demonstration purposes, the review has identified several recurring issues across the new YAML files that should be addressed to improve correctness, security, and maintainability. Key concerns include the use of hostPath volumes which poses security risks, inconsistencies in naming conventions, inefficient container startup practices like installing packages at runtime, and general YAML formatting issues. Addressing these points will make the samples more robust and better examples of production-ready configurations.
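
The runtime-install concern can be pictured with the two container fragments below. Both are hypothetical: the image names, the package name, and `serve.py` are placeholders, since the review summary does not quote the offending lines.

```yaml
# Fragment 1 (pattern the review flags): packages are installed on every start,
# which slows startup and makes it depend on network access to a package index.
containers:
  - name: model-server                       # hypothetical
    image: registry.example.com/base:1.0     # hypothetical
    command: ["sh", "-c", "pip install some-package && exec python3 serve.py"]
---
# Fragment 2 (alternative): dependencies are baked into the image at build time,
# so the container starts the server directly.
containers:
  - name: model-server
    image: registry.example.com/model-server:1.0   # hypothetical prebuilt image
    command: ["python3", "serve.py"]
```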

Comment on lines +107 to +110
        - name: model-hostpath
          hostPath:
            path: /root/models
            type: DirectoryOrCreate

high

Using hostPath volumes introduces security risks and makes the deployment less portable: the pod only finds its models on nodes where /root/models has already been populated, and with type: DirectoryOrCreate a missing directory is created empty rather than reported as an error. It's highly recommended to use a PersistentVolumeClaim (PVC) for storing models. This decouples storage from the node's filesystem layout and is a more standard and secure practice in Kubernetes.

Example of using a PVC:
First, create a PersistentVolumeClaim:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: model-storage-pvc
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 50Gi # Adjust size as needed

Then, use it in your deployment:

      volumes:
        - name: model-hostpath
          persistentVolumeClaim:
            claimName: model-storage-pvc
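
The container side of the mount does not need to change as long as the volume name stays the same. A sketch of how the two pieces fit together, assuming a container named `model-server` and a mount path of `/models` (both hypothetical), and assuming the weights are already present on the volume:

```yaml
# Pod template fragment (sketch): the volumeMount refers to the volume by name,
# so swapping hostPath for a PVC leaves the container section untouched.
spec:
  containers:
    - name: model-server              # hypothetical container name
      volumeMounts:
        - name: model-hostpath        # must match the volume name below
          mountPath: /models          # hypothetical mount path
          readOnly: true              # only valid if the server never writes here
  volumes:
    - name: model-hostpath
      persistentVolumeClaim:
        claimName: model-storage-pvc
```

Note that ReadWriteOnce allows the volume to be mounted read-write by a single node, so replicas scheduled onto different nodes would need storage that supports ReadOnlyMany or ReadWriteMany.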

@happyandslow happyandslow force-pushed the lexu/test-multimodality-scenarios branch from c4eb796 to afbb91e Compare September 19, 2025 20:51
@happyandslow happyandslow force-pushed the lexu/test-multimodality-scenarios branch from 9cb845b to c34d3eb Compare September 25, 2025 20:11
Development

Successfully merging this pull request may close these issues.

[RFC]: Supporting multi-modality models in AIBrix