Add Docker Compose generator #1839
Reviewer's Guide
This PR introduces a new Docker Compose generator for the `ramalama serve` command.

Sequence diagram for generating docker-compose.yaml with --generate=compose

```mermaid
sequenceDiagram
    actor User
    participant Model
    participant Compose
    participant PlainFile
    User->>Model: ramalama serve --generate=compose
    Model->>Compose: compose(...)
    Compose->>Compose: generate()
    Compose->>PlainFile: genfile(name, content)
    Compose-->>Model: PlainFile (docker-compose.yaml)
    Model-->>User: docker-compose.yaml generated
```

Class diagram for the new Compose class and related changes

```mermaid
classDiagram
    class Compose {
        +__init__(model_name, model_paths, chat_template_paths, mmproj_paths, args, exec_args)
        +generate() PlainFile
        +_gen_volumes() str
        +_gen_model_volume() str
        +_gen_rag_volume() str
        +_gen_chat_template_volume() str
        +_gen_mmproj_volume() str
        +_gen_devices() str
        +_gen_ports() str
        +_gen_environment() str
        +_gen_gpu_deployment() str
        +_gen_command() str
        -src_model_path: str
        -dest_model_path: str
        -src_chat_template_path: str
        -dest_chat_template_path: str
        -src_mmproj_path: str
        -dest_mmproj_path: str
        -model_name: str
        -name: str
        -args
        -exec_args
        -image: str
    }
    class PlainFile {
        +content: str
    }
    Compose --> PlainFile : generates
    Compose --> "1" args : uses
    Compose --> "1" exec_args : uses
    Compose --> "1" model_name : uses
    Compose --> "1" image : uses
    Compose --> "1" PlainFile : returns
    class Model {
        +compose(model_paths, chat_template_paths, mmproj_paths, args, exec_args, output_dir)
    }
    Model --> Compose : creates
    class genfile {
        +genfile(name: str, content: str) PlainFile
    }
    Compose --> genfile : uses
```
Summary of Changes
Hello @abhibongale, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!
The pull request introduces a new feature to the ramalama serve command, allowing users to generate docker-compose.yaml files. This new functionality, accessed via the --generate=compose option, automates the creation of Docker Compose configurations for deploying models, including handling various aspects like volume mounts, port mappings, environment variables, and GPU resource allocation.
Highlights
- Docker Compose Generation: Implements a new Compose class and integrates it into the ramalama serve command to generate docker-compose.yaml files.
- Flexible Volume Management: Supports automatic generation of Docker Compose volumes for models, chat templates, and multi-modal projection files, ensuring read-only access.
- RAG Source Integration: Enhances RAG data source handling by supporting both local file paths and OCI image references, correctly translating them into Docker Compose volume configurations.
- Configurable Port Mapping: Provides robust port configuration, allowing users to specify custom host-to-container port mappings or defaulting to 8080:8080 if no port is provided.
- Environment Variable Customization: Enables users to inject custom environment variables into the generated Docker Compose service, in addition to automatically detected acceleration environment variables.
- GPU Deployment Support: Automatically adds a deploy section with NVIDIA GPU reservations to the Docker Compose file if the specified Docker image name contains GPU-related keywords (e.g., "cuda", "rocm", "gpu").
- Device Passthrough: Includes common device paths (/dev/dri, /dev/kfd, /dev/accel) for direct hardware access within the container, if these paths exist on the host.
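To make these highlights concrete, here is a rough, hypothetical illustration of the kind of service definition being described, written as the Python dict that would later be serialized to YAML. The model name, image, and paths below are placeholders, not output from the actual generator.

```python
# Hypothetical example only: model name, image, and paths are placeholders.
example_compose = {
    "services": {
        "tinyllama": {
            "container_name": "ramalama-tinyllama",   # assumed naming scheme
            "image": "quay.io/ramalama/cuda:latest",  # placeholder image name
            "volumes": [
                # model mounted read-only into the container
                "/host/models/tinyllama.gguf:/mnt/models/model.file:ro",
            ],
            "ports": ["8080:8080"],             # default when no --port is given
            "devices": ["/dev/dri:/dev/dri"],   # only added if present on the host
            "deploy": {                          # added when the image name contains a GPU keyword
                "resources": {
                    "reservations": {
                        "devices": [
                            {"driver": "nvidia", "count": "all", "capabilities": ["gpu"]}
                        ]
                    }
                }
            },
        }
    }
}
```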
Using Gemini Code Assist
The full guide for Gemini Code Assist can be found on our documentation page; here are some quick tips.
Invoking Gemini
You can request assistance from Gemini at any point in your pull request by creating an issue comment (i.e. a comment on the pull request page) using either `/gemini <command>` or `@gemini-code-assist <command>`. Below is a summary of the supported commands.
| Feature | Command | Description |
|---|---|---|
| Code Review | `/gemini review` | Performs a code review for the current pull request in its current state. |
| Pull Request Summary | `/gemini summary` | Provides a summary of the current pull request in its current state. |
| Comment | `@gemini-code-assist` | Responds in comments when explicitly tagged, both in issue comments and review comments. |
| Help | `/gemini help` | Displays a list of available commands. |
Customization
To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a `.gemini/` folder in the base of the repository. Detailed instructions can be found here.
Limitations & Feedback
Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counterproductive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here.
You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.
Footnotes
1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.
Code Review
This pull request introduces a Docker Compose generator, which is a great addition for users who prefer docker-compose for managing their services. The implementation is solid and covers many use cases, as demonstrated by the comprehensive test suite.
I've identified a couple of areas for improvement (see the sketch after this comment):
- The port parsing logic in `_gen_ports` can be made more robust to handle complex specifications including IP addresses.
- The GPU deployment logic in `_gen_gpu_deployment` incorrectly generates NVIDIA-specific configurations for other GPU types like ROCm.
I've left specific suggestions on the relevant lines of code. Additionally, I've suggested adding a test case for ROCm images to prevent similar issues in the future.
Overall, this is a valuable feature, and with these minor adjustments, it will be even better.
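Hedged sketches of the two fixes suggested above, under the assumption that the port option arrives as a plain string and that the service definition is a dict; the function names and exact compose keys are illustrative, not the PR's final code.

```python
import os


def parse_port_spec(spec: str) -> str:
    """Accept "PORT", "HOST:CONTAINER", or "IP:HOST:CONTAINER" port specs."""
    if ":" not in spec:
        # A bare port is mapped onto itself, e.g. "8080" -> "8080:8080".
        return f"{spec}:{spec}"
    # Multi-part specs (including an optional leading IP) are already valid
    # docker-compose port mappings, so pass them through unchanged.
    return spec


def add_gpu_deployment(image: str, service: dict) -> None:
    """Pick a GPU strategy based on the image name instead of assuming NVIDIA."""
    name = image.lower()
    if "rocm" in name:
        # ROCm containers typically need the KFD and DRI device nodes rather
        # than an NVIDIA driver reservation.
        service.setdefault("devices", []).extend(
            f"{d}:{d}" for d in ("/dev/kfd", "/dev/dri") if os.path.exists(d)
        )
    elif "cuda" in name or "gpu" in name:
        service["deploy"] = {
            "resources": {
                "reservations": {
                    "devices": [
                        {"driver": "nvidia", "count": "all", "capabilities": ["gpu"]}
                    ]
                }
            }
        }
```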
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey there - I've reviewed your changes and they look great!
Prompt for AI Agents
Please address the comments from this code review:
## Individual Comments
### Comment 1
<location> `ramalama/compose.py:116` </location>
<code_context>
+    def _gen_environment(self) -> str:
+        env_vars = get_accel_env_vars()
+        # Allow user to override with --env
+        if getattr(self.args, "env", None):
+            for e in self.args.env:
+                key, val = e.split("=", 1)
+                env_vars[key] = val
+
+        if not env_vars:
</code_context>
<issue_to_address>
Environment variable parsing may fail if '=' is missing in user input.
Currently, entries without '=' will cause a ValueError. Please add validation to handle such cases, either by skipping them or providing a clear error message.
</issue_to_address>
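A minimal sketch of the validation being asked for, assuming the entries come from a repeated --env flag as shown in the diff above; the helper name and the choice to raise (rather than silently skip) are assumptions.

```python
def merge_env_overrides(entries, base_env):
    """Merge KEY=VALUE overrides into base_env, rejecting malformed entries."""
    env = dict(base_env)
    for entry in entries or []:
        key, sep, value = entry.partition("=")
        if not sep or not key:
            raise ValueError(f"invalid --env entry {entry!r}, expected KEY=VALUE")
        env[key] = value
    return env
```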
### Comment 2
<location> `ramalama/compose.py:47` </location>
<code_context>
+        volumes += self._gen_rag_volume()
+
+        # Chat Template Volume
+        if self.src_chat_template_path and os.path.exists(self.src_chat_template_path):
+            volumes += self._gen_chat_template_volume()
+
</code_context>
<issue_to_address>
os.path.exists checks may cause issues in environments where files are not present.
Relying on os.path.exists ties the Compose file generation to the current filesystem, which may cause issues if files are expected to exist later. Consider if this check should be moved or removed.
</issue_to_address>
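One possible direction, purely as an illustration (the helper name and the warning behavior are assumptions, not the PR's approach): make the existence check advisory so the compose file can still be generated when the mounted file is only expected to exist at deploy time.

```python
import os


def bind_mount(src: str, dest: str, warnings: list) -> list:
    """Return a read-only bind-mount entry; warn instead of silently skipping."""
    if not src:
        return []
    if not os.path.exists(src):
        warnings.append(f"{src} does not exist yet; adding the mount anyway")
    return [f"{src}:{dest}:ro"]
```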
### Comment 3
<location> `ramalama/compose.py:12` </location>
<code_context>
+from ramalama.version import version
+
+
+class Compose:
+    def __init__(
+        self,
</code_context>
<issue_to_address>
Consider replacing manual string construction with a dict-based approach and PyYAML serialization to simplify the code.
Here’s a sketch of how you can switch from hand-rolled f-strings to building a Python dict + PyYAML, which collapses most of the tiny helpers into a single `dict` builder. You’ll keep 100% of your logic but cut ~200 lines of string munging:
1) add `import yaml`
2) replace all `_gen_*()` + `f"""…"""` with something like:
```python
def _build_service(self) -> dict:
    svc = {
        "container_name": self.name,
        "image": self.image,
        "restart": "unless-stopped",
    }
    # volumes
    vols = [f"{self.src_model_path}:{self.dest_model_path}:ro"]
    if getattr(self.args, "rag", None):
        rag = self.args.rag.removeprefix("oci://").removeprefix("oci:")
        if self.args.rag.startswith("oci"):
            vols.append({
                "type": "image",
                "source": rag,
                "target": RAG_DIR,
                "image": {"readonly": True},
            })
        elif os.path.exists(self.args.rag):
            vols.append(f"{self.args.rag}:{RAG_DIR}:ro")
    if self.src_chat_template_path and os.path.exists(self.src_chat_template_path):
        vols.append(f"{self.src_chat_template_path}:{self.dest_chat_template_path}:ro")
    if self.src_mmproj_path and os.path.exists(self.src_mmproj_path):
        vols.append(f"{self.src_mmproj_path}:{self.dest_mmproj_path}:ro")
    svc["volumes"] = vols
    # ports
    port = getattr(self.args, "port", "8080:8080")
    host, _, cont = port.partition(":")
    svc["ports"] = [f"{cont or host}:{host}"]
    # environment
    env = get_accel_env_vars()
    for e in getattr(self.args, "env", []) or []:
        k, v = e.split("=", 1)
        env[k] = v
    if env:
        svc["environment"] = env
    # devices
    devs = [d for d in ("/dev/dri", "/dev/kfd", "/dev/accel") if os.path.exists(d)]
    if devs:
        svc["devices"] = [f"{d}:{d}" for d in devs]
    # GPU deployment
    if any(x in self.image.lower() for x in ("cuda", "rocm", "gpu")):
        svc.setdefault("deploy", {}) \
            .setdefault("resources", {}) \
            .setdefault("reservations", {})["devices"] = [
                {"driver": "nvidia", "count": "all", "capabilities": ["gpu"]}
            ]
    # command
    if self.exec_args:
        svc["command"] = self.exec_args
    return svc
```
3) and then in `generate`:
```python
def generate(self) -> PlainFile:
    compose = {
        "version": "3.8",
        "services": {self.model_name: self._build_service()},
    }
    content = yaml.safe_dump(compose, sort_keys=False)
    file = PlainFile("docker-compose.yaml")
    file.content = content
    return file
```
This removes all the per-section string concatenation and line-filtering, while preserving your exact mounts, env, ports, devices and commands.
</issue_to_address>
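If the dict-plus-PyYAML route above is taken, here is a quick standalone check of the serialization behavior it relies on; this is a generic example, not the actual Compose fields, and sort_keys=False simply preserves insertion order.

```python
import yaml  # requires PyYAML

compose = {
    "services": {
        "demo": {
            "image": "example/image:latest",
            "ports": ["8080:8080"],
        }
    }
}
print(yaml.safe_dump(compose, sort_keys=False))
# services:
#   demo:
#     image: example/image:latest
#     ports:
#     - 8080:8080
```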
Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
LGTM, although I have no way of testing this with actual docker-compose. Will merge if tests pass.
I would love to see this functionality exercised in the bats tests.
/ok-to-test
Hey @mikebonnet and @rhatdan, I was figuring out why the ci/macos job is failing consistently. So I ran the
I have a couple of questions:
Thank you
Please rebase your PR and resubmit: `git pull origin main`. I think the macOS issue was fixed in a different PR.
This commit introduces the `--generate=compose` option to the `ramalama serve` command, enabling users to generate a `docker-compose.yaml` file for a given model.

sourcery-ai suggested changes:
1. Add test_genfile_empty_content. This will verify that genfile handles empty input gracefully and avoids unexpected behavior.
2. Support the 'oci://' prefix for the RAG source for consistency.
3. Update the parsing logic to support complex port formats, such as those with IP addresses or protocols, to ensure compatibility with all valid Docker Compose specifications.
4. The current check excludes images for other GPU backends like ROCm or custom builds; update it to support a wider range of GPU-enabled images.

fixes containers#184

Co-authored-by: sourcery-ai[bot] <58596630+sourcery-ai[bot]@users.noreply.github.com>
Signed-off-by: abhibongale <[email protected]>
LGTM
I probably should not have merged this, since I now see there is no way to discover how to use it: it does not show up in `ramalama serve --help`, and it is not documented within the man pages. The man pages should document how to use it.
@rhatdan I will create an issue for the docs. @mikebonnet, regarding the idea of having it in a bats test, I will work on it in the future. Thank you
During the containers#1839 review we decided that having bats tests would be very helpful. This PR adds docker-compose bats tests. It also fixes the issue of generating the compose file with a static name.
Fixes: containers#1873
Signed-off-by: Abhishek Bongale <[email protected]>
This commit introduces the `--generate=compose` option to the `ramalama serve` command, enabling users to generate a `docker-compose.yaml` file for a given model.

sourcery-ai suggested changes:
1. Add test_genfile_empty_content. This will verify that genfile handles empty input gracefully and avoids unexpected behavior (a sketch of such a test follows below).
2. Support the 'oci://' prefix for the RAG source for consistency.
3. Update the parsing logic to support complex port formats, such as those with IP addresses or protocols, to ensure compatibility with all valid Docker Compose specifications.
4. The current check excludes images for other GPU backends like ROCm or custom builds; update it to support a wider range of GPU-enabled images.

fixes #184
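For the first suggestion, a hypothetical shape such a test could take, assuming genfile(name, content) returns a PlainFile whose content attribute holds the string (as in the class diagram above); this is an illustration, not the test added in the PR.

```python
def test_genfile_empty_content():
    # Empty content should produce a file object with an empty string, not an error.
    file = genfile("docker-compose.yaml", "")
    assert file.content == ""
```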
Summary by Sourcery

Introduce a Docker Compose generator for ramalama by adding a `--generate=compose` option to `ramalama serve` and implementing a Compose class that emits a complete docker-compose.yaml, with support for volumes, ports, environment variables, device mounts, GPU deployments, and commands.

New Features:
- `--generate=compose` option to `ramalama serve`

Bug Fixes:

Enhancements:
- Support `oci://` and `oci:` prefixes for RAG source volumes

Tests: