Conversation

abhibongale
Contributor

@abhibongale abhibongale commented Aug 14, 2025

This commit introduces the --generate=compose option to the ramalama serve command, enabling users
to generate a docker-compose.yaml file for a given model.

sourcery-ai suggested changes:

  1. Add test_genfile_empty_content to verify that genfile handles empty input gracefully and avoids unexpected behavior.

  2. Add a supporting condition to handle the 'oci://' prefix for RAG sources, for consistency.

  3. Update the parsing logic to support complex port formats, such as those with IP addresses or protocols, to ensure compatibility with all valid Docker Compose specifications.

  4. The current check excludes images for other GPU backends like ROCm or custom builds; update it to support a wider range of GPU-enabled images.

fixes #184

Summary by Sourcery

Introduce a Docker Compose generator for ramalama by adding a --generate=compose option to ramalama serve and implementing a Compose class that emits a complete docker-compose.yaml, with support for volumes, ports, environment variables, device mounts, GPU deployments, and commands.

New Features:

  • Add --generate=compose option to ramalama serve
  • Implement Compose class to generate docker-compose.yaml output

Bug Fixes:

  • Ensure genfile handles empty content without errors

Enhancements:

  • Support both oci:// and oci: prefixes for RAG source volumes
  • Improve port parsing to accept host:container mappings and complex formats
  • Extend GPU image detection to recognize CUDA, ROCm, and generic GPU backends

Tests:

  • Add unit tests covering various Compose generation scenarios including ports, RAG sources, custom names, environment variables, devices, and empty content
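The empty-content case called out above could be exercised along these lines. Note this is an illustrative sketch: the `PlainFile` and `genfile` definitions below are simplified stand-ins, not the PR's actual implementations in `ramalama/compose.py`.

```python
# Simplified stand-ins for ramalama's PlainFile and genfile, for illustration only.
class PlainFile:
    def __init__(self, filename: str):
        self.filename = filename
        self.content = ""


def genfile(name: str, content: str) -> PlainFile:
    # Return a file object even for empty content, rather than raising.
    file = PlainFile(name)
    file.content = content
    return file


def test_genfile_empty_content():
    # Empty content should still produce a valid, writable file object.
    file = genfile("docker-compose.yaml", "")
    assert file.filename == "docker-compose.yaml"
    assert file.content == ""
```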

Contributor

sourcery-ai bot commented Aug 14, 2025

Reviewer's Guide

This PR introduces a new Docker Compose generator for ramalama serve by adding a --generate=compose option and implementing a Compose class to assemble and emit a complete docker-compose.yaml, with support for OCI prefixes, complex port formats, broad GPU detection, and comprehensive unit tests.

Sequence diagram for generating docker-compose.yaml with --generate=compose

```mermaid
sequenceDiagram
    actor User
    participant Model
    participant Compose
    participant PlainFile
    User->>Model: ramalama serve --generate=compose
    Model->>Compose: compose(...)
    Compose->>Compose: generate()
    Compose->>PlainFile: genfile(name, content)
    Compose-->>Model: PlainFile (docker-compose.yaml)
    Model-->>User: docker-compose.yaml generated
```

Class diagram for the new Compose class and related changes

```mermaid
classDiagram
    class Compose {
        +__init__(model_name, model_paths, chat_template_paths, mmproj_paths, args, exec_args)
        +generate() PlainFile
        +_gen_volumes() str
        +_gen_model_volume() str
        +_gen_rag_volume() str
        +_gen_chat_template_volume() str
        +_gen_mmproj_volume() str
        +_gen_devices() str
        +_gen_ports() str
        +_gen_environment() str
        +_gen_gpu_deployment() str
        +_gen_command() str
        -src_model_path: str
        -dest_model_path: str
        -src_chat_template_path: str
        -dest_chat_template_path: str
        -src_mmproj_path: str
        -dest_mmproj_path: str
        -model_name: str
        -name: str
        -args
        -exec_args
        -image: str
    }
    class PlainFile {
        +content: str
    }
    Compose --> PlainFile : generates
    Compose --> "1" args : uses
    Compose --> "1" exec_args : uses
    Compose --> "1" model_name : uses
    Compose --> "1" image : uses
    Compose --> "1" PlainFile : returns
    class Model {
        +compose(model_paths, chat_template_paths, mmproj_paths, args, exec_args, output_dir)
    }
    Model --> Compose : creates
    class genfile {
        +genfile(name: str, content: str) PlainFile
    }
    Compose --> genfile : uses
```

File-Level Changes

Change | Details | Files
Integrate "compose" generation into the serve command
  • Added a branch for gen_type=="compose" in generate_container_config
  • Implemented a compose() wrapper that invokes Compose.generate().write
  • Wired new compose path alongside existing container and kube flows
ramalama/model.py
Implement Compose class and genfile helper
  • Added a new Compose class with methods to build volumes, ports, environment, devices, GPU deployment, and command sections
  • Implemented generate() to assemble the YAML content and invoke genfile
  • Created genfile helper to print generation message and return a PlainFile
ramalama/compose.py
Enhance RAG source handling and port parsing
  • Strip both "oci://" and "oci:" prefixes when mounting image-based volumes
  • Support long-form image volume syntax for OCI sources
  • Parse port args with split limit to correctly map host and container ports
ramalama/compose.py
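The port-parsing improvement described above could look roughly like the following sketch. `parse_port` is a hypothetical helper written for illustration, not the PR's actual function; it normalizes the short-form port specs Compose accepts (`CONTAINER`, `HOST:CONTAINER`, `IP:HOST:CONTAINER`, with an optional `/protocol` suffix).

```python
def parse_port(spec: str) -> str:
    """Normalize a port spec into Compose 'HOST:CONTAINER[/protocol]' form."""
    proto = ""
    if "/" in spec:  # e.g. "8080/udp"
        spec, proto = spec.split("/", 1)
        proto = "/" + proto
    parts = spec.split(":")
    if len(parts) == 1:
        # "8080" -> publish on the same host port
        host_ip, host, container = "", parts[0], parts[0]
    elif len(parts) == 2:
        # "9090:8080" -> host:container
        host_ip, host, container = "", parts[0], parts[1]
    else:
        # "127.0.0.1:9090:8080" -> ip:host:container
        host_ip, host, container = parts[0], parts[1], parts[2]
    prefix = f"{host_ip}:" if host_ip else ""
    return f"{prefix}{host}:{container}{proto}"
```

Splitting with a bounded number of fields (rather than a bare `split(":")[0]`) is what lets IP-prefixed and protocol-suffixed specs survive intact.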
Broaden GPU image detection
  • Extended gpu_keywords to include 'cuda', 'rocm', and 'gpu'
  • Switched to case-insensitive matching against image name to trigger GPU deploy resources
ramalama/compose.py
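A sketch of what broader, case-insensitive GPU detection might look like. This is illustrative only: `gpu_deploy_section` and its keyword list are assumptions, not the PR's code, and it follows the reviewers' point that NVIDIA-style `deploy` reservations are not appropriate for ROCm images, which instead need device passthrough.

```python
# Hypothetical keyword list; the PR's actual list is 'cuda', 'rocm', 'gpu'.
GPU_KEYWORDS = ("cuda", "rocm", "gpu")


def gpu_deploy_section(image: str) -> dict:
    """Return the Compose snippet (as a dict) appropriate for a GPU image."""
    name = image.lower()  # case-insensitive match against the image name
    if "rocm" in name:
        # ROCm has no Compose device-reservation driver; pass devices through.
        return {"devices": ["/dev/kfd:/dev/kfd", "/dev/dri:/dev/dri"]}
    if any(k in name for k in ("cuda", "gpu")):
        # NVIDIA-style reservation understood by the Compose 'deploy' key.
        return {"deploy": {"resources": {"reservations": {"devices": [
            {"driver": "nvidia", "count": "all", "capabilities": ["gpu"]}
        ]}}}}
    return {}
```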
Add comprehensive unit tests and fixtures
  • Created parametrized tests covering various compose scenarios (basic, ports, RAG, templates, mmproj, GPU, env vars, devices)
  • Added standalone genfile tests including empty content case
  • Included YAML fixture files under test/unit/data/test_compose for expected outputs
test/unit/test_compose.py
test/unit/data/test_compose/*

Assessment against linked issues

#184: Implement the 'ramalama serve --generate compose MODEL' command to generate a docker-compose file for running an AI Model Service.

Contributor

@gemini-code-assist gemini-code-assist bot left a comment


Summary of Changes

Hello @abhibongale, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

The pull request introduces a new feature to the ramalama serve command, allowing users to generate docker-compose.yaml files. This new functionality, accessed via the --generate=compose option, automates the creation of Docker Compose configurations for deploying models, including handling various aspects like volume mounts, port mappings, environment variables, and GPU resource allocation.

Highlights

  • Docker Compose Generation: Implements a new Compose class and integrates it into the ramalama serve command to generate docker-compose.yaml files.
  • Flexible Volume Management: Supports automatic generation of Docker Compose volumes for models, chat templates, and multi-modal projection files, ensuring read-only access.
  • RAG Source Integration: Enhances RAG data source handling by supporting both local file paths and OCI image references, correctly translating them into Docker Compose volume configurations.
  • Configurable Port Mapping: Provides robust port configuration, allowing users to specify custom host-to-container port mappings or defaulting to 8080:8080 if no port is provided.
  • Environment Variable Customization: Enables users to inject custom environment variables into the generated Docker Compose service, in addition to automatically detected acceleration environment variables.
  • GPU Deployment Support: Automatically adds deploy section with nvidia GPU reservations to the Docker Compose file if the specified Docker image name contains GPU-related keywords (e.g., "cuda", "rocm", "gpu").
  • Device Passthrough: Includes common device paths (/dev/dri, /dev/kfd, /dev/accel) for direct hardware access within the container, if these paths exist on the host.
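For orientation, a file generated from these highlights might look roughly like the following. This is a sketch assembled from the bullets above; the model name, paths, and image are placeholder assumptions, not the PR's verbatim output.

```yaml
services:
  tinyllama:
    container_name: ramalama-tinyllama        # custom name supported via args
    image: quay.io/ramalama/cuda:latest       # placeholder GPU-enabled image
    restart: unless-stopped
    volumes:
      - /path/to/model.gguf:/mnt/models/model.file:ro   # read-only model mount
    ports:
      - "8080:8080"                           # default when no port is given
    devices:
      - /dev/dri:/dev/dri                     # included only if present on host
    deploy:                                   # added when image name matches a GPU keyword
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: all
              capabilities: [gpu]
```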

Contributor

@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request introduces a Docker Compose generator, which is a great addition for users who prefer docker-compose for managing their services. The implementation is solid and covers many use cases, as demonstrated by the comprehensive test suite.

I've identified a couple of areas for improvement:

  1. The port parsing logic in _gen_ports can be made more robust to handle complex specifications including IP addresses.
  2. The GPU deployment logic in _gen_gpu_deployment incorrectly generates NVIDIA-specific configurations for other GPU types like ROCm.

I've left specific suggestions on the relevant lines of code. Additionally, I've suggested adding a test case for ROCm images to prevent similar issues in the future.

Overall, this is a valuable feature, and with these minor adjustments, it will be even better.

Contributor

@sourcery-ai sourcery-ai bot left a comment


Hey there - I've reviewed your changes and they look great!

Prompt for AI Agents
Please address the comments from this code review:
## Individual Comments

### Comment 1
<location> `ramalama/compose.py:116` </location>
<code_context>
+    def _gen_environment(self) -> str:
+        env_vars = get_accel_env_vars()
+        # Allow user to override with --env
+        if getattr(self.args, "env", None):
+            for e in self.args.env:
+                key, val = e.split("=", 1)
+                env_vars[key] = val
+
+        if not env_vars:
</code_context>

<issue_to_address>
Environment variable parsing may fail if '=' is missing in user input.

Currently, entries without '=' will cause a ValueError. Please add validation to handle such cases, either by skipping them or providing a clear error message.
</issue_to_address>
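A defensive parse along the lines this comment asks for could look like the sketch below. `parse_env` is a hypothetical helper, not the PR's code; it uses `str.partition` so that values containing `=` survive, and raises a clear error for malformed entries instead of an unhandled ValueError from unpacking.

```python
def parse_env(entries: list[str]) -> dict[str, str]:
    """Parse KEY=VALUE pairs, rejecting malformed entries with a clear error."""
    env: dict[str, str] = {}
    for entry in entries:
        key, sep, val = entry.partition("=")
        if not sep or not key:
            # No '=' at all, or an empty key like '=value'
            raise ValueError(f"invalid --env entry (expected KEY=VALUE): {entry!r}")
        env[key] = val  # values may themselves contain '='
    return env
```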

### Comment 2
<location> `ramalama/compose.py:47` </location>
<code_context>
+            volumes += self._gen_rag_volume()
+
+        # Chat Template Volume
+        if self.src_chat_template_path and os.path.exists(self.src_chat_template_path):
+            volumes += self._gen_chat_template_volume()
+
</code_context>

<issue_to_address>
os.path.exists checks may cause issues in environments where files are not present.

Relying on os.path.exists ties the Compose file generation to the current filesystem, which may cause issues if files are expected to exist later. Consider if this check should be moved or removed.
</issue_to_address>

### Comment 3
<location> `ramalama/compose.py:12` </location>
<code_context>
+from ramalama.version import version
+
+
+class Compose:
+    def __init__(
+        self,
</code_context>

<issue_to_address>
Consider replacing manual string construction with a dict-based approach and PyYAML serialization to simplify the code.

Here’s a sketch of how you can switch from hand-rolled f-strings to building a Python dict + PyYAML, which collapses most of the tiny helpers into a single `dict` builder.  You’ll keep 100% of your logic but cut ~200 lines of string munging:

1) add  `import yaml`  
2) replace all `_gen_*()` + `f"""…"""` with something like:

```python
def _build_service(self) -> dict:
    svc = {
        "container_name": self.name,
        "image": self.image,
        "restart": "unless-stopped",
    }

    # volumes
    vols = [f"{self.src_model_path}:{self.dest_model_path}:ro"]
    if getattr(self.args, "rag", None):
        rag = self.args.rag.removeprefix("oci://").removeprefix("oci:")
        if self.args.rag.startswith("oci"):
            vols.append({
                "type": "image",
                "source": rag,
                "target": RAG_DIR,
                "image": {"readonly": True},
            })
        elif os.path.exists(self.args.rag):
            vols.append(f"{self.args.rag}:{RAG_DIR}:ro")
    if self.src_chat_template_path and os.path.exists(self.src_chat_template_path):
        vols.append(f"{self.src_chat_template_path}:{self.dest_chat_template_path}:ro")
    if self.src_mmproj_path and os.path.exists(self.src_mmproj_path):
        vols.append(f"{self.src_mmproj_path}:{self.dest_mmproj_path}:ro")
    svc["volumes"] = vols

    # ports
    port = getattr(self.args, "port", "8080:8080")
    host, _, cont = port.partition(":")
    svc["ports"] = [f"{host}:{cont or host}"]  # Compose expects HOST:CONTAINER

    # environment
    env = get_accel_env_vars()
    for e in getattr(self.args, "env", []) or []:
        k, v = e.split("=", 1)
        env[k] = v
    if env:
        svc["environment"] = env

    # devices
    devs = [d for d in ("/dev/dri","/dev/kfd","/dev/accel") if os.path.exists(d)]
    if devs:
        svc["devices"] = [f"{d}:{d}" for d in devs]

    # GPU deployment
    if any(x in self.image.lower() for x in ("cuda","rocm","gpu")):
        svc.setdefault("deploy", {}) \
           .setdefault("resources", {}) \
           .setdefault("reservations", {})["devices"] = [
               {"driver":"nvidia","count":"all","capabilities":["gpu"]}
           ]

    # command
    if self.exec_args:
        svc["command"] = self.exec_args

    return svc
```

3) and then in `generate`:

```python
def generate(self) -> PlainFile:
    compose = {
        "version": "3.8",
        "services": { self.model_name: self._build_service() }
    }
    content = yaml.safe_dump(compose, sort_keys=False)
    file = PlainFile("docker-compose.yaml")
    file.content = content
    return file
```

This removes all the per-section string concatenation and line-filtering, while preserving your exact mounts, env, ports, devices and commands.
</issue_to_address>


@rhatdan
Member

rhatdan commented Aug 14, 2025

LGTM, although I have no way of testing this with actual docker-compose.

Will merge if tests pass.

@mikebonnet
Collaborator

I would love to see this functionality exercised in the bats tests.

@mikebonnet
Collaborator

/ok-to-test

@abhibongale
Contributor Author

Hey @mikebonnet and @rhatdan,

I was trying to figure out why the ci/macos job is failing consistently.

I ran bats test/system/050-pull.bats locally on my Fedora machine, and the following is the output:

✓ [050] ramalama pull no model
 ✗ [050] ramalama pull ollama
   tags: distro-integration
   (from function `run_ramalama' in file test/system/helpers.bash, line 179,
    in test file test/system/050-pull.bats, line 15)
     `run_ramalama pull tiny' failed
   
   [14:43:00.566565060] $ ramalama pull tiny
   [14:53:00.575528423] Downloading ollama://library/tinyllama:latest ...
   Trying to pull ollama://library/tinyllama:latest ...
   timeout: sending signal TERM to command ‘ramalama’
   [14:53:00.578643521] [ rc=124 (** EXPECTED 0 **) ]
   *** TIMED OUT ***
   # [teardown]
 - [050] ramalama pull ollama cache (skipped: Not supported without ollama)
 ✓ [050] ramalama pull huggingface
 ✓ [050] ramalama pull huggingface tag multiple references
 ✓ [050] ramalama pull huggingface-cli cache
 - [050] ramalama pull oci (skipped: Waiting for podman artiface support)
 ✓ [050] ramalama URL
 ✓ [050] ramalama file URL
 ✓ [050] ramalama use registry

10 tests, 1 failure, 2 skipped

@abhibongale
Contributor Author

abhibongale commented Aug 18, 2025

I have a couple of questions:

  • Do you think it's a problem with the ramalama pull logic? Should I add retry logic to test/system/050-pull.bats or its helper file?
  • Am I looking at the right thing, or should I check my changes again (I mean this Docker Compose generator PR)?

Thank you

@rhatdan
Member

rhatdan commented Aug 18, 2025

Please rebase your PR and resubmit

git pull origin main
git rebase -i origin
git push --force

I think the MAC issue was fixed in a different PR.

This commit introduces the `--generate=compose` option to the
`ramalama serve` command, enabling users
to generate a `docker-compose.yaml` file for a given model.

sourcery-ai suggested changes:
1. Add test_genfile_empty_content to verify that genfile handles empty
input gracefully and avoids unexpected behavior.

2. Add a supporting condition to handle the 'oci://' prefix for RAG
sources, for consistency.

3. Update the parsing logic to support complex port formats, such as
those with IP addresses or protocols, to ensure compatibility with all
valid Docker Compose specifications.

4. The current check excludes images for other GPU backends like ROCm
or custom builds; update it to support a wider range of GPU-enabled
images.

fixes containers#184

Co-authored-by: sourcery-ai[bot] <58596630+sourcery-ai[bot]@users.noreply.github.com>
Signed-off-by: abhibongale <[email protected]>
@rhatdan
Member

rhatdan commented Aug 26, 2025

LGTM

@rhatdan rhatdan merged commit d677908 into containers:main Aug 26, 2025
56 of 61 checks passed
@rhatdan
Member

rhatdan commented Aug 26, 2025

I probably should not have merged this, since I now see there is no way to discover how to use it, and it is not documented in the man pages.

ramalama serve --help should reference compose as being supported.

And the man pages should document how to use it.

@abhibongale
Contributor Author

@rhatdan I will create an issue for the docs and for @mikebonnet's idea of having bats tests, and will work on them in the future. Thank you

abhibongale added a commit to abhibongale/ramalama that referenced this pull request Sep 16, 2025
During containers#1839 review we decided that having bats test will be very
helpful. This PR adds docker-compose bats tests.

Also fixes the issue of generating compose file with static name.

Fixes: containers#1873
Signed-off-by: Abhishek Bongale <[email protected]>
ieaves pushed a commit to ramalama-labs/ramalama that referenced this pull request Sep 18, 2025
During containers#1839 review we decided that having bats test will be very
helpful. This PR adds docker-compose bats tests.

Also fixes the issue of generating compose file with static name.

Fixes: containers#1873
Signed-off-by: Abhishek Bongale <[email protected]>
Linked issue: Add ramalama serve --generate compose MODEL which would generate a docker-compose file for running AI Model Service.