
Conversation

dvrogozh
Contributor

@dvrogozh dvrogozh commented Sep 18, 2025

  • Drop apt-get loops as runners are now isolated
  • Add missing pciutils to accelerate workload
  • Switch to uv to install the stack (it's faster)
  • Group pip install commands together
  • Limit dependency installation to a pinned date (with --exclude-newer)
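
The grouped, date-pinned install described above can be sketched roughly as follows. This is only an illustration, not the workflow's actual step: the helper name `build_uv_install_cmd`, the package list, and the date `2025-09-18` are assumptions made for the example. The only parts taken from the description are the use of uv and its `--exclude-newer` option.

```python
import shlex


def build_uv_install_cmd(packages, exclude_newer):
    """Return one grouped `uv pip install` command as an argv list.

    `exclude_newer` is passed to uv's --exclude-newer option, which ignores
    package versions published after the given date.
    """
    if not packages:
        raise ValueError("at least one package is required")
    return ["uv", "pip", "install", "--exclude-newer", exclude_newer, *packages]


# Hypothetical package set and pinned date, chosen only for illustration.
cmd = build_uv_install_cmd(
    ["accelerate", "transformers", "pytest-xdist"], "2025-09-18"
)
print(shlex.join(cmd))
```

Issuing a single install command for the whole set lets the resolver see all requirements at once, and the pinned date keeps later package releases from changing the result between runs.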

test accelerate and transformers only
disable_build
disable_ut
disable_e2e
disable_distributed

@dvrogozh dvrogozh force-pushed the ci branch 2 times, most recently from b9d8b96 to 2aa85e8 on September 18, 2025 23:14
@dvrogozh dvrogozh changed the title from "ci: improve and simplify accelerate and transformers workloads" to "[DO NOT MERGE] ci: improve and simplify accelerate and transformers workloads" on Sep 18, 2025
@mengfei25
Contributor

@dvrogozh The permission issue occurs because the build ran as root, which is why I wanted to clean up the workspace beforehand in my last PR.
Why the build uses root:

  1. It's the default user for the pytorch CD image
  2. We need to install some deps, but there is no sudo in the image

I have cleaned them up manually, so a retrigger should pass.
I also submitted a PR to clean them up after each run: #2088

@dvrogozh
Contributor Author

And I also submit a PR to cleanup them after each run #2088

Thank you, @mengfei25. I'll rebase and keep an eye out for any other issues.

@dvrogozh dvrogozh force-pushed the ci branch 2 times, most recently from 7b19aa4 to 00e6ad4 on September 23, 2025 00:24
@dvrogozh dvrogozh changed the title from "[DO NOT MERGE] ci: improve and simplify accelerate and transformers workloads" to "ci: improve and simplify accelerate and transformers workloads" on Sep 24, 2025
@dvrogozh dvrogozh force-pushed the ci branch 3 times, most recently from cbdea1e to 74f23a7 on September 26, 2025 15:55
* Fix torch package version checks
* Use `--dist loadfile` strategy in alignment with HF practice
* Group together `pip install` commands
* Add missing `pciutils` to accelerate workload
* Drop `apt-get` loops as runners are now isolated
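
As a rough illustration of the `--dist loadfile` strategy listed above: pytest-xdist groups tests by their containing file, so all tests from one file run on the same worker. The sketch below is not pytest-xdist's implementation. The function names and test IDs are hypothetical, and the round-robin assignment is a simplification (the real scheduler hands out file groups to workers dynamically).

```python
from collections import defaultdict


def group_by_file(node_ids):
    """Group pytest node IDs ("path::test_name") by their file path."""
    groups = defaultdict(list)
    for node_id in node_ids:
        file_path = node_id.split("::", 1)[0]
        groups[file_path].append(node_id)
    return dict(groups)


def assign_round_robin(groups, num_workers):
    """Give each whole file group to one worker (simplified scheduling)."""
    workers = [[] for _ in range(num_workers)]
    for index, tests in enumerate(groups.values()):
        workers[index % num_workers].extend(tests)
    return workers


# Hypothetical test IDs, used only for illustration.
node_ids = [
    "tests/test_a.py::test_one",
    "tests/test_a.py::test_two",
    "tests/test_b.py::test_three",
]
groups = group_by_file(node_ids)
for worker_index, tests in enumerate(assign_round_robin(groups, 2)):
    print(worker_index, tests)
```

Keeping a file's tests on one worker means module-level fixtures and shared state in that file are not split across processes.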

test accelerate and transformers only
disable_build
disable_ut
disable_e2e
disable_distributed

Signed-off-by: Dmitry Rogozhkin <[email protected]>