Skip to content

Conversation

clumsy
Copy link
Contributor

@clumsy clumsy commented Jun 30, 2025

Fixes #1085

Test plan:
[x] Unit tests added

[x] Tested locally, e.g.:

[named_resources]
my_device={"cpu": 100, "gpu": 0, "memMB": 819200, "devices": {"vpc.amazonaws.com/efa": 1}}

[component:utils.sh]
h=my_device


Running
torchx run -s local_cwd utils.sh cal

Gives:

torchx 2025-06-29 22:29:34 INFO     loaded configs from /private/tmp/dynamic_device/.torchxconfig
torchx 2025-06-29 22:29:35 INFO     Tracker configurations: {}
torchx 2025-06-29 22:29:35 INFO     Log directory not set in scheduler cfg. Creating a temporary log dir that will be deleted on exit. To preserve log directory set the `log_dir` cfg option
torchx 2025-06-29 22:29:35 INFO     Log directory is: /var/folders/8b/nbn0wcb93m710myrqx8r_clh0000gq/T/torchx_ue1_qcab
local_cwd://torchx/sh-mjpvn04c5kmnfc
torchx 2025-06-29 22:29:35 INFO     Waiting for the app to finish...
sh/0      June 2025
sh/0 Su Mo Tu We Th Fr Sa
sh/0  1  2  3  4  5  6  7
sh/0  8  9 10 11 12 13 14
sh/0 15 16 17 18 19 20 21
sh/0 22 23 24 25 26 27 28
sh/0 29 30
sh/0
torchx 2025-06-29 22:29:36 INFO     Job finished: SUCCEEDED

@facebook-github-bot facebook-github-bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Jun 30, 2025
@clumsy
Copy link
Contributor Author

clumsy commented Jun 30, 2025

For your consideration @kiukchung , @d4l3k , @tonykao8080 , @andywag
If you're in favor of the feature - please feel free to suggest naming/formatting changes. Thanks!

@clumsy
Copy link
Contributor Author

clumsy commented Jul 10, 2025

Could you please have a look, @kiukchung?

@clumsy
Copy link
Contributor Author

clumsy commented Aug 22, 2025

Could you please have a look, @kiukchung , @d4l3k , @tonykao8080 , @andywag?
I think with this change we will need to support less named_resources in TorchX. E.g. no need to add just about all the new AWS ones.

@clumsy
Copy link
Contributor Author

clumsy commented Oct 3, 2025

Sorry for bugging, but could you please let me know what you think about this proposal, @kiukchung , @d4l3k , @tonykao8080 , @andywag ?

@d4l3k
Copy link
Member

d4l3k commented Oct 3, 2025

@clumsy are you on the PyTorch Slack? Better to ping there, we get a lot of GitHub notifications

@d4l3k d4l3k requested a review from kiukchung October 3, 2025 23:56
Copy link
Member

@d4l3k d4l3k left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, looks like it needs to be rebased

@codecov-commenter
Copy link

codecov-commenter commented Oct 6, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 91.67%. Comparing base (1e3df20) to head (9a04d96).

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1086      +/-   ##
==========================================
+ Coverage   91.60%   91.67%   +0.07%     
==========================================
  Files          83       84       +1     
  Lines        6431     6466      +35     
==========================================
+ Hits         5891     5928      +37     
+ Misses        540      538       -2     
Flag Coverage Δ
unittests 91.67% <100.00%> (+0.07%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@facebook-github-bot
Copy link
Contributor

@d4l3k has imported this pull request. If you are a Meta employee, you can view this in D83964038.

@clumsy
Copy link
Contributor Author

clumsy commented Oct 6, 2025

Seem to be OK still, right @d4l3k ?

@clumsy clumsy force-pushed the feat/dynamic_resource branch from 9a04d96 to 60f2bc2 Compare October 6, 2025 19:33
@clumsy
Copy link
Contributor Author

clumsy commented Oct 6, 2025

Please check the latest changes, @kiukchung
Followed your recommendation and generalized the existing custom resource registration via a new env variable.
Verified it works and added a unit test.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Config-based named resources
6 participants