refactor(worker): use revision_exists from huggingface_hub to check branch existence #3203
base: main
Noting that the end-to-end tests are currently failing with a
This happens in the Since the failure is unrelated to the change (which only touches the revision existence check in |
maybe due to the fact that the PR branch is in another repository. cc @AndreaFrancis @lhoestq |
The original link from #2562 was pointing to a frozen commit, which I couldn't push to, and I guess that is where all this went wrong. So I tried to apply changes to |
I meant that possibly e2e tests are not running correctly on external PRs like yours. Nothing to do on your side for this point. I think the two other failing checks (on services/worker) are not affected by this CI bug, and can be fixed so that we can review the PR. |
I’ve fixed the Regarding the unit test failure in
Hi! any update about this issue/PR? |
Formatted util.py, now Also i can see that
It looks like the test expected a ValueError, but the current logic no longer raises it for the (config, split, 0, 100001, False, None) case. |
def create_branch(dataset: str, target_revision: str, hf_api: HfApi, committer_hf_api: HfApi) -> None:
    try:
        refs = retry(on=[requests.exceptions.ConnectionError], sleeps=LIST_REPO_REFS_RETRY_SLEEPS)(
Do we still need LIST_REPO_REFS_RETRY_SLEEPS in the code base? If not, we should remove it.
I guess we do not need it: the list_repo_refs retry logic is no longer used after switching to revision_exists, so I've removed LIST_REPO_REFS_RETRY_SLEEPS.
I'll let @lhoestq review the PR. I think it's good, but I'm wondering whether we should keep the retry loop. I think not.
This PR refactors the create_branch function in services/worker/src/worker/utils.py to replace the manual revision existence check using hf_api.list_repo_refs(...) with the more robust and concise revision_exists(...) utility introduced in huggingface_hub>=0.21.0.
Changes
list_repo_refs call and related retry logic.
revision_exists(dataset, target_revision).
Benefits
huggingface_hub.
refs.converts.
Requirements
huggingface_hub version must be >=0.21.0.
Tests
No behavioral change; existing tests covering create_branch(...) should continue to pass. If needed, additional tests can mock revision_exists() to verify branching behavior.
Closes #2562
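For reviewers, the refactor described above and the mock-based test it suggests could look roughly like the sketch below. It is not the code from the PR: the name create_branch_sketch, the helper fake_api, the boolean return value, and the DATASET_TYPE constant are assumptions made for illustration, and error handling is omitted. The hf_api arguments are duck-typed stand-ins for HfApi so the example runs without network access or huggingface_hub installed.

```python
from types import SimpleNamespace
from unittest.mock import MagicMock

DATASET_TYPE = "dataset"  # assumed constant; the Hub repo type for datasets


def create_branch_sketch(dataset, target_revision, hf_api, committer_hf_api) -> bool:
    """Create `target_revision` on `dataset` if it does not exist yet.

    Returns True when a branch was created (an illustrative addition;
    the function discussed in the PR returns None).
    """
    # A single call replaces listing all refs and scanning them by hand.
    if hf_api.revision_exists(
        repo_id=dataset, revision=target_revision, repo_type=DATASET_TYPE
    ):
        return False
    # Commits are listed newest first, so the last entry is the initial commit.
    commits = hf_api.list_repo_commits(repo_id=dataset, repo_type=DATASET_TYPE)
    initial_commit = commits[-1].commit_id
    committer_hf_api.create_branch(
        repo_id=dataset,
        branch=target_revision,
        repo_type=DATASET_TYPE,
        revision=initial_commit,
        exist_ok=True,
    )
    return True


def fake_api(branch_exists: bool) -> MagicMock:
    """Hypothetical test double standing in for HfApi."""
    api = MagicMock()
    api.revision_exists.return_value = branch_exists
    api.list_repo_commits.return_value = [
        SimpleNamespace(commit_id="newest"),
        SimpleNamespace(commit_id="oldest"),
    ]
    return api


if __name__ == "__main__":
    committer = MagicMock()
    created = create_branch_sketch(
        "user/ds", "refs/convert/parquet", fake_api(False), committer
    )
    print("created:", created)
    print("create_branch calls:", committer.create_branch.call_count)
```

One point that may be worth checking during review: as far as I know, revision_exists returns False when the repository itself is missing rather than raising RepositoryNotFoundError, so any error handling that relied on list_repo_refs raising may need a second look.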