pramodith

What does this PR do?

Passes num_items_in_batch to the compute_loss function in prediction_step so that the loss is calculated the same way at train and eval time.
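
To illustrate why this matters, here is a minimal pure-Python sketch, not the Trainer code itself. It assumes a hypothetical custom compute_loss whose fallback path (no num_items_in_batch) averages per-sequence means, while the normalized path divides the summed token loss by the item count. The names IGNORE_INDEX, count_items, and the toy data are illustrative assumptions.

```python
IGNORE_INDEX = -100  # label value conventionally excluded from the loss


def count_items(batch_labels):
    """Count the label positions that contribute to the loss."""
    return sum(label != IGNORE_INDEX for seq in batch_labels for label in seq)


def compute_loss(batch_losses, batch_labels, num_items_in_batch=None):
    """Toy stand-in for a custom loss; per-token losses are given directly."""
    per_seq = [
        [loss for loss, label in zip(losses, labels) if label != IGNORE_INDEX]
        for losses, labels in zip(batch_losses, batch_labels)
    ]
    if num_items_in_batch is None:
        # Hypothetical fallback: mean of per-sequence means,
        # which gives short sequences more weight per token.
        means = [sum(seq) / len(seq) for seq in per_seq if seq]
        return sum(means) / len(means)
    # Normalized path: summed token loss divided by the item count.
    return sum(sum(seq) for seq in per_seq) / num_items_in_batch


batch_losses = [[1.0, 1.0, 1.0, 1.0], [4.0, 0.0, 0.0, 0.0]]
batch_labels = [[5, 6, 7, 8], [9, IGNORE_INDEX, IGNORE_INDEX, IGNORE_INDEX]]

n_items = count_items(batch_labels)
train_loss = compute_loss(batch_losses, batch_labels, n_items)
eval_loss_without = compute_loss(batch_losses, batch_labels)
eval_loss_with = compute_loss(batch_losses, batch_labels, n_items)
```

In this toy setup the same batch yields a different number at eval time unless the count is forwarded, which is the mismatch the PR aims to remove.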

Fixes #41108

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you read the contributor guideline,
    Pull Request section?
  • Was this discussed/approved via a GitHub issue or the forum? Please add a link
    to it if that's the case.
  • Did you make sure to update the documentation with your changes? Here are the
    documentation guidelines, and
    here are tips on formatting docstrings.
  • Did you write any new necessary tests?

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.

@SunMarc

@pramodith
Author

I think the test case could be better. I'm open to suggestions for adding tests (for multi-GPU?) or modifying the existing one.

Member

@SunMarc SunMarc left a comment

LGTM ! Thanks for this nice PR !

@SunMarc
Member

SunMarc commented Sep 29, 2025

Hmm, these tests are still failing: tests/trainer/test_trainer.py::TrainerIntegrationTest::test_evaluate_with_jit. Can you quickly check why? I guess the simplest solution would be to check whether self.args.jit_mode_eval is set. Btw, we should probably deprecate this arg as well.
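
A rough sketch of what such a check could look like, under assumptions: the thread does not state why the JIT test fails, so this simply skips the extra argument when jit_mode_eval is set. The Args dataclass and the loss_kwargs_for_eval helper are hypothetical stand-ins, not Trainer APIs.

```python
from dataclasses import dataclass


@dataclass
class Args:
    """Hypothetical stand-in for the trainer arguments object."""
    jit_mode_eval: bool = False


def loss_kwargs_for_eval(args, num_items_in_batch):
    """Build the extra keyword arguments for the eval-time loss call."""
    if args.jit_mode_eval:
        # Assumption: the JIT eval path should not receive the extra argument.
        return {}
    return {"num_items_in_batch": num_items_in_batch}


jit_kwargs = loss_kwargs_for_eval(Args(jit_mode_eval=True), 5)
default_kwargs = loss_kwargs_for_eval(Args(), 5)
```

The resulting dict would then be unpacked into the loss call, so the non-JIT path gets the item count and the JIT path is left unchanged.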

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@pramodith
Author

pramodith commented Sep 29, 2025

> Hmm, these tests are still failing: tests/trainer/test_trainer.py::TrainerIntegrationTest::test_evaluate_with_jit. Can you quickly check why? I guess the simplest solution would be to check whether self.args.jit_mode_eval is set. Btw, we should probably deprecate this arg as well.

Will take a look in a few hours once I'm off work.

Development

Successfully merging this pull request may close these issues.

predict_step in Trainer should pass num_items_in_batch