Removed 'slices' from EncodedImage #2258
Pull Request Overview
This PR removes the `slices` tensor and its size from the `EncodedImage` struct, replacing them with a lightweight `slices_shape`, and updates related functions to pass `slices` and `target_size` through as parameters rather than as part of the struct.
- Remove `ov::Tensor slices` and `ImageSize slices_size` from `EncodedImage`
- Add `ov::Shape slices_shape` to `EncodedImage`
- Update `llava_image_embed_make_with_bytes_slice`, `encode`, and `resample_encoded_image` to forward `slices` and `target_size`, and populate `slices_shape`
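The data flow summarized above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code from this PR: `Shape` and `Tensor` are stand-ins for `ov::Shape` and `ov::Tensor` so that the sketch compiles without OpenVINO, `make_tensor` and `encode_sketch` are hypothetical helpers, the tensor dimensions are arbitrary, and the other fields of the real `EncodedImage` are omitted.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Stand-ins for ov::Shape and ov::Tensor (assumptions, not the OpenVINO types).
using Shape = std::vector<std::size_t>;

struct Tensor {
    Shape shape;
    std::vector<float> data;
};

struct ImageSize {
    std::size_t height;
    std::size_t width;
};

// After the change: the struct keeps only the shape of the slices.
struct EncodedImage {
    Shape slices_shape;  // replaces `Tensor slices` and `ImageSize slices_size`
};

// Hypothetical helper: allocates a zero-filled tensor of the given shape.
Tensor make_tensor(const Shape& shape) {
    std::size_t count = 1;
    for (std::size_t dim : shape) {
        count *= dim;
    }
    return Tensor{shape, std::vector<float>(count, 0.0f)};
}

// Placeholder for the resampling step: it receives the slice tensor and the
// target size as parameters instead of reading them from EncodedImage.
std::size_t resample_encoded_image(const EncodedImage& image,
                                   const Tensor& slices,
                                   const ImageSize& target_size) {
    assert(image.slices_shape == slices.shape);
    (void)target_size;          // unused in this placeholder
    return slices.data.size();  // number of slice elements it would process
}

// Sketch of encode(): the slices exist only for the duration of this call.
EncodedImage encode_sketch() {
    Tensor slices = make_tensor({2, 3, 4, 4});  // local, not stored in the result
    ImageSize target_size{4, 4};

    EncodedImage result;
    result.slices_shape = slices.shape;         // only the metadata survives
    resample_encoded_image(result, slices, target_size);
    return result;
}
```

In this sketch the caller of `encode_sketch()` can still inspect `slices_shape`, but the slice data itself is released when the function returns.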
Reviewed Changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.
File | Description |
---|---|
src/cpp/src/visual_language/vision_encoder.hpp | Removed `slices`/`slices_size`, added `slices_shape` field |
src/cpp/src/visual_language/minicpm/classes.hpp | Updated `resample_encoded_image` signature to accept `slices` and `target_sizes` |
src/cpp/src/visual_language/minicpm/classes.cpp | Extended `llava_image_embed_make_with_bytes_slice` signature and body; updated `encode` and `resample_encoded_image` to use new parameters and set `slices_shape`; updated downstream slice logic to use `slices_shape` |
Comments suppressed due to low confidence (1)
src/cpp/src/visual_language/minicpm/classes.hpp:41
- [nitpick] Rename parameter `target_sizes` to `target_size` to match its singular usage and improve readability.
ResampledImage resample_encoded_image(const EncodedImage& image, const ov::Tensor slices, const ImageSize& target_sizes);
Pull Request Overview
This PR removes the actual slice tensors from `EncodedImage` and instead carries only their shape metadata, refactoring downstream resampling to receive slice data via a new helper struct.
- Deleted `ov::Tensor slices` and `ImageSize slices_size`, added `ov::Shape slices_shape` in `EncodedImage`.
- Introduced `ImageSliceResult` to bundle slice tensor and target size, and updated `llava_image_embed_make_with_bytes_slice` to return it.
- Changed `resample_encoded_image` signature and all call sites to accept explicit slice tensor and target size.
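One way to picture the helper struct and the paired return value is the sketch below. It is an illustration only: the field names of `ImageSliceResult` are guesses, `Shape` and `Tensor` are stand-ins for the OpenVINO types, and `make_slices_sketch` is a hypothetical, heavily simplified replacement for `llava_image_embed_make_with_bytes_slice` with arbitrary dimensions.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Stand-ins for the OpenVINO types (assumptions for this sketch).
using Shape = std::vector<std::size_t>;

struct Tensor {
    Shape shape;
};

struct ImageSize {
    std::size_t height;
    std::size_t width;
};

struct EncodedImage {
    Shape slices_shape;  // metadata only
};

// Bundles what the resampler needs but EncodedImage no longer stores.
// The member names here are guesses, not taken from the PR.
struct ImageSliceResult {
    Tensor slices;
    ImageSize target_size;
};

// Hypothetical simplified slicing step: returns the encoded image together
// with the slice data, so the caller can forward both to the resampler.
std::pair<EncodedImage, ImageSliceResult> make_slices_sketch(std::size_t slice_count) {
    ImageSliceResult slice_result{Tensor{{slice_count, 3, 14, 14}}, ImageSize{14, 14}};
    EncodedImage encoded{slice_result.slices.shape};
    return {encoded, slice_result};
}
```

A caller would read `result.first` for the `EncodedImage` and pass `result.second.slices` and `result.second.target_size` on to the resampling function.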
Reviewed Changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.
File | Description |
---|---|
src/cpp/src/visual_language/vision_encoder.hpp | Removed `slices`/`slices_size`, added `slices_shape` doc. |
src/cpp/src/visual_language/minicpm/classes.hpp | Updated `resample_encoded_image` declaration to take slices + size. |
src/cpp/src/visual_language/minicpm/classes.cpp | Added `ImageSliceResult`, refactored slicing and resample flows. |
Comments suppressed due to low confidence (2)
src/cpp/src/visual_language/minicpm/classes.hpp:29
- [nitpick] The parameter name `target_sizes` is plural but its type is a single `ImageSize`; consider renaming it to `target_size` for clarity.
ResampledImage resample_encoded_image(const EncodedImage& image, const ov::Tensor& slices, const ImageSize& target_sizes);
src/cpp/src/visual_language/minicpm/classes.cpp:288
- No unit tests cover the new multi-slice code path or verify that `slices_shape` is populated and used correctly. Consider adding tests for both single- and multi-slice scenarios.
std::pair<EncodedImage, ImageSliceResult> llava_image_embed_make_with_bytes_slice(clip_ctx& ctx_clip, const ov::Tensor& img, ov::InferRequest& encoder, int max_slice_nums, int scale_resolution, size_t patch_size, bool never_split) {
`ov::Tensor slices` can be removed from `EncodedImage`: it is used only during resampling, which is currently a part of `encode()`, so there is no need to keep the slices in the `encode()` output.
Ticket: CVS-167405