
Commit 65f20f8

Bump timm from 1.0.15 to 1.0.16 (#2390)
Bumps [timm](https://github.com/huggingface/pytorch-image-models) from 1.0.15 to 1.0.16.

**Release notes** (sourced from [timm's releases](https://github.com/huggingface/pytorch-image-models/releases)):

**Release v1.0.16**

**June 26, 2025**

- MobileNetV5 backbone (w/ encoder-only variant) for the [Gemma 3n](https://ai.google.dev/gemma/docs/gemma-3n#parameters) image encoder
- Version 1.0.16 released

**June 23, 2025**

- Add `F.grid_sample`-based 2D and factorized pos embed resize to NaFlexViT. Faster when many different sizes are used (based on an example by https://github.com/stas-sl).
- Further speed up patch embed resampling by replacing vmap with matmul (based on a snippet by https://github.com/stas-sl).
- Add 3 initial native-aspect NaFlexViT checkpoints created while testing: ImageNet-1k, 3 different pos embed configs with the same hparams.

| Model | Top-1 Acc | Top-5 Acc | Params (M) | Eval Seq Len |
|-------|-----------|-----------|------------|--------------|
| [naflexvit_base_patch16_par_gap.e300_s576_in1k](https://hf.co/timm/naflexvit_base_patch16_par_gap.e300_s576_in1k) | 83.67 | 96.45 | 86.63 | 576 |
| [naflexvit_base_patch16_parfac_gap.e300_s576_in1k](https://hf.co/timm/naflexvit_base_patch16_parfac_gap.e300_s576_in1k) | 83.63 | 96.41 | 86.46 | 576 |
| [naflexvit_base_patch16_gap.e300_s576_in1k](https://hf.co/timm/naflexvit_base_patch16_gap.e300_s576_in1k) | 83.50 | 96.46 | 86.63 | 576 |

- Support gradient checkpointing for `forward_intermediates` and fix some checkpointing bugs. Thanks https://github.com/brianhou0208
- Add 'corrected weight decay' (https://arxiv.org/abs/2506.02285) as an option to the AdamW (legacy), Adopt, Kron, Adafactor (BV), Lamb, LaProp, Lion, NadamW, RmsPropTF, and SGDW optimizers
- Switch PE (Perception Encoder) ViT models to use native timm weights instead of remapping on the fly
- Fix CUDA stream bug in prefetch loader
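The new NaFlexViT checkpoints listed above are published on the Hugging Face Hub and should load through the usual `timm` factory. A minimal sketch (the model/tag name comes from the table above; the 384x384 input size is an assumption derived from the 576-token eval sequence length with 16x16 patches, and standard fixed-size tensor input is assumed to work):

```python
import timm
import torch

# Load one of the new native-aspect NaFlexViT checkpoints (usage sketch, not from the release notes).
model = timm.create_model(
    "naflexvit_base_patch16_par_gap.e300_s576_in1k",
    pretrained=True,
)
model.eval()

# 384x384 is an assumed input size here (576 tokens * 16x16 patches).
with torch.no_grad():
    logits = model(torch.randn(1, 3, 384, 384))
print(logits.shape)  # expected: torch.Size([1, 1000]) for ImageNet-1k
```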
**June 5, 2025**

- Initial NaFlexVit model code. NaFlexVit is a Vision Transformer with:
  1. Encapsulated embedding and position encoding in a single module
  2. Support for `nn.Linear` patch embedding on pre-patchified (dictionary) inputs
  3. Support for NaFlex variable aspect, variable resolution (SigLip-2: https://arxiv.org/abs/2502.14786)
  4. Support for FlexiViT variable patch size (https://arxiv.org/abs/2212.08013)
  5. Support for NaViT fractional/factorized position embedding (https://arxiv.org/abs/2307.06304)
- Existing ViT models in `vision_transformer.py` can be loaded into the NaFlexVit model by adding the `use_naflex=True` flag to `create_model` (see the sketch after this list)
  - Some native weights coming soon
- A full NaFlex data pipeline is available that allows training / fine-tuning / evaluating with variable aspect / size images
  - To enable it in `train.py` and `validate.py`, add the `--naflex-loader` arg; it must be used with a NaFlexVit
- To evaluate an existing (classic) ViT loaded into the NaFlexVit model with the NaFlex data pipeline:
  - `python validate.py /imagenet --amp -j 8 --model vit_base_patch16_224 --model-kwargs use_naflex=True --naflex-loader --naflex-max-seq-len 256`
- The training script has some extra args worth noting:
  - The `--naflex-train-seq-lens` argument specifies which sequence lengths to randomly pick from per batch during training
  - The `--naflex-max-seq-len` argument sets the target sequence length for validation
  - Adding `--model-kwargs enable_patch_interpolator=True --naflex-patch-sizes 12 16 24` enables random patch size selection per batch with interpolation
  - The `--naflex-loss-scale` arg changes the loss scaling mode per batch relative to the batch size; `timm` NaFlex loading changes the batch size for each seq len
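As noted in the list above, a classic ViT checkpoint can be routed through the NaFlexVit implementation by passing `use_naflex=True` to `create_model`. A minimal sketch (the flag itself is taken from the release notes; the rest is ordinary `timm` usage and the fixed-size forward pass is an assumption):

```python
import timm
import torch

# Load a standard ViT checkpoint, but through the NaFlexVit implementation
# via the use_naflex flag mentioned in the release notes.
model = timm.create_model(
    "vit_base_patch16_224",
    pretrained=True,
    use_naflex=True,
)
model.eval()

with torch.no_grad():
    out = model(torch.randn(1, 3, 224, 224))
print(out.shape)  # expected: torch.Size([1, 1000])
```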
href="https://github.com/brianhou0208">https://github.com/brianhou0208</a> <ul> <li>DaViT, EdgeNeXt, EfficientFormerV2, EfficientViT(MIT), EfficientViT(MSRA), FocalNet, GCViT, HGNet /V2, InceptionNeXt, Inception-V4, MambaOut, MetaFormer, NesT, Next-ViT, PiT, PVT V2, RepGhostNet, RepViT, ResNetV2, ReXNet, TinyViT, TResNet, VoV</li> </ul> </li> </ul> <!-- raw HTML omitted --> </blockquote> <p>... (truncated)</p> </details> <details> <summary>Commits</summary> <ul> <li><a href="https://github.com/huggingface/pytorch-image-models/commit/7101adb7ef562c9c2d63074a4e2098c15f5249e9"><code>7101adb</code></a> Update README.md</li> <li><a href="https://github.com/huggingface/pytorch-image-models/commit/85b65f01299cb84561e8b7b038890186aa4556ce"><code>85b65f0</code></a> Update version for 1.0.16 release</li> <li><a href="https://github.com/huggingface/pytorch-image-models/commit/1f69a52b7152f0585f8ea84f8b988ed35a682b74"><code>1f69a52</code></a> Merge pull request <a href="https://redirect.github.com/huggingface/pytorch-image-models/issues/2527">#2527</a> from huggingface/mobilenetv5</li> <li><a href="https://github.com/huggingface/pytorch-image-models/commit/38286760112fa8ee6b6074c3c41e1e43ac033089"><code>3828676</code></a> Make RmsNormAct sync with RmsNorm re default eps of 1e-6</li> <li><a href="https://github.com/huggingface/pytorch-image-models/commit/136440d9d410c1088978fb89b1fd04262f561980"><code>136440d</code></a> Switch to 'same' padding emulation for the enc model as it should be closer f...</li> <li><a href="https://github.com/huggingface/pytorch-image-models/commit/ddd3f99a7855f3c583d3a303c6c46c5a6d34f8b2"><code>ddd3f99</code></a> Update test, encoder_only mode for backward test</li> <li><a href="https://github.com/huggingface/pytorch-image-models/commit/4cc7fdbd8843c7aac1b07f0b2ea264906ab8451d"><code>4cc7fdb</code></a> Cleanup imports, mark MSFA as notrace</li> <li><a href="https://github.com/huggingface/pytorch-image-models/commit/857727ded8ff27d8b27738a4d56431229c979316"><code>857727d</code></a> Simplify resolution check for improved script/trace compat</li> <li><a href="https://github.com/huggingface/pytorch-image-models/commit/e0cb66913666f6c8d06fa61748a169fbd0a7a6fd"><code>e0cb669</code></a> Make features_only=True work with mnv5 &amp; enc, uses forward_intermediates()</li> <li><a href="https://github.com/huggingface/pytorch-image-models/commit/739b46cc65cd77568d78e19d836c577291f81e35"><code>739b46c</code></a> Fixed pool size (16,16) because of of MSFA.</li> <li>Additional commits viewable in <a href="https://github.com/huggingface/pytorch-image-models/compare/v1.0.15...v1.0.16">compare view</a></li> </ul> </details> <br /> [![Dependabot compatibility score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=timm&package-manager=pip&previous-version=1.0.15&new-version=1.0.16)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores) Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting `@dependabot rebase`. 
**Commits**

- [`7101adb`](https://github.com/huggingface/pytorch-image-models/commit/7101adb7ef562c9c2d63074a4e2098c15f5249e9) Update README.md
- [`85b65f0`](https://github.com/huggingface/pytorch-image-models/commit/85b65f01299cb84561e8b7b038890186aa4556ce) Update version for 1.0.16 release
- [`1f69a52`](https://github.com/huggingface/pytorch-image-models/commit/1f69a52b7152f0585f8ea84f8b988ed35a682b74) Merge pull request [#2527](https://redirect.github.com/huggingface/pytorch-image-models/issues/2527) from huggingface/mobilenetv5
- [`3828676`](https://github.com/huggingface/pytorch-image-models/commit/38286760112fa8ee6b6074c3c41e1e43ac033089) Make RmsNormAct sync with RmsNorm re default eps of 1e-6
- [`136440d`](https://github.com/huggingface/pytorch-image-models/commit/136440d9d410c1088978fb89b1fd04262f561980) Switch to 'same' padding emulation for the enc model as it should be closer f...
- [`ddd3f99`](https://github.com/huggingface/pytorch-image-models/commit/ddd3f99a7855f3c583d3a303c6c46c5a6d34f8b2) Update test, encoder_only mode for backward test
- [`4cc7fdb`](https://github.com/huggingface/pytorch-image-models/commit/4cc7fdbd8843c7aac1b07f0b2ea264906ab8451d) Cleanup imports, mark MSFA as notrace
- [`857727d`](https://github.com/huggingface/pytorch-image-models/commit/857727ded8ff27d8b27738a4d56431229c979316) Simplify resolution check for improved script/trace compat
- [`e0cb669`](https://github.com/huggingface/pytorch-image-models/commit/e0cb66913666f6c8d06fa61748a169fbd0a7a6fd) Make features_only=True work with mnv5 & enc, uses forward_intermediates()
- [`739b46c`](https://github.com/huggingface/pytorch-image-models/commit/739b46cc65cd77568d78e19d836c577291f81e35) Fixed pool size (16,16) because of MSFA.
- Additional commits viewable in the [compare view](https://github.com/huggingface/pytorch-image-models/compare/v1.0.15...v1.0.16)

[![Dependabot compatibility score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=timm&package-manager=pip&previous-version=1.0.15&new-version=1.0.16)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)

Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting `@dependabot rebase`.

---

**Dependabot commands and options**

You can trigger Dependabot actions by commenting on this PR:

- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
- `@dependabot show <dependency name> ignore conditions` will show all of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
1 parent fa4711f · commit 65f20f8

File tree

1 file changed: +1 -1 lines changed


samples/export-requirements.txt

Lines changed: 1 addition & 1 deletion
```diff
@@ -7,7 +7,7 @@ numpy<2.0.0; sys_platform == 'darwin'
 einops==0.8.1 # For Qwen
 transformers_stream_generator==0.0.5 # For Qwen
 diffusers==0.34.0 # For image generation pipelines
-timm==1.0.15 # For exporting InternVL2
+timm==1.0.16 # For exporting InternVL2
 torchvision # For visual language models
 transformers>=4.43 # For Whisper
 hf_transfer # for faster models download, should used with env var HF_HUB_ENABLE_HF_TRANSFER=1
```
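Since the only change in this PR is the `timm` pin in `samples/export-requirements.txt` moving from 1.0.15 to 1.0.16, a quick local sanity check after reinstalling the requirements is to confirm the resolved version (a minimal sketch; it assumes the package exposes `__version__`, which recent timm releases do):

```python
# Verify the environment picked up the bumped pin from samples/export-requirements.txt.
import timm

assert timm.__version__ == "1.0.16", f"unexpected timm version: {timm.__version__}"
print("timm", timm.__version__)
```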

0 commit comments