Like the slow tests, other environment variables are available that are not enabled by default during testing:
- `RUN_CUSTOM_TOKENIZERS`: Enables tests for custom tokenizers.
More environment variables and additional information can be found in [testing_utils.py](https://github.com/huggingface/transformers/blob/main/src/transformers/testing_utils.py).
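A minimal sketch of opting in to one of these gated suites from Python (the test path is illustrative, not a real location):

```python
import os

import pytest

os.environ["RUN_CUSTOM_TOKENIZERS"] = "1"  # opt in to custom tokenizer tests
pytest.main(["tests/tokenization", "-q"])  # path is illustrative
```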
`docs/source/en/attention_interface.md`
It mostly works thanks to the `mask_function`, a `Callable` in the form of [torch's `mask_mod` functions](https://pytorch.org/blog/flexattention/) that takes four indices as input and returns a boolean indicating whether that position should take part in the attention computation.
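For illustration, a minimal causal mask in this style might look like the following (the function name is ours; the four-index signature follows the flex attention convention):

```python
# A mask_mod-style callable: given batch, head, query, and key/value indices,
# return True if the query position may attend to the key/value position.
def causal_mask(batch_idx: int, head_idx: int, q_idx: int, kv_idx: int) -> bool:
    return kv_idx <= q_idx  # attend only to current and earlier positions
```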
If you cannot use the `mask_function` to create your mask for some reason, you can try to work around it by doing something similar to our [torch export workaround](https://github.com/huggingface/transformers/blob/main/src/transformers/integrations/executorch.py).
`docs/source/en/auto_docstring.md`

There are some rules for documenting different types of arguments, and they're listed below. For example:

```
argument_name (`type`, *optional*, defaults to X):
    Description of the argument.
    This can span multiple lines.
```
* Include `type` in backticks.
* Add *optional* if the argument is not required or has a default value.
* Add "defaults to X" if it has a default value. You don't need to add "defaults to `None`" if the default value is `None`.
These arguments can also be passed to `@auto_docstring` as a `custom_args` argument, which defines the docstring block for new arguments once when they are repeated in multiple places in the modeling file.
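As a hedged sketch, a repeated custom argument might be documented once with `custom_args` like this (the import path, the bare class, and `custom_parameter` are illustrative assumptions, not taken from the docs above):

```python
from transformers.utils import auto_docstring  # import path assumed


class MyModel:
    @auto_docstring(
        custom_args="""
        custom_parameter (`int`, *optional*, defaults to 10):
            A hypothetical argument, documented once here and reusable
            wherever it appears in the modeling file.
        """
    )
    def forward(self, custom_parameter: int = 10):
        ...
```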
Before the [`Cache`] class, the cache used to be stored as a tuple of tuples of tensors. This format is dynamic because it grows as text is generated, similar to [`DynamicCache`].
The legacy format is essentially the same data structure but organized differently; the sketch after this list shows how to convert between the two.
- It's a tuple of tuples, where each inner tuple contains the key and value tensors for a layer.
- The tensors have the same shape `[batch_size, num_heads, seq_len, head_dim]`.
- The format is less flexible and doesn't support features like quantization or offloading.
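A minimal sketch of converting between the two formats with the helpers on [`DynamicCache`] (tensor sizes are illustrative):

```python
import torch
from transformers import DynamicCache

# Legacy format: one (key, value) pair per layer, each of shape
# [batch_size, num_heads, seq_len, head_dim].
legacy_cache = tuple(
    (torch.zeros(1, 8, 4, 64), torch.zeros(1, 8, 4, 64)) for _ in range(2)
)

cache = DynamicCache.from_legacy_cache(legacy_cache)  # tuple of tuples -> Cache
legacy_again = cache.to_legacy_cache()                # Cache -> tuple of tuples
```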
`docs/source/en/chat_templating.md`

Mistral-7B-Instruct uses `[INST]` and `[/INST]` tokens to indicate the start and end of user messages.
The input to `apply_chat_template` should be structured as a list of dictionaries with `role` and `content` keys. The `role` key specifies the speaker, and the `content` key contains the message. The common roles are:
- `user` for messages from the user
- `assistant` for messages from the model
- `system` for directives on how the model should act (usually placed at the beginning of the chat)
[`apply_chat_template`] takes this list and returns a formatted sequence. Set `tokenize=True` if you want to tokenize the sequence.
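A minimal sketch (the checkpoint is an example; any chat model with a template works):

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("mistralai/Mistral-7B-Instruct-v0.1")
messages = [
    {"role": "user", "content": "Hello, how are you?"},
    {"role": "assistant", "content": "Doing great, thanks!"},
    {"role": "user", "content": "Tell me a joke."},
]
# tokenize=False returns the formatted string; tokenize=True returns token ids.
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
print(prompt)  # shows the [INST] ... [/INST] wrapping
```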
`docs/source/en/cursor.md`
You're now ready to set things up on the app side! In Cursor, while you can't set a new provider, you can change the endpoint for OpenAI requests in the model selection settings. First, navigate to "Settings" > "Cursor Settings" > "Models" and expand the "API Keys" collapsible. To set your `transformers serve` endpoint, follow this order:
1. Unselect ALL models in the list above (e.g. `gpt4`, ...);
2. Add and select the model you want to use (e.g. `Qwen/Qwen3-4B`);
3. Add some random text to the OpenAI API Key field. It won't be used, but it can't be empty.
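Before pointing Cursor at it, you can sanity-check the endpoint with any OpenAI-compatible client. A minimal sketch, assuming the default port `8000` and the `openai` Python package:

```python
from openai import OpenAI

# `transformers serve` speaks the OpenAI API; like in Cursor, the key is
# unused but must not be empty.
client = OpenAI(base_url="http://localhost:8000/v1", api_key="placeholder")
response = client.chat.completions.create(
    model="Qwen/Qwen3-4B",
    messages=[{"role": "user", "content": "Hello!"}],
)
print(response.choices[0].message.content)
```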
Follow the recommended practices below to ensure your custom generation method works as expected; a minimal sketch follows the list.
- Feel free to reuse the logic for validation and input preparation in the original [`~GenerationMixin.generate`].
- Pin the `transformers` version in the requirements if you use any private method/attribute in `model`.
- Consider adding model validation, input validation, or even a separate test file to help users sanity-check your code in their environment.
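To make these concrete, here is a hedged sketch of a custom method's entry point; the `custom_generate/generate.py` layout and the greedy loop are our illustrative assumptions, not a prescribed implementation:

```python
# custom_generate/generate.py -- sketch of a custom decoding loop.
import torch


def generate(model, input_ids, generation_config=None, **kwargs):
    """Greedy decoding with basic input validation."""
    if input_ids.ndim != 2:
        raise ValueError("`input_ids` must have shape [batch_size, seq_len].")
    max_new_tokens = getattr(generation_config, "max_new_tokens", None) or 20
    for _ in range(max_new_tokens):
        logits = model(input_ids=input_ids).logits
        next_token = logits[:, -1, :].argmax(dim=-1, keepdim=True)
        input_ids = torch.cat([input_ids, next_token], dim=-1)
    return input_ids
```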
Recommended practices:
- Document input and output differences in [`~GenerationMixin.generate`].
- Add self-contained examples to enable quick experimentation.
- Describe soft requirements, such as whether the method only works well with a certain family of models.
### Finding custom generation methods
You can find all custom generation methods by [searching for their custom tag](https://huggingface.co/models?other=custom_generate), `custom_generate`. In addition to the tag, we curate two collections of `custom_generate` methods:
- [Custom generation methods - Community](https://huggingface.co/collections/transformers-community/custom-generation-methods-community-6888fb1da0efbc592d3a8ab6) -- a collection of powerful methods contributed by the community;
- [Custom generation methods - Tutorials](https://huggingface.co/collections/transformers-community/custom-generation-methods-tutorials-6823589657a94940ea02cfec) -- a collection of reference implementations for methods that were previously part of `transformers`, as well as tutorials for `custom_generate`.
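A hedged usage sketch (the repo id is hypothetical; `trust_remote_code=True` is needed to run code fetched from the Hub):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen3-4B")
model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen3-4B")

inputs = tokenizer("Hello,", return_tensors="pt")
output = model.generate(
    **inputs,
    custom_generate="transformers-community/some-custom-method",  # hypothetical repo id
    trust_remote_code=True,
)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```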
`docs/source/en/glossary.md`
The model head refers to the last layer of a neural network that accepts the raw hidden states and projects them onto a different dimension. There is a different model head for each task. For example:
* [`GPT2ForSequenceClassification`] is a sequence classification head - a linear layer - on top of the base [`GPT2Model`].
* [`ViTForImageClassification`] is an image classification head - a linear layer on top of the final hidden state of the `CLS` token - on top of the base [`ViTModel`].
* [`Wav2Vec2ForCTC`] is a language modeling head with [CTC](#connectionist-temporal-classification-ctc) on top of the base [`Wav2Vec2Model`].
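As a toy illustration of the idea (the class, names, and sizes are ours, not a `transformers` API):

```python
import torch
import torch.nn as nn


class ToyClassificationHead(nn.Module):
    """A task head: one linear layer over the base model's final hidden states."""

    def __init__(self, hidden_size: int = 768, num_labels: int = 2):
        super().__init__()
        self.score = nn.Linear(hidden_size, num_labels)

    def forward(self, hidden_states: torch.Tensor) -> torch.Tensor:
        # hidden_states: [batch, seq_len, hidden]; classify from the last token.
        return self.score(hidden_states[:, -1, :])
```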