Description
Based on the masking implementation in the transformers library, special tokens (e.g., [CLS], [SEP]) should be excluded from the masking process. However, upon reviewing the mlm_masking function in sequence_packer.py, it appears that these tokens are currently being treated as valid masking candidates:

ModernBERT/src/sequence_packer.py, line 284 in 8c57a0f
def mlm_masking(
Could you please confirm whether this behavior is intentional? If not, I suggest updating the masking logic to explicitly exclude special tokens. For instance, filtering these tokens out of the candidate positions before the mask is applied would keep the behavior consistent with the transformers library's approach. Additionally, unit tests verifying that special tokens remain unmasked would improve code reliability.
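
To make the suggestion concrete, here is a rough NumPy sketch of what I have in mind. It is not the current code in sequence_packer.py and does not mirror its signature: the function name, its parameters, the token ids in the example, and the 80/10/10 replacement split are all assumptions for illustration.

```python
import numpy as np


def mask_tokens_excluding_special(
    input_ids,
    special_token_ids,
    mask_token_id,
    vocab_size,
    mask_prob=0.3,
    ignore_index=-100,
    rng=None,
):
    """Hypothetical MLM masking that never selects special tokens."""
    rng = rng if rng is not None else np.random.default_rng()
    input_ids = np.asarray(input_ids)

    # Positions holding a special token are removed from the candidate set.
    is_special = np.isin(input_ids, list(special_token_ids))
    selected = (rng.random(input_ids.shape) < mask_prob) & ~is_special

    # Labels keep the original id at selected positions, ignore_index elsewhere.
    labels = np.full_like(input_ids, ignore_index)
    labels[selected] = input_ids[selected]

    # Assumed 80/10/10 split: mask token, random token, unchanged.
    roll = rng.random(input_ids.shape)
    use_mask = selected & (roll < 0.8)
    use_random = selected & (roll >= 0.8) & (roll < 0.9)

    masked = input_ids.copy()
    masked[use_mask] = mask_token_id
    # Note: this draw may itself produce a special-token id; a real
    # implementation may want to resample in that case.
    masked[use_random] = rng.integers(0, vocab_size, size=int(use_random.sum()))
    return masked, labels


# A unit-test style check (ids 101, 102, 0 stand in for [CLS], [SEP], [PAD]).
ids = np.array([[101, 7, 8, 9, 102, 0, 0]])
special = {101, 102, 0}
masked, labels = mask_tokens_excluding_special(
    ids, special, mask_token_id=103, vocab_size=1000, mask_prob=1.0
)
is_special = np.isin(ids, list(special))
assert (masked[is_special] == ids[is_special]).all()
assert (labels[is_special] == -100).all()
```

Setting mask_prob=1.0 in the check forces every candidate position to be selected, so any special token that changed or received a label would point to a bug in the filter.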
Am I correct in my understanding, or is there something I might be missing?
Thank you for looking into this.