#26 by battleman0526 - opened

Files changed:
- README.md +2 -2
- tokenizer_config.json +1 -1

README.md
CHANGED
@@ -44,10 +44,10 @@ For more information about ModernBERT, we recommend our [release blog post](http
 
 ## Usage
 
-You can use these models directly with the `transformers` library
+You can use these models directly with the `transformers` library. Until the next `transformers` release, doing so requires installing transformers from main:
 
 ```sh
-pip install
+pip install git+https://github.com/huggingface/transformers.git
 ```
 
 Since ModernBERT is a Masked Language Model (MLM), you can use the `fill-mask` pipeline or load it via `AutoModelForMaskedLM`. To use ModernBERT for downstream tasks like classification, retrieval, or QA, fine-tune it following standard BERT fine-tuning recipes.
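The `fill-mask` usage mentioned in the README hunk above can be sketched with the standard `transformers` pipeline API. This is a minimal sketch, not part of the PR: the checkpoint id `answerdotai/ModernBERT-base` is an assumption (this PR does not name a repo id), and it requires a `transformers` build that includes ModernBERT support.

```python
from transformers import pipeline

# Hypothetical checkpoint id; this PR does not name one.
model_id = "answerdotai/ModernBERT-base"

# fill-mask is the natural pipeline for a Masked Language Model:
# it scores candidate tokens for the [MASK] position.
fill_mask = pipeline("fill-mask", model=model_id)
predictions = fill_mask("The capital of France is [MASK].")

# Each prediction is a dict with the filled token and its score.
for p in predictions:
    print(p["token_str"], p["score"])
```

Loading via `AutoModelForMaskedLM.from_pretrained(model_id)` with the matching `AutoTokenizer` gives the same behavior with manual control over the logits.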
tokenizer_config.json
CHANGED
@@ -932,7 +932,7 @@
   "clean_up_tokenization_spaces": true,
   "cls_token": "[CLS]",
   "mask_token": "[MASK]",
-  "model_max_length":
+  "model_max_length": 1000000000000000019884624838656,
   "pad_token": "[PAD]",
   "sep_token": "[SEP]",
   "tokenizer_class": "PreTrainedTokenizerFast",
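The new `model_max_length` value in the hunk above is not arbitrary: it is `transformers`' `VERY_LARGE_INTEGER` sentinel, defined as `int(1e30)`, which the library uses to mean "no fixed maximum length" so that no truncation warning is tied to a small default. A quick check of the arithmetic:

```python
# int(1e30) is the exact integer value of the float 1e30,
# which is the sentinel transformers uses for "no max length".
sentinel = int(1e30)
print(sentinel)  # 1000000000000000019884624838656
```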