Getting error for tokenizer add_prefix_space = True

by Hitish9 - opened Jul 9, 2024

Jul 9, 2024

I am getting an error that asks me to change this variable in the tokenizer when I use the 4k model for inference:
AssertionError: You need to instantiate LongformerTokenizerFast with add_prefix_space=True to use it with pretokenized inputs.

SergheiDinu

Jul 19, 2024

same

Hitish9

Jul 22, 2024

Ihor Stepanov helped me with the answer.

You can use:

from gliner import GLiNER

# Load the 4k model, then set the flag on its underlying tokenizer
model = GLiNER.from_pretrained("numind/NuNER_Zero-4k")
model.data_processor.transformer_tokenizer.add_prefix_space = True
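
If you apply this in more than one place, the workaround can be wrapped in a small helper. This is only a sketch: the attribute path model.data_processor.transformer_tokenizer is taken from the snippet above, while the helper name enable_prefix_space and the stand-in objects are hypothetical, used so the example runs without downloading the model. It only sets the flag named in the assertion message; this thread does not confirm whether it changes how the tokenizer actually splits text.

```python
from types import SimpleNamespace


def enable_prefix_space(model):
    """Set add_prefix_space=True on the tokenizer of a GLiNER-style model.

    Returns the previous value so the caller can tell whether anything changed.
    """
    tokenizer = model.data_processor.transformer_tokenizer
    previous = getattr(tokenizer, "add_prefix_space", None)
    tokenizer.add_prefix_space = True
    return previous


# Stand-in for a loaded model (hypothetical). A real one would come from
# GLiNER.from_pretrained("numind/NuNER_Zero-4k").
fake_tokenizer = SimpleNamespace(add_prefix_space=False)
fake_model = SimpleNamespace(
    data_processor=SimpleNamespace(transformer_tokenizer=fake_tokenizer)
)

previous = enable_prefix_space(fake_model)
print(previous, fake_tokenizer.add_prefix_space)
```

With a real model you would pass the object returned by GLiNER.from_pretrained instead of fake_model.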

SergheiDinu

Jul 22, 2024

What about the quality? I tried it and saw that the results go in the wrong direction!

Hitish9

Jul 23, 2024

I have not tried it much. You can also increase the context window size of NuNER Zero by increasing the max_len value in the model config.
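
A rough sketch of what editing max_len could look like, assuming the config is a JSON file with a top-level max_len key. The helper name set_max_len, the file name, and the values 384 and 1024 are placeholders for illustration, not values confirmed in this thread. Whether the model still gives good predictions at a longer length is a separate question that this does not answer.

```python
import json
import tempfile
from pathlib import Path


def set_max_len(config_path, new_max_len):
    """Rewrite the max_len field of a JSON config file and return the old value."""
    path = Path(config_path)
    config = json.loads(path.read_text(encoding="utf-8"))
    old = config.get("max_len")
    config["max_len"] = new_max_len
    path.write_text(json.dumps(config, indent=2), encoding="utf-8")
    return old


# Demo on a throwaway file; the file name and both values are placeholders.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = Path(tmp) / "config.json"
    demo_path.write_text(json.dumps({"max_len": 384}), encoding="utf-8")
    old_value = set_max_len(demo_path, 1024)
    new_value = json.loads(demo_path.read_text(encoding="utf-8"))["max_len"]

print(old_value, new_value)
```

Point config_path at a local copy of the model's config file and reload the model from that directory afterwards.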
