Pipeline output skips spaces between words

#4
by chancar - opened

Hi there! I have been using this model, and it turns out that when I run it through a pipeline after fine-tuning, the output ignores blank spaces and returns all the words joined together, as in:

[{'entity_group': 'LABEL_0',
'score': 0.4824247,
'word': 'Thedogandthecatwenttothehouse',
'start': 0,
'end': 325}]

I have tried add_prefix_space=True in the tokenizer, but it does not seem to be working. Could someone give me a little push on this? Many thanks in advance.
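
One possible workaround, sketched here rather than confirmed as a fix: if the `start` and `end` values are character offsets into the original input string (an assumption, though that is how the token-classification pipeline reports them), the span can be sliced straight from the input text instead of relying on `word`. The helper name `restore_spans` and the sample sentence are hypothetical, and `end` is set to the length of the sample sentence rather than the 325 shown above.

```python
def restore_spans(text, entities):
    """Return copies of the entities with 'word' rebuilt from text[start:end].

    Assumes 'start' and 'end' are character offsets into `text`.
    """
    restored = []
    for ent in entities:
        fixed = dict(ent)  # copy so the pipeline output is left untouched
        fixed["word"] = text[ent["start"]:ent["end"]]
        restored.append(fixed)
    return restored


# Hypothetical input sentence; the entity mimics the output shown above.
text = "The dog and the cat went to the house"
entities = [
    {
        "entity_group": "LABEL_0",
        "score": 0.4824247,
        "word": "Thedogandthecatwenttothehouse",
        "start": 0,
        "end": len(text),  # assumed offset, not the 325 from the real run
    }
]

fixed = restore_spans(text, entities)
print(fixed[0]["word"])
```

This sidesteps the joined `word` field without explaining why the tokenizer drops the spaces in the first place, so it is a stopgap at best.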

chancar changed discussion status to closed
chancar changed discussion status to open
chancar changed discussion status to closed
chancar changed discussion status to open
chancar changed discussion status to closed
