Meaning of [unused0], [unused1] tokens etc
#73, opened by henrikho
The input_id 0 corresponds to the [PAD] token, but input_ids 1 through 99 are unused tokens like the ones in the title. Why do they exist? They seem like a waste of vocabulary size and embedding matrix size, and correspondingly of memory.
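For a rough sense of scale, here is a back-of-the-envelope sketch of what those rows cost in the embedding matrix. The hidden size of 768, vocabulary size of 30522, and float32 storage are assumptions (BERT-base-style values that this post does not state), and `unused_embedding_cost` is a hypothetical helper name, so the numbers will differ for other checkpoints.

```python
def unused_embedding_cost(num_unused, hidden_size, vocab_size, bytes_per_param=4):
    """Estimate the embedding-matrix cost of rows reserved for unused tokens."""
    # Each token owns one row of length hidden_size in the embedding matrix.
    unused_params = num_unused * hidden_size
    total_params = vocab_size * hidden_size
    return {
        "unused_params": unused_params,
        "unused_bytes": unused_params * bytes_per_param,
        "fraction_of_embedding": unused_params / total_params,
    }


# input_ids 1 through 99 -> 99 unused tokens (from the post).
# hidden_size, vocab_size and float32 (4 bytes) are assumed, not given in the post.
cost = unused_embedding_cost(num_unused=99, hidden_size=768, vocab_size=30522)

print(f"parameters in unused rows: {cost['unused_params']}")
print(f"memory in unused rows:     {cost['unused_bytes'] / 1024:.1f} KiB")
print(f"share of embedding matrix: {cost['fraction_of_embedding']:.2%}")
```

Under these assumptions the 99 rows account for well under one percent of the embedding matrix, so the memory overhead is real but small.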