Question regarding added tokens vs llama base
#7 by vince62s - opened
Hello,
I have some questions regarding the 7 added tokens.
Are the embeddings learned at fine-tuning time, or is this just a "pre/post" processing usage?
Also, can you clarify the meaning of those tokens?
Hey there,
The added tokens are there for flexibility if you want to fine-tune the model for some specific use case (e.g., MASK or CLS tokens). During SFT we only explicitly used the <|im_start|> token and the <|im_end|> token, which is redefined as the EOS token.
Their embeddings are learned at fine-tuning time.
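For readers wondering how added tokens end up with learnable embeddings, below is a minimal sketch using the Hugging Face transformers API (not the authors' actual training code); the base model name is a placeholder. The key steps are registering the extra tokens with the tokenizer, redefining <|im_end|> as the EOS token, and resizing the embedding matrix so the new rows can be updated during fine-tuning.

```python
# Minimal sketch, not the authors' exact setup: giving added special tokens
# such as <|im_start|> / <|im_end|> trainable embeddings before fine-tuning.
from transformers import AutoModelForCausalLM, AutoTokenizer

base = "meta-llama/Llama-2-7b-hf"  # placeholder base model name
tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForCausalLM.from_pretrained(base)

# Register the extra tokens; <|im_end|> is redefined as the EOS token.
tokenizer.add_special_tokens(
    {"eos_token": "<|im_end|>", "additional_special_tokens": ["<|im_start|>"]}
)

# Grow the embedding (and LM head) matrix so the new tokens get their own
# rows; these are randomly initialized and then learned during fine-tuning.
model.resize_token_embeddings(len(tokenizer))
model.config.eos_token_id = tokenizer.eos_token_id

# SFT data would then be wrapped in the ChatML-style format the reply describes:
example = "<|im_start|>user\nHello!<|im_end|>\n<|im_start|>assistant\n"
print(tokenizer(example)["input_ids"])
```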
nunonmg changed discussion status to closed