Inquiries regarding implementation details
#1 by LemonNoel - opened
Thanks for your excellent work! I attempted to run the fine-tuning code for GLM, but I noticed a slight difference in the implementation of the GELU activation. Specifically, GLM uses the "approximate" strategy, whereas the HuggingFace version uses the default (exact) mode. I'm unsure which one I should use when fine-tuning the model.
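For context, a minimal sketch of the difference being asked about, assuming PyTorch is available: `torch.nn.functional.gelu` computes the exact erf-based GELU by default, while `approximate='tanh'` uses the tanh approximation (the "approximate" strategy). The two agree closely but not exactly, which is why the choice can matter for fine-tuning a checkpoint trained with one variant.

```python
import torch
import torch.nn.functional as F

x = torch.linspace(-3.0, 3.0, steps=101)

# Exact GELU: x * Phi(x), implemented via erf (the PyTorch/HF default).
exact = F.gelu(x)

# Tanh approximation: 0.5 * x * (1 + tanh(sqrt(2/pi) * (x + 0.044715 * x**3))).
approx = F.gelu(x, approximate="tanh")

# The outputs are close but not identical.
max_diff = (exact - approx).abs().max().item()
print(f"max |exact - tanh| over [-3, 3]: {max_diff:.2e}")
```

In practice the safe choice is to match whatever variant the pretrained weights were trained with, since the small numerical gap is a systematic shift rather than noise.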