Inquiries regarding implementation details

#1
by LemonNoel - opened

Thanks for your excellent work! I attempted to run the fine-tuning code in GLM, but I noticed a slight difference in the implementation of the GELU function. Specifically, GLM uses the "approximate" (tanh) variant, whereas the HuggingFace implementation uses the default (exact) mode. I'm unsure which one I should use when fine-tuning the model.
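For context, the two variants in question can be sketched in plain Python (the formulas below are the standard exact GELU and its tanh approximation; the function names are illustrative, not taken from either codebase):

```python
import math

def gelu_exact(x: float) -> float:
    # Exact GELU: 0.5 * x * (1 + erf(x / sqrt(2)))
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))

def gelu_tanh(x: float) -> float:
    # Tanh ("approximate") GELU:
    # 0.5 * x * (1 + tanh(sqrt(2/pi) * (x + 0.044715 * x^3)))
    return 0.5 * x * (1.0 + math.tanh(
        math.sqrt(2.0 / math.pi) * (x + 0.044715 * x ** 3)))

# The two curves differ only slightly for typical activation values.
for x in (-2.0, -0.5, 0.0, 0.5, 2.0):
    print(f"x={x:+.1f}  exact={gelu_exact(x):.6f}  tanh={gelu_tanh(x):.6f}")
```

The numerical difference between the two is small (on the order of 1e-3 or less for moderate inputs), but mixing variants between pretraining and fine-tuning could in principle introduce a mismatch with the pretrained weights.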
