Question: MHA to MQA Conversion

#27
by kirazT - opened

First of all, thanks so much for introducing this project! The findings are really interesting, and I have enjoyed reading the preprints and playing with the code.

A quick question on the MQA (multi-query attention) architecture: I understand that it is a great design choice for inference. However, is there a way to convert existing MHA (multi-head attention) checkpoints into MQA-based ones? Given that retraining a model from scratch is expensive, it would be nice to have a way to convert MHA model weights directly into MQA weights.
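To make the question concrete, here is a minimal sketch of what such a conversion could look like. It assumes one approach described in the grouped-query attention literature: mean-pooling the per-head key and value projection weights into a single shared head. This is not confirmed as a method supported by this project. The function name `mean_pool_kv_heads` and the row layout (all rows of head 0 first, then head 1, and so on) are hypothetical, and real checkpoints may store fused or transposed attention weights. Plain Python lists are used so the sketch runs without dependencies; an actual conversion would operate on the checkpoint's tensors.

```python
def mean_pool_kv_heads(weight, num_heads):
    """Average per-head K (or V) projection rows into one shared head.

    Assumed (hypothetical) layout: `weight` is a list of rows with shape
    (num_heads * head_dim, hidden_dim), where the rows belonging to head h
    are stored at indices [h * head_dim, (h + 1) * head_dim).
    Returns a list of rows with shape (head_dim, hidden_dim).
    """
    if num_heads <= 0 or len(weight) % num_heads != 0:
        raise ValueError("row count must be divisible by num_heads")
    head_dim = len(weight) // num_heads
    hidden_dim = len(weight[0])

    pooled = []
    for r in range(head_dim):
        row = []
        for c in range(hidden_dim):
            # Sum the same (row, column) entry across all heads, then average.
            total = sum(weight[h * head_dim + r][c] for h in range(num_heads))
            row.append(total / num_heads)
        pooled.append(row)
    return pooled


# Toy example: 2 heads, head_dim = 2, hidden_dim = 2.
k_proj = [
    [1.0, 2.0], [3.0, 4.0],  # head 0
    [5.0, 6.0], [7.0, 8.0],  # head 1
]
shared_k = mean_pool_kv_heads(k_proj, num_heads=2)
print(shared_k)  # [[3.0, 4.0], [5.0, 6.0]]
```

The query and output projections would stay unchanged, and the same pooling would be applied to the value projection. A model converted this way would most likely still need some additional training to recover quality, so this would reduce the retraining cost rather than remove it.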

Thanks for your help! I look forward to your thoughts on this.
