Re-Quantize Model

#1
by igoforth - opened

Hi, thank you for your work.

Would you be willing to update the model to support the latest QuIP# changes? I know you opened an issue here: https://github.com/Cornell-RelaxML/quip-sharp/issues/31. You and Minami-su are the only ones I've found who have made QuIP# quantizations so far.

It's funny, the thought occurred to me this morning too. I had intuitively assumed I'd have to redo the Hessians, which takes ages. Perhaps I only have to redo the latter two steps, which takes less than a day. I'll try that once my GPU is free again.

Thanks! Yeah, I checked that issue a few days ago and the dude mentioned not having to redo the Hessians, so that's great news.

@igoforth I'm currently quantizing a 70b model, and it's taking longer than I thought. It may take 2 more days, and I'll be busy with New Year, too.

The 80 layers have been running for a day already.

I finished Llama 70b and am uploading it now with the newest library version. I'm now doing the same with this 34b model. Give it a day or so, @igoforth.

KnutJaegersberg changed discussion status to closed