This is not TRUE bitnet since it's trained with float AND THEN QUANTIZED TO {-1,0,1}

#3
by qmsoqm - opened

I believe the proper way to get BitNet is to train the model with 1-bit parameters from scratch.

I wonder how well the model performs with this approach.
The model is so small (300 MB) that it would be amazing if it worked well.
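
To make clear what I mean by "trained with float and then quantized", here is a minimal sketch of post-training ternary quantization. It assumes an absmean rounding scheme (scale by the mean absolute weight, round, clip to {-1, 0, 1}); I don't know whether this model used exactly that, and `ternary_quantize` and the sample weights are made up for illustration.

```python
def ternary_quantize(weights, eps=1e-8):
    """Map already-trained float weights to {-1, 0, 1}.

    Assumed scheme (absmean): divide by the mean absolute weight,
    round to the nearest integer, then clip to [-1, 1].
    """
    scale = sum(abs(w) for w in weights) / len(weights) + eps
    quantized = [max(-1, min(1, round(w / scale))) for w in weights]
    return quantized, scale


# Hypothetical float weights from a model trained in full precision.
weights = [0.9, -0.05, 0.4, -1.2, 0.02]
q, scale = ternary_quantize(weights)
print(q)  # [1, 0, 1, -1, 0]
```

In this sketch the rounding happens once, after training, so the model never gets a chance to adapt to the rounding error. As far as I understand, training from scratch would instead apply the quantization in every forward pass so the weights learn to work under it.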
