I believe the proper way to get a BitNet model is to train it with 1-bit parameters from scratch.
I wonder how the model performed with this approach. The model is so small (300 MB) that it would be amazing if it worked well.