I'm now working on finetuning coding models. If you're GPU-hungry like me, you'll find quantized models very helpful. But quantization for finetuning and quantization for inference are different and incompatible, so I made two collections here.
Quantized inference models are far more popular on HF than quantized finetuning models. I use https://huggingface.co/QuantFactory to generate inference models (GGUF), and there are a few other options.
But there's no such service for finetuning models yet. DIY isn't too hard, though: I made a few myself, and you can find the script in the model cards. If the original model is small enough, you can even do it on a free T4 (available via Google Colab).
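If you want to roll your own, here's a minimal sketch of the DIY route, assuming the finetuning-side quantization is 4-bit bitsandbytes (QLoRA-style). The model ID and repo name are placeholders, not my actual script:

```python
# Hypothetical example: quantize a small coding model to 4-bit with
# bitsandbytes and push it to the Hub, ready for QLoRA-style finetuning.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "Qwen/Qwen2.5-Coder-1.5B-Instruct"  # placeholder: any small coding model

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",             # NormalFloat4, the usual QLoRA choice
    bnb_4bit_compute_dtype=torch.float16,  # float16 so it runs on a free T4
    bnb_4bit_use_double_quant=True,        # also quantize the quantization constants
)

model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained(model_id)

# Recent transformers versions can serialize 4-bit bnb checkpoints directly.
model.push_to_hub("your-username/Qwen2.5-Coder-1.5B-Instruct-bnb-4bit")
tokenizer.push_to_hub("your-username/Qwen2.5-Coder-1.5B-Instruct-bnb-4bit")
```

float16 compute is chosen deliberately: it keeps the whole thing runnable on the free T4 route mentioned above, which has poor bfloat16 support.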
If you know a (small) coding model worthy of quantization, please let me know; I'd love to add it to the collections.
Very few people realize that most of the successful AI startups became successful because they focused on open science and open source for at least their first few years. To name but a few: OpenAI (GPT and GPT-2 were open source), Runway & Stability (Stable Diffusion), Cohere, Mistral and of course Hugging Face!
The reasons are not just altruistic: sharing your science and your models pushes you to build AI faster (which is key in a fast-moving domain like AI), attracts the best scientists & engineers, and generates much more visibility, usage and community contributions than being 100% closed-source would. The same applies to big tech companies, as we're seeing with Meta and Google!
More startups and companies should release research & open-source AI: it's not just good for the world, it also increases their probability of success!
🦆 Is your SQL a bit rusty? I just created the Text To SQL Hub dataset explorer: write SQL queries over Hub datasets from natural-language input. It uses DuckDB, Llama 3.1 70B and the Hugging Face dataset-server API.
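If you're wondering how those three pieces fit together, here's a rough sketch of the idea (not the app's actual code; the dataset, prompt and question are illustrative):

```python
# Rough sketch: ask Llama 3.1 70B to turn a question into SQL, then run it
# with DuckDB against the dataset's auto-converted Parquet files.
import duckdb
import requests
from huggingface_hub import InferenceClient

dataset = "imdb"  # placeholder: any Hub dataset

# The dataset-server API lists the Parquet files for a dataset.
resp = requests.get(
    "https://datasets-server.huggingface.co/parquet", params={"dataset": dataset}
)
parquet_url = resp.json()["parquet_files"][0]["url"]

# Expose the Parquet file as a table called "data" and grab its schema.
duckdb.sql(f"CREATE VIEW data AS SELECT * FROM '{parquet_url}'")
schema = duckdb.sql("DESCRIBE data").df().to_string()

client = InferenceClient("meta-llama/Meta-Llama-3.1-70B-Instruct")
completion = client.chat_completion(
    messages=[{
        "role": "user",
        "content": f"Schema of table `data`:\n{schema}\n\n"
                   "Answer with a single DuckDB SQL query, no explanation: "
                   "How many rows are there per label?",
    }],
    max_tokens=200,
)
# A real app would strip markdown fences from the reply more robustly.
sql = completion.choices[0].message.content.strip().strip("`")

print(duckdb.sql(sql))
```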
We've rolled out a major update to ZeroGPU! All ZeroGPU Spaces are now running on it.
Major improvements:
1. GPU cold starts are about twice as fast!
2. RAM usage reduced by two-thirds, allowing more effective resource usage, which means more GPUs for the community!
3. ZeroGPU initializations (cold starts) can now be tracked and displayed (use progress=gr.Progress(track_tqdm=True); see the sketch after this list)
4. Improved compatibility and PyTorch integration, making more Spaces ZeroGPU-compatible without requiring any modifications!
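Here's a minimal sketch of point 3 in a ZeroGPU Space, assuming the usual spaces + Gradio + diffusers setup (the model and app are just an example, not part of this update):

```python
# Example ZeroGPU Space: @spaces.GPU requests a GPU only for the duration of
# the call, and track_tqdm=True lets Gradio display tqdm progress bars,
# including the ZeroGPU initialization (cold start), in the UI.
import gradio as gr
import spaces
import torch
from diffusers import DiffusionPipeline

pipe = DiffusionPipeline.from_pretrained(
    "stabilityai/sdxl-turbo", torch_dtype=torch.float16
)
pipe.to("cuda")  # with ZeroGPU, the CUDA device is attached lazily

@spaces.GPU
def generate(prompt, progress=gr.Progress(track_tqdm=True)):
    return pipe(prompt, num_inference_steps=4, guidance_scale=0.0).images[0]

gr.Interface(generate, gr.Textbox(label="Prompt"), gr.Image()).launch()
```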
Feel free to ask in this post if you have any questions!