S1 Base 671B fine tune ik llama GGUF request?
Hi Ubergarm,
Thanks again for your persistence in getting these quantized models out there!
I have been looking into the S1-Base 671B model, a fine-tune of DeepSeek V3/R1, and wanted to know if you have the bandwidth to quantize it. I will try to quantize it myself, but I haven't really gone through the motions of quantizing a model this large before, so I'm not sure what it takes in terms of hardware/blood/sweat/tears. Anyway, apologies to the DeepSeek-R1-only fans here, but S1-Base may in principle be very good.
Oh hey, I think someone requested this model from me on r/LocalLLaMA, but when I went to reply the comment no longer showed up, which was really strange..
I believe you're talking about that model fine-tuned on scientific papers? Specifically this one?
https://huggingface.co/ScienceOne-AI/S1-Base-671B
Not sure I have the bandwidth to do a full set of those, but possibly a single one at whatever size would be most useful for people. If you want it faster, though, I suggest you create the imatrix and quantize it yourself, upload it to HF, and tag it with ik_llama.cpp so folks can find it (a rough sketch of that flow is below). Feel free to re-use my recipes or check out @Thireus' work for recipe combinations as well.
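The imatrix + quantize flow with ik_llama.cpp's tools looks roughly like this; file names, calibration corpus, and quant type here are just placeholders, so adjust for your own setup:

```bash
# Sketch of the imatrix + quantize flow with ik_llama.cpp
# (model/calibration file names and the quant type are placeholders)

# 1. Compute an importance matrix from a calibration text file,
#    running against the full-precision BF16 GGUF
./build/bin/llama-imatrix \
    -m S1-Base-671B-BF16.gguf \
    -f calibration_data.txt \
    -o imatrix-s1-base.dat

# 2. Quantize using that imatrix; pick whatever quant type fits
#    your RAM/VRAM budget (IQ4_KS shown as an example)
./build/bin/llama-quantize \
    --imatrix imatrix-s1-base.dat \
    S1-Base-671B-BF16.gguf \
    S1-Base-671B-IQ4_KS.gguf \
    IQ4_KS
```

Expect the imatrix pass on a 671B model to take a while and a lot of memory, which is why starting from a recipe someone has already validated saves a lot of blood/sweat/tears.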
I have a quant cooker's basic guide here: https://github.com/ikawrakow/ik_llama.cpp/discussions/434 however it doesn't cover the fp8 safetensors to bf16 GGUF conversion step.
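That fp8 to bf16 step usually amounts to dequantizing the safetensors first and then converting to GGUF. Something like the following, treating script names and flags as a sketch rather than gospel (the `fp8_cast_bf16.py` helper ships with the DeepSeek-V3 repo):

```bash
# Sketch of fp8 safetensors -> bf16 GGUF
# (paths are placeholders; fp8 dequant needs a GPU with fp8 support,
#  or a triton-cpu setup to do it on CPU)

# 1. Dequantize the fp8 weights to bf16 safetensors
python fp8_cast_bf16.py \
    --input-fp8-hf-path ./S1-Base-671B \
    --output-bf16-hf-path ./S1-Base-671B-bf16

# 2. Convert the bf16 safetensors to a single big BF16 GGUF
python convert_hf_to_gguf.py ./S1-Base-671B-bf16 \
    --outtype bf16 \
    --outfile S1-Base-671B-BF16.gguf
```

Budget plenty of disk for this: the bf16 intermediate for a 671B model is well over a terabyte on its own before you even start quantizing.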
Keep us posted if you make any progress, or holler at me if you get stuck!