S1 Base 671B fine tune ik llama GGUF request?

#18
by facedwithahug - opened

Hi Ubergarm,

Thanks again for your persistence in getting these quantized models out there!

I have been looking into the S1 Base 671B model, a fine-tune of DeepSeek V3/R1, and wanted to know if you would have the bandwidth to quantize it. I will try to quantize it myself, but I haven't really gone through the motions of quantizing a model this large before, and I'm not sure what it might take in terms of hardware/blood/sweat/tears. Anyway, apologies to the DeepSeek-R1-only fans here, but S1 Base could in principle be very good.

Oh hey, I think someone requested this model from me on r/LocalLLaMA, but when I went to reply the comment was no longer showing, which was really strange.

I believe you're talking about that model fine-tuned on scientific papers? Specifically this one?

https://huggingface.co/ScienceOne-AI/S1-Base-671B

Not sure I have the bandwidth to do a full set of those, but possibly a single quant at whatever size would be most useful for people. If you want it sooner, though, I'd suggest you create an imatrix, quantize it yourself, upload it to HF, and tag it with ik_llama.cpp so folks can find it. Feel free to re-use my recipes, or check out @Thireus's work for recipe combinations as well.
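For reference, the imatrix-then-quantize flow mentioned above roughly looks like the following. This is a sketch, not a tested recipe: it assumes the standard llama-imatrix/llama-quantize binaries from an ik_llama.cpp build, and all file names, the calibration file, and the IQ4_XS target are placeholders you'd swap for your own choices.

```shell
# Sketch only: paths, file names, and the quant type are illustrative.

# 1) Compute an importance matrix from a calibration text file
#    against the full-precision (bf16) GGUF.
./build/bin/llama-imatrix \
    -m S1-Base-671B-BF16.gguf \
    -f calibration_data.txt \
    -o imatrix.dat

# 2) Quantize using that imatrix (IQ4_XS shown as one example target).
./build/bin/llama-quantize \
    --imatrix imatrix.dat \
    S1-Base-671B-BF16.gguf \
    S1-Base-671B-IQ4_XS.gguf \
    IQ4_XS
```

For a model this size, expect the imatrix pass alone to need enough RAM (or mmap-backed disk bandwidth) to stream the full bf16 weights, so plan the hardware side before kicking it off.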

I have a quant cooker's basic guide here: https://github.com/ikawrakow/ik_llama.cpp/discussions/434 — however, it doesn't cover the fp8-safetensors-to-bf16-GGUF process, which is described here: https://github.com/ikawrakow/ik_llama.cpp/discussions/434
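For the fp8-to-bf16 step, one commonly used path (an assumption on my part, not something this thread confirms) is to first dequantize the fp8 safetensors to bf16 safetensors with a cast script like DeepSeek's fp8_cast_bf16.py, then run the normal GGUF converter on the result. Script locations, flags, and paths below are illustrative:

```shell
# Illustrative sketch; script names/flags are assumptions based on the
# DeepSeek-V3 repo's inference/fp8_cast_bf16.py and llama.cpp's converter.

# 1) Dequantize the fp8 safetensors checkpoint to bf16 safetensors.
python fp8_cast_bf16.py \
    --input-fp8-hf-path  ./S1-Base-671B \
    --output-bf16-hf-path ./S1-Base-671B-bf16

# 2) Convert the bf16 safetensors to a bf16 GGUF.
python convert_hf_to_gguf.py ./S1-Base-671B-bf16 \
    --outtype bf16 \
    --outfile S1-Base-671B-BF16.gguf
```

Note the intermediate bf16 checkpoint roughly doubles the on-disk footprint versus fp8, so budget well over a terabyte of scratch space for a 671B model.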

Keep us posted if you make any progress, or holler at me if you get stuck!
