Script adjustment suggestion: use llama-gguf-split
#301 by patf82 - opened
The current split files like "part1of3" aren't directly loadable by llama.cpp.
If the splits were created with the llama-gguf-split utility, using the "-00001-of-00005.gguf" naming convention and splitting on tensor boundaries, then llama.cpp could load the files directly as they are.
I know adjusting scripts is always super annoying (for the dumbest reasons), but it'd be a nice touch of extra convenience.
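To illustrate the naming convention mentioned above, here is a minimal sketch that generates shard names in the "-00001-of-00005.gguf" pattern. The function name `split_names` and the prefix `model.Q4_K_M` are hypothetical, chosen only for illustration; the sketch covers naming only and does not produce valid shards, which would have to come from llama-gguf-split itself.

```python
def split_names(prefix: str, total: int) -> list[str]:
    """Return shard file names following the -NNNNN-of-NNNNN.gguf pattern.

    `prefix` is a hypothetical base name; indices are 1-based and
    zero-padded to five digits, as in "-00001-of-00005.gguf".
    """
    if total < 1:
        raise ValueError("total must be at least 1")
    return [f"{prefix}-{i:05d}-of-{total:05d}.gguf" for i in range(1, total + 1)]


if __name__ == "__main__":
    for name in split_names("model.Q4_K_M", 5):
        print(name)
```

With this convention, pointing llama.cpp at the first shard should be enough for it to locate the rest.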
Unfortunately that's not possible; see the FAQ on the model card, where this is addressed. Besides, llama.cpp could just as well load the files directly as they are: it was a deliberate choice on their side to introduce a new split format that is incompatible with all existing split quants on HF.
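For anyone landing here with the existing part files, a rough sketch of joining them follows. It assumes the parts are plain byte slices of a single GGUF file that only need to be concatenated in order, and that they are named like `<name>.gguf.part1of3`; both are assumptions based on the "part1of3" naming, so check the model card FAQ for the authoritative instructions. The helper name `join_parts` is hypothetical.

```python
import re
import shutil
from pathlib import Path

# Assumed naming scheme: "<base>.part<index>of<total>", e.g. "model.gguf.part1of3".
PART_RE = re.compile(r"^(?P<base>.+)\.part(?P<idx>\d+)of(?P<total>\d+)$")


def join_parts(any_part: Path) -> Path:
    """Concatenate all parts belonging to `any_part` into one file.

    Returns the path of the joined file, written next to the parts.
    """
    m = PART_RE.match(any_part.name)
    if m is None:
        raise ValueError(f"not a part file: {any_part.name}")
    base, total = m["base"], int(m["total"])
    out = any_part.with_name(base)
    with out.open("wb") as dst:
        # Append parts strictly in index order: part1, part2, ..., partN.
        for i in range(1, total + 1):
            part = any_part.with_name(f"{base}.part{i}of{total}")
            with part.open("rb") as src:
                shutil.copyfileobj(src, dst)
    return out
```

Calling `join_parts(Path("model.gguf.part1of3"))` would write `model.gguf` in the same directory; the part files are left in place.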
mradermacher changed discussion status to closed