Default context setting seems borked
llama.cpp (built from git HEAD) thinks the Q8_0 quant has a context of 1024000, which doesn't match any of the numbers I see in the model card.
[1722691579] llama_model_loader: - kv 16: llama.context_length u32 = 1024000
[1722691579] llm_load_print_meta: n_ctx_train = 1024000
[1722691579] llm_load_print_meta: n_ctx_orig_yarn = 1024000
[1722691579] llama_new_context_with_model: n_ctx = 1024000
Using the command-line default of -c 0 hung it and hard-crashed my MacBook, probably because of the memory allocation:
[1722691589] llama_kv_cache_init: Metal KV buffer size = 160000.00 MiB
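For what it's worth, that 160000 MiB figure is consistent with a full-length f16 KV cache. A quick back-of-envelope check (assuming Nemo's shape of 40 layers, 8 KV heads, and head dim 128, which the log doesn't show) reproduces it exactly:

```python
# Back-of-envelope size of llama.cpp's default f16 KV cache.
# Assumed model shape (Mistral Nemo): 40 layers, 8 KV heads, head dim 128.
n_layer, n_kv_heads, head_dim = 40, 8, 128
n_ctx = 1024000                 # llama.context_length from the metadata
bytes_per_elem = 2              # f16
kv_bytes = 2 * n_layer * n_ctx * n_kv_heads * head_dim * bytes_per_elem  # 2 = K and V
print(kv_bytes / 2**20)         # 160000.0 MiB, matching the log line above
```

At -c 131072 the same math gives 20480 MiB, which would explain why that setting survives.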
-c 131072 seems to work fine. Was 131072 intended instead?
I'm a bit unsure in this case: as far as I know from the model card, 131072 is the max context the model supports.
The 1024000 value was decided automatically by the conversion script, based on the max_position_embeddings defined in the original model's config.json.
I looked at other Nemo quants out there and they all appear to have this quirk. I'm unsure whether I should fake the value in config.json and make a new quant, since that seems like it may break more than it solves.
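For what it's worth, the value can in principle be patched in the GGUF metadata itself rather than re-quantizing. An untested sketch using the gguf package from llama.cpp's gguf-py (field access per my reading of its GGUFReader API; back the file up first, since this writes into the model directly):

```python
# Untested sketch: clamp llama.context_length in an existing GGUF in
# place, instead of re-converting. Requires: pip install gguf
from gguf import GGUFReader

path = 'Mistral-Nemo-Q8_0.gguf'                    # placeholder filename
reader = GGUFReader(path, 'r+')                    # memory-mapped read/write
field = reader.get_field('llama.context_length')
print('before:', field.parts[field.data[0]][0])    # expect 1024000
field.parts[field.data[0]][0] = 131072             # write the new scalar in place
print('after:', field.parts[field.data[0]][0])
```

llama.cpp's gguf-py also ships a gguf_set_metadata.py script that appears to do the same thing, if you'd rather not hand-roll it.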
For now I recommend launching with -c 131072.
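Something like the following (binary and model filename are placeholders, adjust to your build and file):

```
./llama-cli -m Mistral-Nemo-Q8_0.gguf -c 131072
```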