---
base_model: aixonlab/Valkyyrie-14b-v1
tags:
- text-generation-inference
- transformers
- unsloth
- llama
- trl
- llama-cpp
- gguf-my-repo
license: apache-2.0
language:
- en
---

# Triangle104/Valkyyrie-14b-v1-Q4_K_M-GGUF
This model was converted to GGUF format from [`aixonlab/Valkyyrie-14b-v1`](https://huggingface.co/aixonlab/Valkyyrie-14b-v1) using llama.cpp via ggml.ai's [GGUF-my-repo](https://huggingface.co/spaces/ggml-org/gguf-my-repo) space.
Refer to the [original model card](https://huggingface.co/aixonlab/Valkyyrie-14b-v1) for more details on the model.

---
## Model details

Valkyyrie 14b v1 is a fine-tuned large language model based on Microsoft's Phi-4, further trained to improve its conversational capabilities.

### Details 📊
- Developed by: AIXON Lab
- Model type: Causal Language Model
- Language(s): English (primarily); may support other languages
- License: apache-2.0
- Repository: https://huggingface.co/aixonlab/Valkyyrie-14b-v1

### Model Architecture 🏗️
- Base model: phi-4
- Parameter count: ~14 billion
- Architecture specifics: Transformer-based language model

### Training & Fine-tuning 🔄
Valkyyrie-14b-v1 was fine-tuned to achieve:
- Better conversational skills
- Better creativity for writing and conversations
- Broader knowledge across various topics
- Improved performance on specific tasks such as writing, analysis, and problem-solving
- Better contextual understanding and response generation

### Intended Use 🎯
As an assistant or a specific role bot.

### Ethical Considerations 🤔
As a fine-tuned model based on phi-4, this model may inherit biases and limitations from its parent model and the fine-tuning dataset. Users should be aware of potential biases in generated content and use the model responsibly.

---
## Use with llama.cpp
Install llama.cpp through brew (works on Mac and Linux).

```bash
brew install llama.cpp
```

Invoke the llama.cpp server or the CLI.

### CLI:
```bash
llama-cli --hf-repo Triangle104/Valkyyrie-14b-v1-Q4_K_M-GGUF --hf-file valkyyrie-14b-v1-q4_k_m.gguf -p "The meaning to life and the universe is"
```

### Server:
```bash
llama-server --hf-repo Triangle104/Valkyyrie-14b-v1-Q4_K_M-GGUF --hf-file valkyyrie-14b-v1-q4_k_m.gguf -c 2048
```

Note: You can also use this checkpoint directly through the [usage steps](https://github.com/ggerganov/llama.cpp?tab=readme-ov-file#usage) listed in the llama.cpp repo.

Step 1: Clone llama.cpp from GitHub.
```
git clone https://github.com/ggerganov/llama.cpp
```

Step 2: Move into the llama.cpp folder and build it with the `LLAMA_CURL=1` flag along with any hardware-specific flags (e.g., LLAMA_CUDA=1 for Nvidia GPUs on Linux).
```
cd llama.cpp && LLAMA_CURL=1 make
```

Step 3: Run inference through the main binary.
```
./llama-cli --hf-repo Triangle104/Valkyyrie-14b-v1-Q4_K_M-GGUF --hf-file valkyyrie-14b-v1-q4_k_m.gguf -p "The meaning to life and the universe is"
```
or
```
./llama-server --hf-repo Triangle104/Valkyyrie-14b-v1-Q4_K_M-GGUF --hf-file valkyyrie-14b-v1-q4_k_m.gguf -c 2048
```
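
Once `llama-server` is running with either invocation above, you can send requests to its OpenAI-compatible chat endpoint. Below is a minimal sketch using `curl`, assuming the server is listening on its default host and port (`127.0.0.1:8080`); adjust the URL if you passed `--host` or `--port`.

```bash
# Query the running llama-server via its OpenAI-compatible chat completions endpoint.
# Assumes the default --host 127.0.0.1 and --port 8080; the prompt below is just an example.
curl http://127.0.0.1:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [
      {"role": "system", "content": "You are a helpful assistant."},
      {"role": "user", "content": "Write a short greeting."}
    ],
    "max_tokens": 128
  }'
```

The response follows the OpenAI chat-completions JSON schema, so most OpenAI-compatible client libraries can talk to this server by overriding their base URL to point at it.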