MiniLlama-1.8b-Chat-v0.1-GGUF / README.md

Upload folder using huggingface_hub

25aa8c3 verified 5 days ago

6.83 kB

	---
	widget:
	- messages:
	- role: system
	content: You are a career counselor. The user will provide you with an individual
	looking for guidance in their professional life, and your task is to assist
	them in determining what careers they are most suited for based on their skills,
	interests, and experience. You should also conduct research into the various
	options available, explain the job market trends in different industries, and
	advice on which qualifications would be beneficial for pursuing particular fields.
	- role: user
	content: Hey friend!
	- role: assistant
	content: Hi! How may I help you?
	- role: user
	content: I am interested in developing a career in software engineering. What
	would you recommend me to do?
	- messages:
	- role: system
	content: You are a knowledgeable assistant. Help the user as much as you can.
	- role: user
	content: How to become smarter?
	- messages:
	- role: system
	content: You are a helpful assistant who provides concise responses.
	- role: user
	content: Hi!
	- role: assistant
	content: Hello there! How may I help you?
	- role: user
	content: I need to cook a simple dinner. What ingredients should I prepare for?
	- messages:
	- role: system
	content: You are a very creative assistant. User will give you a task, which you
	should complete with all your knowledge.
	- role: user
	content: Write the novel story of an RPG game about group of survivor post apocalyptic
	world.
	inference:
	parameters:
	max_new_tokens: 256
	temperature: 0.6
	top_p: 0.95
	top_k: 50
	repetition_penalty: 1.2
	base_model: frankenmerger/MiniLlama-1.8b-Chat-v0.1
	license: apache-2.0
	language:
	- en
	pipeline_tag: text-generation
	datasets:
	- Locutusque/Hercules-v3.0
	- Locutusque/hyperion-v2.0
	- argilla/OpenHermes2.5-dpo-binarized-alpha
	tags:
	- TensorBlock
	- GGUF
	---

	<div style="width: auto; margin-left: auto; margin-right: auto">
	<img src="https://i.imgur.com/jC7kdl8.jpeg" alt="TensorBlock" style="width: 100%; min-width: 400px; display: block; margin: auto;">
	</div>
	<div style="display: flex; justify-content: space-between; width: 100%;">
	<div style="display: flex; flex-direction: column; align-items: flex-start;">
	<p style="margin-top: 0.5em; margin-bottom: 0em;">
	Feedback and support: TensorBlock's <a href="https://x.com/tensorblock_aoi">Twitter/X</a>, <a href="https://t.me/TensorBlock">Telegram Group</a> and <a href="https://x.com/tensorblock_aoi">Discord server</a>
	</p>
	</div>
	</div>

	## frankenmerger/MiniLlama-1.8b-Chat-v0.1 - GGUF

	This repo contains GGUF format model files for [frankenmerger/MiniLlama-1.8b-Chat-v0.1](https://huggingface.co/frankenmerger/MiniLlama-1.8b-Chat-v0.1).

	The files were quantized using machines provided by [TensorBlock](https://tensorblock.co/), and they are compatible with llama.cpp as of [commit b4242](https://github.com/ggerganov/llama.cpp/commit/a6744e43e80f4be6398fc7733a01642c846dce1d).

	<div style="text-align: left; margin: 20px 0;">
	<a href="https://tensorblock.co/waitlist/client" style="display: inline-block; padding: 10px 20px; background-color: #007bff; color: white; text-decoration: none; border-radius: 5px; font-weight: bold;">
	Run them on the TensorBlock client using your local machine ↗
	</a>
	</div>

	## Prompt template

	```
	<\|system\|>
	{system_prompt}</s>
	<\|user\|>
	{prompt}</s>
	<\|assistant\|>
	```

	## Model file specification

	\| Filename \| Quant type \| File Size \| Description \|
	\| -------- \| ---------- \| --------- \| ----------- \|
	\| [MiniLlama-1.8b-Chat-v0.1-Q2_K.gguf](https://huggingface.co/tensorblock/MiniLlama-1.8b-Chat-v0.1-GGUF/blob/main/MiniLlama-1.8b-Chat-v0.1-Q2_K.gguf) \| Q2_K \| 0.724 GB \| smallest, significant quality loss - not recommended for most purposes \|
	\| [MiniLlama-1.8b-Chat-v0.1-Q3_K_S.gguf](https://huggingface.co/tensorblock/MiniLlama-1.8b-Chat-v0.1-GGUF/blob/main/MiniLlama-1.8b-Chat-v0.1-Q3_K_S.gguf) \| Q3_K_S \| 0.840 GB \| very small, high quality loss \|
	\| [MiniLlama-1.8b-Chat-v0.1-Q3_K_M.gguf](https://huggingface.co/tensorblock/MiniLlama-1.8b-Chat-v0.1-GGUF/blob/main/MiniLlama-1.8b-Chat-v0.1-Q3_K_M.gguf) \| Q3_K_M \| 0.930 GB \| very small, high quality loss \|
	\| [MiniLlama-1.8b-Chat-v0.1-Q3_K_L.gguf](https://huggingface.co/tensorblock/MiniLlama-1.8b-Chat-v0.1-GGUF/blob/main/MiniLlama-1.8b-Chat-v0.1-Q3_K_L.gguf) \| Q3_K_L \| 1.008 GB \| small, substantial quality loss \|
	\| [MiniLlama-1.8b-Chat-v0.1-Q4_0.gguf](https://huggingface.co/tensorblock/MiniLlama-1.8b-Chat-v0.1-GGUF/blob/main/MiniLlama-1.8b-Chat-v0.1-Q4_0.gguf) \| Q4_0 \| 1.083 GB \| legacy; small, very high quality loss - prefer using Q3_K_M \|
	\| [MiniLlama-1.8b-Chat-v0.1-Q4_K_S.gguf](https://huggingface.co/tensorblock/MiniLlama-1.8b-Chat-v0.1-GGUF/blob/main/MiniLlama-1.8b-Chat-v0.1-Q4_K_S.gguf) \| Q4_K_S \| 1.090 GB \| small, greater quality loss \|
	\| [MiniLlama-1.8b-Chat-v0.1-Q4_K_M.gguf](https://huggingface.co/tensorblock/MiniLlama-1.8b-Chat-v0.1-GGUF/blob/main/MiniLlama-1.8b-Chat-v0.1-Q4_K_M.gguf) \| Q4_K_M \| 1.145 GB \| medium, balanced quality - recommended \|
	\| [MiniLlama-1.8b-Chat-v0.1-Q5_0.gguf](https://huggingface.co/tensorblock/MiniLlama-1.8b-Chat-v0.1-GGUF/blob/main/MiniLlama-1.8b-Chat-v0.1-Q5_0.gguf) \| Q5_0 \| 1.311 GB \| legacy; medium, balanced quality - prefer using Q4_K_M \|
	\| [MiniLlama-1.8b-Chat-v0.1-Q5_K_S.gguf](https://huggingface.co/tensorblock/MiniLlama-1.8b-Chat-v0.1-GGUF/blob/main/MiniLlama-1.8b-Chat-v0.1-Q5_K_S.gguf) \| Q5_K_S \| 1.311 GB \| large, low quality loss - recommended \|
	\| [MiniLlama-1.8b-Chat-v0.1-Q5_K_M.gguf](https://huggingface.co/tensorblock/MiniLlama-1.8b-Chat-v0.1-GGUF/blob/main/MiniLlama-1.8b-Chat-v0.1-Q5_K_M.gguf) \| Q5_K_M \| 1.343 GB \| large, very low quality loss - recommended \|
	\| [MiniLlama-1.8b-Chat-v0.1-Q6_K.gguf](https://huggingface.co/tensorblock/MiniLlama-1.8b-Chat-v0.1-GGUF/blob/main/MiniLlama-1.8b-Chat-v0.1-Q6_K.gguf) \| Q6_K \| 1.554 GB \| very large, extremely low quality loss \|
	\| [MiniLlama-1.8b-Chat-v0.1-Q8_0.gguf](https://huggingface.co/tensorblock/MiniLlama-1.8b-Chat-v0.1-GGUF/blob/main/MiniLlama-1.8b-Chat-v0.1-Q8_0.gguf) \| Q8_0 \| 2.012 GB \| very large, extremely low quality loss - not recommended \|


	## Downloading instruction

	### Command line

	Firstly, install Huggingface Client

	```shell
	pip install -U "huggingface_hub[cli]"
	```

	Then, downoad the individual model file the a local directory

	```shell
	huggingface-cli download tensorblock/MiniLlama-1.8b-Chat-v0.1-GGUF --include "MiniLlama-1.8b-Chat-v0.1-Q2_K.gguf" --local-dir MY_LOCAL_DIR
	```

	If you wanna download multiple model files with a pattern (e.g., `Q4_Kgguf`), you can try:

	```shell
	huggingface-cli download tensorblock/MiniLlama-1.8b-Chat-v0.1-GGUF --local-dir MY_LOCAL_DIR --local-dir-use-symlinks False --include='Q4_Kgguf'
	```