---
widget:
- messages:
  - role: system
    content: You are a career counselor. The user will provide you with an individual
      looking for guidance in their professional life, and your task is to assist
      them in determining what careers they are most suited for based on their skills,
      interests, and experience. You should also conduct research into the various
      options available, explain the job market trends in different industries, and
      advise on which qualifications would be beneficial for pursuing particular fields.
  - role: user
    content: Hey friend!
  - role: assistant
    content: Hi! How may I help you?
  - role: user
    content: I am interested in developing a career in software engineering. What
      would you recommend me to do?
- messages:
  - role: system
    content: You are a knowledgeable assistant. Help the user as much as you can.
  - role: user
    content: How to become smarter?
- messages:
  - role: system
    content: You are a helpful assistant who provides concise responses.
  - role: user
    content: Hi!
  - role: assistant
    content: Hello there! How may I help you?
  - role: user
    content: I need to cook a simple dinner. What ingredients should I prepare for?
- messages:
  - role: system
    content: You are a very creative assistant. User will give you a task, which you
      should complete with all your knowledge.
  - role: user
    content: Write the novel story of an RPG game about group of survivor post apocalyptic
      world.
inference:
  parameters:
    max_new_tokens: 256
    temperature: 0.6
    top_p: 0.95
    top_k: 50
    repetition_penalty: 1.2
base_model: frankenmerger/MiniLlama-1.8b-Chat-v0.1
license: apache-2.0
language:
- en
pipeline_tag: text-generation
datasets:
- Locutusque/Hercules-v3.0
- Locutusque/hyperion-v2.0
- argilla/OpenHermes2.5-dpo-binarized-alpha
tags:
- TensorBlock
- GGUF
---

<div style="width: auto; margin-left: auto; margin-right: auto">
<img src="https://i.imgur.com/jC7kdl8.jpeg" alt="TensorBlock" style="width: 100%; min-width: 400px; display: block; margin: auto;">
</div>
<div style="display: flex; justify-content: space-between; width: 100%;">
    <div style="display: flex; flex-direction: column; align-items: flex-start;">
        <p style="margin-top: 0.5em; margin-bottom: 0em;">
            Feedback and support: TensorBlock's  <a href="https://x.com/tensorblock_aoi">Twitter/X</a>, <a href="https://t.me/TensorBlock">Telegram Group</a> and <a href="https://x.com/tensorblock_aoi">Discord server</a>
        </p>
    </div>
</div>

## frankenmerger/MiniLlama-1.8b-Chat-v0.1 - GGUF

This repo contains GGUF format model files for [frankenmerger/MiniLlama-1.8b-Chat-v0.1](https://huggingface.co/frankenmerger/MiniLlama-1.8b-Chat-v0.1).

The files were quantized using machines provided by [TensorBlock](https://tensorblock.co/), and they are compatible with llama.cpp as of [commit b4242](https://github.com/ggerganov/llama.cpp/commit/a6744e43e80f4be6398fc7733a01642c846dce1d).

<div style="text-align: left; margin: 20px 0;">
    <a href="https://tensorblock.co/waitlist/client" style="display: inline-block; padding: 10px 20px; background-color: #007bff; color: white; text-decoration: none; border-radius: 5px; font-weight: bold;">
        Run them on the TensorBlock client using your local machine ↗
    </a>
</div>

## Prompt template

```
<|system|>
{system_prompt}</s>
<|user|>
{prompt}</s>
<|assistant|>
```
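
The template above can be filled in programmatically. The following is a minimal sketch, not part of the original model card: the function name `build_prompt` and the default system prompt are hypothetical, and it assumes a single-turn exchange with a newline after `<|assistant|>`, which the template does not state explicitly.

```python
def build_prompt(user_prompt: str,
                 system_prompt: str = "You are a helpful assistant.") -> str:
    """Fill the single-turn template; the model continues after <|assistant|>."""
    return (
        f"<|system|>\n{system_prompt}</s>\n"
        f"<|user|>\n{user_prompt}</s>\n"
        # Assumption: a trailing newline follows the assistant tag.
        f"<|assistant|>\n"
    )


if __name__ == "__main__":
    print(build_prompt("How to become smarter?"))
```

The resulting string can be passed as the raw prompt to whichever runtime loads the GGUF file, provided that runtime is not already applying a chat template of its own.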

## Model file specification

| Filename | Quant type | File Size | Description |
| -------- | ---------- | --------- | ----------- |
| [MiniLlama-1.8b-Chat-v0.1-Q2_K.gguf](https://huggingface.co/tensorblock/MiniLlama-1.8b-Chat-v0.1-GGUF/blob/main/MiniLlama-1.8b-Chat-v0.1-Q2_K.gguf) | Q2_K | 0.724 GB | smallest, significant quality loss - not recommended for most purposes |
| [MiniLlama-1.8b-Chat-v0.1-Q3_K_S.gguf](https://huggingface.co/tensorblock/MiniLlama-1.8b-Chat-v0.1-GGUF/blob/main/MiniLlama-1.8b-Chat-v0.1-Q3_K_S.gguf) | Q3_K_S | 0.840 GB | very small, high quality loss |
| [MiniLlama-1.8b-Chat-v0.1-Q3_K_M.gguf](https://huggingface.co/tensorblock/MiniLlama-1.8b-Chat-v0.1-GGUF/blob/main/MiniLlama-1.8b-Chat-v0.1-Q3_K_M.gguf) | Q3_K_M | 0.930 GB | very small, high quality loss |
| [MiniLlama-1.8b-Chat-v0.1-Q3_K_L.gguf](https://huggingface.co/tensorblock/MiniLlama-1.8b-Chat-v0.1-GGUF/blob/main/MiniLlama-1.8b-Chat-v0.1-Q3_K_L.gguf) | Q3_K_L | 1.008 GB | small, substantial quality loss |
| [MiniLlama-1.8b-Chat-v0.1-Q4_0.gguf](https://huggingface.co/tensorblock/MiniLlama-1.8b-Chat-v0.1-GGUF/blob/main/MiniLlama-1.8b-Chat-v0.1-Q4_0.gguf) | Q4_0 | 1.083 GB | legacy; small, very high quality loss - prefer using Q3_K_M |
| [MiniLlama-1.8b-Chat-v0.1-Q4_K_S.gguf](https://huggingface.co/tensorblock/MiniLlama-1.8b-Chat-v0.1-GGUF/blob/main/MiniLlama-1.8b-Chat-v0.1-Q4_K_S.gguf) | Q4_K_S | 1.090 GB | small, greater quality loss |
| [MiniLlama-1.8b-Chat-v0.1-Q4_K_M.gguf](https://huggingface.co/tensorblock/MiniLlama-1.8b-Chat-v0.1-GGUF/blob/main/MiniLlama-1.8b-Chat-v0.1-Q4_K_M.gguf) | Q4_K_M | 1.145 GB | medium, balanced quality - recommended |
| [MiniLlama-1.8b-Chat-v0.1-Q5_0.gguf](https://huggingface.co/tensorblock/MiniLlama-1.8b-Chat-v0.1-GGUF/blob/main/MiniLlama-1.8b-Chat-v0.1-Q5_0.gguf) | Q5_0 | 1.311 GB | legacy; medium, balanced quality - prefer using Q4_K_M |
| [MiniLlama-1.8b-Chat-v0.1-Q5_K_S.gguf](https://huggingface.co/tensorblock/MiniLlama-1.8b-Chat-v0.1-GGUF/blob/main/MiniLlama-1.8b-Chat-v0.1-Q5_K_S.gguf) | Q5_K_S | 1.311 GB | large, low quality loss - recommended |
| [MiniLlama-1.8b-Chat-v0.1-Q5_K_M.gguf](https://huggingface.co/tensorblock/MiniLlama-1.8b-Chat-v0.1-GGUF/blob/main/MiniLlama-1.8b-Chat-v0.1-Q5_K_M.gguf) | Q5_K_M | 1.343 GB | large, very low quality loss - recommended |
| [MiniLlama-1.8b-Chat-v0.1-Q6_K.gguf](https://huggingface.co/tensorblock/MiniLlama-1.8b-Chat-v0.1-GGUF/blob/main/MiniLlama-1.8b-Chat-v0.1-Q6_K.gguf) | Q6_K | 1.554 GB | very large, extremely low quality loss |
| [MiniLlama-1.8b-Chat-v0.1-Q8_0.gguf](https://huggingface.co/tensorblock/MiniLlama-1.8b-Chat-v0.1-GGUF/blob/main/MiniLlama-1.8b-Chat-v0.1-Q8_0.gguf) | Q8_0 | 2.012 GB | very large, extremely low quality loss - not recommended |
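
To pick a file from the table by size, something like the sketch below may help. The sizes are copied from the table; the helper name `largest_quant_under` is hypothetical. It compares file size only, so treat the result as a lower bound: the memory needed at run time is higher once the context and runtime overhead are included.

```python
from typing import Optional

# File sizes in GB, copied from the table above.
QUANT_SIZES_GB = {
    "Q2_K": 0.724, "Q3_K_S": 0.840, "Q3_K_M": 0.930, "Q3_K_L": 1.008,
    "Q4_0": 1.083, "Q4_K_S": 1.090, "Q4_K_M": 1.145, "Q5_0": 1.311,
    "Q5_K_S": 1.311, "Q5_K_M": 1.343, "Q6_K": 1.554, "Q8_0": 2.012,
}


def largest_quant_under(budget_gb: float) -> Optional[str]:
    """Return the quant type with the largest file that fits the budget, or None."""
    fitting = {q: size for q, size in QUANT_SIZES_GB.items() if size <= budget_gb}
    if not fitting:
        return None
    # Larger files generally mean less quantization loss.
    return max(fitting, key=fitting.get)


if __name__ == "__main__":
    choice = largest_quant_under(1.2)
    print(choice, f"MiniLlama-1.8b-Chat-v0.1-{choice}.gguf")
```

File size is only a rough proxy for quality; the legacy `Q4_0` and `Q5_0` types, for example, are larger than some K-quants that the table prefers.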


## Download instructions

### Command line

First, install the Hugging Face Hub CLI:

```shell
pip install -U "huggingface_hub[cli]"
```

Then download an individual model file to a local directory:

```shell
huggingface-cli download tensorblock/MiniLlama-1.8b-Chat-v0.1-GGUF --include "MiniLlama-1.8b-Chat-v0.1-Q2_K.gguf" --local-dir MY_LOCAL_DIR
```

To download multiple model files that match a pattern (e.g., `*Q4_K*gguf`), try:

```shell
huggingface-cli download tensorblock/MiniLlama-1.8b-Chat-v0.1-GGUF --local-dir MY_LOCAL_DIR --local-dir-use-symlinks False --include='*Q4_K*gguf'
```
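
Before running a pattern download, it can be useful to preview which files a pattern would select. The sketch below is an illustration under assumptions: it assumes `--include` follows the shell-style wildcard rules of Python's `fnmatch`, the file list is rebuilt from the table above rather than queried from the repo, and the helper names `preview_matches` and `download_matching` are hypothetical. The download helper needs the `huggingface_hub` package and network access.

```python
from fnmatch import fnmatch

QUANTS = ["Q2_K", "Q3_K_S", "Q3_K_M", "Q3_K_L", "Q4_0", "Q4_K_S",
          "Q4_K_M", "Q5_0", "Q5_K_S", "Q5_K_M", "Q6_K", "Q8_0"]
# File names as listed in the model file table.
FILES = [f"MiniLlama-1.8b-Chat-v0.1-{q}.gguf" for q in QUANTS]


def preview_matches(pattern: str, filenames=FILES) -> list:
    """Return the file names a shell-style pattern would select."""
    return [name for name in filenames if fnmatch(name, pattern)]


def download_matching(pattern: str, local_dir: str) -> str:
    """Download matching files from Python instead of the CLI (needs network)."""
    from huggingface_hub import snapshot_download  # imported lazily
    return snapshot_download(
        repo_id="tensorblock/MiniLlama-1.8b-Chat-v0.1-GGUF",
        allow_patterns=[pattern],
        local_dir=local_dir,
    )


if __name__ == "__main__":
    for name in preview_matches("*Q4_K*gguf"):
        print(name)
```

With the pattern from the command above, the preview lists only the `Q4_K_S` and `Q4_K_M` files; `Q4_0` is not matched because its name lacks `Q4_K`.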