---
base_model: unsloth/mistral-7b-instruct-v0.3-bnb-4bit
language:
- en
license: apache-2.0
tags:
- text-generation-inference
- transformers
- unsloth
- mistral
- gguf
---

# Uploaded model

- **Developed by:** Deeokay
- **License:** apache-2.0
- **Finetuned from model:** unsloth/mistral-7b-instruct-v0.3-bnb-4bit

This mistral model was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth) and Huggingface's TRL library.

[<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200"/>](https://github.com/unslothai/unsloth)

# README

This is a test model with the following characteristics:
- trained on a private dataset
- tokenizer customized to the llama3 template
- works with `ollama create` using just `FROM path/to/model` as the Modelfile (the llama3 template needs to be added, after which it works with no issues)

# HOW TO USE

The whole point of the conversion for me was to be able to use the model through Ollama (or other local options).
For Ollama, the model needs to be in GGUF format. Once you have that, it is pretty straightforward (as long as the model uses the llama3 template, which this one does).

Quick Start:
- You must already have Ollama installed and running on your machine
- Download the unsloth.Q4_K_M.gguf model from Files
- In the same directory, create a file called "Modelfile"
- Inside the "Modelfile", add the following:

```
FROM ./unsloth.Q4_K_M.gguf

PARAMETER temperature 0.6
PARAMETER repeat_penalty 1.3
PARAMETER top_p 0.6
PARAMETER top_k 30

PARAMETER stop <|start_header_id|>
PARAMETER stop <|end_header_id|>
PARAMETER stop <|eot_id|>

TEMPLATE "{{ if .System }}<|start_header_id|>system<|end_header_id|>

{{ .System }}<|eot_id|>{{ end }}{{ if .Prompt }}<|start_header_id|>user<|end_header_id|>

{{ .Prompt }}<|eot_id|>{{ end }}<|start_header_id|>assistant<|end_header_id|>

{{ .Response }}<|eot_id|>"
```

- Save the file and go back to the folder (the folder where the model and Modelfile exist)
- Now in a terminal, make sure you are in that same folder and type in the following command:

```bash
ollama create mycustomai  # "mycustomai" <- you can name it anything you want
```
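
Once the create step finishes, you can talk to the model directly from the terminal (using whatever name you gave it above):

```bash
ollama run mycustomai "Hello, what can you do?"
```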

This GGUF is based on mistral-7b-instruct-v0.3.


# NOTE: DISCLAIMER

Please note this model is not intended for production use; it is the result of fine-tuning as a self-learning exercise.

The llama3 special tokens were used to convert the tokenizer.

I wanted to test whether the model would understand the additional headers that my dataset contains:
- Analysis, Classification, Sentiment

The model went through multiple passes over my personalized, customized dataset; future updates will be made to this repo.

If you would like to know how I started creating my dataset, you can check this link:
[Crafting GPT2 for Personalized AI-Preparing Data the Long Way (Part1)](https://medium.com/@deeokay/the-soul-in-the-machine-crafting-gpt2-for-personalized-ai-9d38be3f635f)

The training data uses the following template:

```
<|begin_of_text|> <|start_header_id|>user<|end_header_id|>
{{.Prompt}}<|eot_id|><|start_header_id|>analysis<|end_header_id|>
{{.Analysis}}<|eot_id|><|start_header_id|>assistant<|end_header_id|>
{{.Response}}<|eot_id|><|start_header_id|>classification<|end_header_id|>
{{.Classification}}<|eot_id|><|start_header_id|>sentiment<|end_header_id|>
{{.Sentiment}}<|eot_id|><|end_of_text|>
```
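
As a rough illustration, here is a minimal Python sketch of how one row of data could be rendered into that template (the dictionary keys and example values are assumptions for illustration, not the exact training code):

```python
# Minimal sketch: render one dataset row into the llama3-style training template.
# The row keys and example values below are illustrative assumptions,
# not the actual code or data used in training.

TEMPLATE = (
    "<|begin_of_text|> <|start_header_id|>user<|end_header_id|>\n"
    "{prompt}<|eot_id|><|start_header_id|>analysis<|end_header_id|>\n"
    "{analysis}<|eot_id|><|start_header_id|>assistant<|end_header_id|>\n"
    "{response}<|eot_id|><|start_header_id|>classification<|end_header_id|>\n"
    "{classification}<|eot_id|><|start_header_id|>sentiment<|end_header_id|>\n"
    "{sentiment}<|eot_id|><|end_of_text|>"
)

row = {
    "prompt": "What does a GGUF file do?",
    "analysis": "Factual question about local model file formats.",
    "response": "GGUF is a binary format used to store quantized LLM weights for local inference.",
    "classification": "question",
    "sentiment": "neutral",
}

print(TEMPLATE.format(**row))
```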