---
license: apache-2.0
datasets:
- Epiculous/SynthRP-Gens-v1-Filtered-n-Cleaned
- Epiculous/Synthstruct-Gens-v1-Filtered-n-Cleaned
language:
- en
- fr
- de
- es
- it
- pt
- ru
- zh
- ja
pipeline_tag: text-generation
---
### exl2 quant (measurement.json in main branch)
---
### check revisions for quants (3bpw,4bpw,5bpw,6bpw,8bpw)
---


![image/png](https://cdn-uploads.huggingface.co/production/uploads/64adfd277b5ff762771e4571/ijVNJF9HePkQCjejXZLcI.png)

Back from the dead! Hoping to make something cool to share with everyone! Introducing Crimson Dawn! Built atop the impressive [Mistral-Nemo-Base-2407](https://huggingface.co/mistralai/Mistral-Nemo-Base-2407), Crimson Dawn was made with the idea that AI should not be a boring, bland, generic assistant, but something that you can connect with on a more personal level: something that can be interesting in a roleplay, but useful as an assistant too.

## Prompting
Crimson Dawn was trained with the Mistral Instruct template, so it should be prompted the same way you would prompt any other Mistral model.

```
"[INST] Prompt goes here [/INST]"
```
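
For multi-turn chats, the same template can be assembled programmatically. The helper below is only a sketch: `build_prompt` is a hypothetical name, and closing each assistant reply with `</s>` is an assumption based on the usual Mistral Instruct convention, not something confirmed for this model. The BOS token is left out on the assumption that the tokenizer or frontend adds it.

```python
def build_prompt(turns, eos="</s>"):
    """Format (user, assistant) pairs in the Mistral Instruct style.

    `turns` is a list of (user_text, assistant_text) tuples; pass None as
    the assistant text of the final turn to leave room for generation.
    The `eos` terminator after each reply is an assumption.
    """
    parts = []
    for user, assistant in turns:
        parts.append(f"[INST] {user} [/INST]")
        if assistant is not None:
            parts.append(f" {assistant}{eos}")
    return "".join(parts)


single = build_prompt([("Prompt goes here", None)])
multi = build_prompt([("Hi", "Hello!"), ("Tell me a story", None)])
print(single)
print(multi)
```

The single-turn output matches the template shown above; the multi-turn layout should be checked against whatever chat template your backend ships for this model.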

### Current Top Sampler Settings
```json
{
    "temp": 1.25,
    "temperature_last": true,
    "top_p": 1,
    "top_k": -1,
    "top_a": 0,
    "tfs": 1,
    "epsilon_cutoff": 0,
    "eta_cutoff": 0,
    "typical_p": 1,
    "min_p": 0.3,
    "rep_pen": 1,
    "rep_pen_range": 0,
    "rep_pen_decay": 0,
    "rep_pen_slope": 1,
    "no_repeat_ngram_size": 0,
    "penalty_alpha": 0,
    "num_beams": 1,
    "length_penalty": 1,
    "min_length": 0,
    "encoder_rep_pen": 1,
    "freq_pen": 0,
    "presence_pen": 0,
    "skew": 0,
    "do_sample": true,
    "early_stopping": false,
    "dynatemp": false,
    "min_temp": 0,
    "max_temp": 2,
    "dynatemp_exponent": 1,
    "smoothing_factor": 0,
    "smoothing_curve": 1,
    "dry_allowed_length": 2,
    "dry_multiplier": 0,
    "dry_base": 1.75,
    "dry_sequence_breakers": "[\"\\n\", \":\", \"\\\"\", \"*\"]",
    "dry_penalty_last_n": 0,
    "add_bos_token": true,
    "ban_eos_token": false,
    "skip_special_tokens": true,
    "mirostat_mode": 0,
    "mirostat_tau": 5,
    "mirostat_eta": 0.1,
    "guidance_scale": 1,
    "negative_prompt": "",
    "grammar_string": "",
    "json_schema": {},
    "banned_tokens": "",
    "sampler_priority": [
        "temperature",
        "dynamic_temperature",
        "quadratic_sampling",
        "top_k",
        "top_p",
        "typical_p",
        "epsilon_cutoff",
        "eta_cutoff",
        "tfs",
        "top_a",
        "min_p",
        "mirostat"
    ],
    "samplers": [
        "top_k",
        "tfs_z",
        "typical_p",
        "top_p",
        "min_p",
        "temperature"
    ],
    "ignore_eos_token": false,
    "spaces_between_special_tokens": true,
    "speculative_ngram": false,
    "sampler_order": [
        5,
        6,
        0,
        1,
        2,
        3,
        4
    ],
    "logit_bias": [],
    "ignore_eos_token_aphrodite": false,
    "spaces_between_special_tokens_aphrodite": true,
    "rep_pen_size": 0,
    "genamt": 1024,
    "max_length": 16384
}
```
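
The settings that do most of the work in this preset are `min_p` at 0.3 and `temp` at 1.25 with `temperature_last` enabled; most of the other samplers are left at neutral values. As a rough illustration of how those two interact, here is a minimal sketch, assuming the common definition of min-p (keep tokens whose probability is at least `min_p` times the top token's probability) and assuming temperature is applied only to the surviving tokens. The function name and the example logits are made up, and real backends may differ in detail.

```python
import math


def min_p_then_temperature(logits, min_p=0.3, temperature=1.25):
    """Return {token_index: probability} after min-p filtering,
    with temperature applied last (illustrative sketch only)."""
    top = max(logits)
    # Softmax at temperature 1 to decide which tokens survive min-p.
    exps = [math.exp(x - top) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    threshold = min_p * max(probs)
    kept = [i for i, p in enumerate(probs) if p >= threshold]
    # Temperature is applied only to the survivors, then renormalized.
    scaled = [math.exp((logits[i] - top) / temperature) for i in kept]
    norm = sum(scaled)
    return {i: s / norm for i, s in zip(kept, scaled)}


dist = min_p_then_temperature([2.0, 1.5, 0.0, -3.0])
print(dist)
```

With these made-up logits only the two strongest tokens pass the min-p cut, and the higher temperature then flattens the distribution between them without letting the filtered tokens back in.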

## Training
Training was done in two phases of 2 epochs each on 2x [NVIDIA A6000 GPUs](https://www.nvidia.com/en-us/design-visualization/rtx-a6000/) using LoRA. In the first phase, the base model was trained for 2 epochs on RP data, and the resulting LoRA was applied to the base. In the second phase, the modified base was trained for 2 epochs on instruct data, and the new instruct LoRA was applied to the modified base, resulting in what you see here.
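
To make the "train a LoRA, apply it, train another, apply it" flow concrete, here is a toy sketch of the arithmetic behind applying a LoRA to a weight matrix, using the standard LoRA update `W' = W + (alpha / r) * B @ A`. The matrices, `alpha`, and rank below are made-up values for illustration; the actual training was done with Axolotl, and this is not the author's script.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def merge_lora(weight, lora_a, lora_b, alpha, rank):
    """Fold a LoRA update into a weight matrix: W + (alpha / rank) * B @ A."""
    scale = alpha / rank
    delta = matmul(lora_b, lora_a)
    return [[w + scale * d for w, d in zip(w_row, d_row)]
            for w_row, d_row in zip(weight, delta)]


# Toy 2x2 "base" weight and two rank-1 adapters (all values hypothetical).
base = [[1.0, 0.0], [0.0, 1.0]]
rp_a, rp_b = [[1.0, 0.0]], [[0.5], [0.0]]                # phase 1: RP LoRA
instruct_a, instruct_b = [[0.0, 1.0]], [[0.0], [0.25]]   # phase 2: instruct LoRA

# Phase 1: apply the RP LoRA to the base to get the modified base.
modified_base = merge_lora(base, rp_a, rp_b, alpha=1, rank=1)
# Phase 2: apply the instruct LoRA to the modified base.
final = merge_lora(modified_base, instruct_a, instruct_b, alpha=1, rank=1)
print(final)
```

The point of the sketch is the ordering: the second adapter is trained against, and merged into, weights that already contain the first adapter's changes.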

[<img src="https://raw.githubusercontent.com/axolotl-ai-cloud/axolotl/main/image/axolotl-badge-web.png" alt="Built with Axolotl" width="200" height="32"/>](https://github.com/axolotl-ai-cloud/axolotl)

## Special Thanks
Special thanks to my friends over at Anthracite! Without their help and Kalomaze starting the synthetic data script, none of this would have been possible.
I also want to thank my friends in The Chaotic Neutrals for their friendship, support, and guidance.