File size: 4,895 Bytes
234b30d
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1814bcf
 
db9b8e8
 
704f547
 
 
db9b8e8
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
704f547
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
234b30d
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
---
license: creativeml-openrail-m
datasets:
- prithivMLmods/Prompt-Enhancement-Mini
- gokaygokay/prompt-enhancement-75k
- gokaygokay/prompt-enhancer-dataset
language:
- en
base_model:
- Qwen/Qwen2.5-7B-Instruct
pipeline_tag: text-generation
library_name: transformers
tags:
- Qwen2.5
- Prompt_Enhance
- 7B
- Instruct
- safetensors
- pytorch
- Promptist-Instruct
- text-generation-inference
- art
---

### Novaeus-Promptist-7B-Instruct Uploaded Model Files

The **Novaeus-Promptist-7B-Instruct** is a fine-tuned large language model derived from the **Qwen2.5-7B-Instruct** base model. It is optimized for **prompt enhancement, text generation**, and **instruction-following tasks**, providing high-quality outputs tailored to various applications.

| **File Name [ Uploaded Files ]**                              | **Size**      | **Description**                          | **Upload Status** |
|--------------------------------------------|---------------|------------------------------------------|-------------------|
| `.gitattributes`                           | 1.57 kB       | Git attributes configuration for LFS.    | Uploaded          |
| `README.md`                                | 400 Bytes     | Documentation about the model.           | Updated           |
| `added_tokens.json`                        | 657 Bytes     | Custom tokens for tokenizer.             | Uploaded          |
| `config.json`                              | 860 Bytes     | Configuration for the model.             | Uploaded          |
| `generation_config.json`                   | 281 Bytes     | Configuration for text generation.       | Uploaded          |
| `merges.txt`                               | 1.82 MB       | Byte-pair encoding (BPE) merge rules.    | Uploaded          |
| `pytorch_model-00001-of-00004.bin`         | 4.88 GB       | Model weights (split part 1).            | Uploaded (LFS)    |
| `pytorch_model-00002-of-00004.bin`         | 4.93 GB       | Model weights (split part 2).            | Uploaded (LFS)    |
| `pytorch_model-00003-of-00004.bin`         | 4.33 GB       | Model weights (split part 3).            | Uploaded (LFS)    |
| `pytorch_model-00004-of-00004.bin`         | 1.09 GB       | Model weights (split part 4).            | Uploaded (LFS)    |
| `pytorch_model.bin.index.json`             | 28.1 kB       | Index file for model weights.            | Uploaded          |
| `special_tokens_map.json`                  | 644 Bytes     | Map of special tokens for tokenizer.     | Uploaded          |
| `tokenizer.json`                           | 11.4 MB       | Tokenizer data in JSON format.           | Uploaded (LFS)    |
| `tokenizer_config.json`                    | 7.73 kB       | Tokenizer configuration file.            | Uploaded          |
| `vocab.json`                               | 2.78 MB       | Vocabulary for tokenizer.                | Uploaded          |

---

### **Key Features:**

1. **Prompt Refinement:**  
   Designed to enhance input prompts by rephrasing, clarifying, and optimizing for more precise outcomes.

2. **Instruction Following:**  
   Accurately follows complex user instructions for various generation tasks, including creative writing, summarization, and question answering.

3. **Customization and Fine-Tuning:**  
   Incorporates datasets specifically curated for prompt optimization, enabling seamless adaptation to specific user needs.

---

### **Training Details:**
- **Base Model:** [Qwen2.5-7B-Instruct](#)
- **Datasets Used for Fine-Tuning:**
  - **gokaygokay/prompt-enhancer-dataset:** Focuses on prompt engineering with 17.9k samples.
  - **gokaygokay/prompt-enhancement-75k:** Encompasses a wider array of prompt styles with 73.2k samples.
  - **prithivMLmods/Prompt-Enhancement-Mini:** A compact dataset (1.16k samples) for iterative refinement.

---
### **Capabilities:**

- **Prompt Optimization:**  
   Automatically refines and enhances user-input prompts for better generation results.
  
- **Instruction-Based Text Generation:**  
   Supports diverse tasks, including:
   - Creative writing (stories, poems, scripts).
   - Summaries and paraphrasing.
   - Custom Q&A systems.

- **Efficient Fine-Tuning:**  
   Adaptable to additional fine-tuning tasks by leveraging the model's existing high-quality instruction-following capabilities.

---

### **Usage Instructions:**

1. **Setup:**  
   - Ensure all necessary model files, including shards, tokenizer configurations, and index files, are downloaded and placed in the correct directory.

2. **Load Model:**  
   Use PyTorch or Hugging Face Transformers to load the model and tokenizer. Ensure `pytorch_model.bin.index.json` is correctly set for efficient shard-based loading.

3. **Customize Generation:**  
   Adjust parameters in `generation_config.json` to control aspects such as temperature, top-p sampling, and maximum sequence length.

---