prithivMLmods's picture
Update README.md
e75c1fb verified
|
raw
history blame
4.89 kB
---
license: creativeml-openrail-m
datasets:
- prithivMLmods/Prompt-Enhancement-Mini
- gokaygokay/prompt-enhancement-75k
- gokaygokay/prompt-enhancer-dataset
language:
- en
base_model:
- Qwen/Qwen2.5-7B-Instruct
pipeline_tag: text-generation
library_name: transformers
tags:
- Qwen2.5
- Prompt_Enhance
- 7B
- Instruct
- safetensors
- pytorch
- Promptist-Instruct
- text-generation-inference
- art
---
### Novaeus-Promptist-7B-Instruct Uploaded Model Files
The **Novaeus-Promptist-7B-Instruct** is a fine-tuned large language model derived from the **Qwen2.5-7B-Instruct** base model. It is optimized for **prompt enhancement, text generation**, and **instruction-following tasks**, providing high-quality outputs tailored to various applications.
| **File Name [ Uploaded Files ]** | **Size** | **Description** | **Upload Status** |
|--------------------------------------------|---------------|------------------------------------------|-------------------|
| `.gitattributes` | 1.57 kB | Git attributes configuration for LFS. | Uploaded |
| `README.md` | 400 Bytes | Documentation about the model. | Updated |
| `added_tokens.json` | 657 Bytes | Custom tokens for tokenizer. | Uploaded |
| `config.json` | 860 Bytes | Configuration for the model. | Uploaded |
| `generation_config.json` | 281 Bytes | Configuration for text generation. | Uploaded |
| `merges.txt` | 1.82 MB | Byte-pair encoding (BPE) merge rules. | Uploaded |
| `pytorch_model-00001-of-00004.bin` | 4.88 GB | Model weights (split part 1). | Uploaded (LFS) |
| `pytorch_model-00002-of-00004.bin` | 4.93 GB | Model weights (split part 2). | Uploaded (LFS) |
| `pytorch_model-00003-of-00004.bin` | 4.33 GB | Model weights (split part 3). | Uploaded (LFS) |
| `pytorch_model-00004-of-00004.bin` | 1.09 GB | Model weights (split part 4). | Uploaded (LFS) |
| `pytorch_model.bin.index.json` | 28.1 kB | Index file for model weights. | Uploaded |
| `special_tokens_map.json` | 644 Bytes | Map of special tokens for tokenizer. | Uploaded |
| `tokenizer.json` | 11.4 MB | Tokenizer data in JSON format. | Uploaded (LFS) |
| `tokenizer_config.json` | 7.73 kB | Tokenizer configuration file. | Uploaded |
| `vocab.json` | 2.78 MB | Vocabulary for tokenizer. | Uploaded |
---
### **Key Features:**
1. **Prompt Refinement:**
Designed to enhance input prompts by rephrasing, clarifying, and optimizing for more precise outcomes.
2. **Instruction Following:**
Accurately follows complex user instructions for various generation tasks, including creative writing, summarization, and question answering.
3. **Customization and Fine-Tuning:**
Incorporates datasets specifically curated for prompt optimization, enabling seamless adaptation to specific user needs.
---
### **Training Details:**
- **Base Model:** [Qwen2.5-7B-Instruct](#)
- **Datasets Used for Fine-Tuning:**
- **gokaygokay/prompt-enhancer-dataset:** Focuses on prompt engineering with 17.9k samples.
- **gokaygokay/prompt-enhancement-75k:** Encompasses a wider array of prompt styles with 73.2k samples.
- **prithivMLmods/Prompt-Enhancement-Mini:** A compact dataset (1.16k samples) for iterative refinement.
---
### **Capabilities:**
- **Prompt Optimization:**
Automatically refines and enhances user-input prompts for better generation results.
- **Instruction-Based Text Generation:**
Supports diverse tasks, including:
- Creative writing (stories, poems, scripts).
- Summaries and paraphrasing.
- Custom Q&A systems.
- **Efficient Fine-Tuning:**
Adaptable to additional fine-tuning tasks by leveraging the model's existing high-quality instruction-following capabilities.
---
### **Usage Instructions:**
1. **Setup:**
- Ensure all necessary model files, including shards, tokenizer configurations, and index files, are downloaded and placed in the correct directory.
2. **Load Model:**
Use PyTorch or Hugging Face Transformers to load the model and tokenizer. Ensure `pytorch_model.bin.index.json` is correctly set for efficient shard-based loading.
3. **Customize Generation:**
Adjust parameters in `generation_config.json` to control aspects such as temperature, top-p sampling, and maximum sequence length.
---