
Model Card for boda/mistral-7b-story-generation-24k

Fine-tuned using QLoRA for the story generation task.

Model Description

We fine-tune the model for story generation on the dataset from "Hierarchical Neural Story Generation" (WritingPrompts).

The input to the model is structured as follows:

```

### Instruction: Below is a story idea. Write a short story based on this context.

### Input: [story idea here]

### Response:

```
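The template above can be filled in programmatically. A minimal sketch (the helper `build_prompt` is hypothetical, not part of the released code):

```python
def build_prompt(story_idea: str) -> str:
    """Format a story idea into the instruction template the model was fine-tuned on."""
    return (
        "### Instruction: Below is a story idea. "
        "Write a short story based on this context.\n\n"
        f"### Input: {story_idea}\n\n"
        "### Response:"
    )

prompt = build_prompt("A lighthouse keeper discovers a message in a bottle.")
```

The model's completion after `### Response:` is the generated story.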

  • Developed by: Abdelrahman ’Boda’ Sadallah, Anastasiia Demidova, Daria Kotova
  • Model type: Causal LM
  • Language(s) (NLP): English
  • Finetuned from model: mistralai/Mistral-7B-v0.1

Uses

The model is the result of our AI project. If you intend to use it, please refer to the repo.
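Since this repository ships a QLoRA adapter rather than full weights, it has to be loaded on top of the base model. A minimal sketch using `transformers` and `peft` (assumes both libraries plus `bitsandbytes` are installed and a CUDA GPU is available):

```python
# Sketch: attach the QLoRA adapter to the 4-bit base model.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base = AutoModelForCausalLM.from_pretrained(
    "mistralai/Mistral-7B-v0.1",
    load_in_4bit=True,   # matches the 4-bit training setup below
    device_map="auto",
)
model = PeftModel.from_pretrained(base, "boda/mistral-7b-story-generation-24k")
tokenizer = AutoTokenizer.from_pretrained("mistralai/Mistral-7B-v0.1")
```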

Recommendations

To improve story generation, you can experiment with the sampling parameters: temperature, top_p/top_k, repetition_penalty, etc.

Training Details

Training Data

GitHub repo for the dataset: https://github.com/kevalnagda/StoryGeneration

Evaluation

Testing Data, Factors & Metrics

We evaluate on the test split of the same dataset.

Metrics

We are using perplexity and BERTScore.

Results

Perplexity: 8.8647

BERTScore: 80.76

Training procedure

The following bitsandbytes quantization config was used during training:

  • quant_method: bitsandbytes
  • load_in_8bit: False
  • load_in_4bit: True
  • llm_int8_threshold: 6.0
  • llm_int8_skip_modules: None
  • llm_int8_enable_fp32_cpu_offload: False
  • llm_int8_has_fp16_weight: False
  • bnb_4bit_quant_type: fp4
  • bnb_4bit_use_double_quant: False
  • bnb_4bit_compute_dtype: float32
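The settings above correspond to a `BitsAndBytesConfig` in `transformers`. A sketch of how they would be expressed in code (assumes `transformers` and `bitsandbytes` are installed; fields left at their defaults are omitted):

```python
# Sketch: the 4-bit quantization config used during training.
import torch
from transformers import BitsAndBytesConfig

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="fp4",
    bnb_4bit_use_double_quant=False,
    bnb_4bit_compute_dtype=torch.float32,
)
# Passed to AutoModelForCausalLM.from_pretrained(..., quantization_config=bnb_config)
```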

Framework versions

  • PEFT 0.6.0.dev0