Model Description

mitya is a model based on Dostoevsky's renowned works, including but not limited to The Brothers Karamazov, Crime and Punishment, and Demons. it is fine-tuned from the open-source model Mistral-7B-v0.3 and comes in 6 distinct quantizations on Ollama.

Intended Cause

before everyone, for everyone and everything

Bias, Risks, and Limitations

this model was initially trained on 7,000 question-answer pairs with LoRA, and the adapter was later merged into its base model. given the limited number of training examples it was fine-tuned on, expect minor errors, if any (for i spitefully claim), with regard to its syntax and so on.

Usage

  • Inference:

from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

path = "tri282/dostoevskyGPT_merged"

tokenizer = AutoTokenizer.from_pretrained(path)
model = AutoModelForCausalLM.from_pretrained(path)

input_text = "your text here"
inputs = tokenizer(input_text, return_tensors = "pt")

with torch.no_grad():
    outputs = model.generate(**inputs, max_new_tokens = 250)

output_text = tokenizer.decode(outputs[0], skip_special_tokens = True)
print(output_text)

  • Download:

from huggingface_hub import snapshot_download

path = "tri282/dostoevskyGPT_merged"
snapshot_download(repo_id = path, local_dir = "./your_directory_here")

Training Data

currently proprietary

Training Hyperparameters

  • Training regime: fp16 mixed precision
  • Epochs: 3
  • Learning Rate: 2e-4
  • Batch Size: 16
  • LoRA Rank: 64
  • LoRA Alpha: 96
  • LoRA Dropout: 0.1
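
to make the LoRA numbers above concrete, here is a minimal sketch of how an adapter is folded into a base weight. it assumes the standard LoRA scaling of alpha / rank; the card does not state which scaling variant or which target modules were used, and the tiny matrices are made-up stand-ins rather than real model weights. the helper names (`lora_scaling`, `matmul`, `merge_lora`) are hypothetical.

```python
# Toy illustration of merging a LoRA update into a base weight matrix.
# Rank and alpha are taken from the hyperparameters listed above; everything
# else (matrix values, helper names) is hypothetical and for illustration only.

LORA_RANK = 64
LORA_ALPHA = 96


def lora_scaling(alpha: float, rank: int) -> float:
    """Standard LoRA scaling factor (assumed here): alpha / rank."""
    return alpha / rank


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def merge_lora(weight, lora_b, lora_a, scaling):
    """Return weight + scaling * (lora_b @ lora_a), element by element."""
    delta = matmul(lora_b, lora_a)
    return [
        [w + scaling * d for w, d in zip(w_row, d_row)]
        for w_row, d_row in zip(weight, delta)
    ]


scaling = lora_scaling(LORA_ALPHA, LORA_RANK)  # 96 / 64

# Made-up 2x2 base weight with a rank-1 adapter (B is 2x1, A is 1x2).
base = [[1.0, 0.0], [0.0, 1.0]]
B = [[1.0], [2.0]]
A = [[0.5, 0.0]]

merged = merge_lora(base, B, A, scaling)
print("scaling:", scaling)
print("merged weight:", merged)
```

with alpha larger than rank, the adapter's contribution is scaled up rather than down before it is added to the base weight. once merged, the result is an ordinary weight matrix, so inference needs no adapter files.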

Speeds, Sizes, Times

this model was trained for 6 hours on an NVIDIA L4 GPU. it is roughly 27GB in float32 precision, with other quantizations available on Ollama.
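
as a rough sanity check on that size, the arithmetic can be sketched as below. it assumes the 7.25B parameter count reported for the safetensors checkpoint and counts only raw weight bytes (no file metadata, no tokenizer files). the float16 line is a hypothetical comparison, not a published artifact.

```python
# Back-of-the-envelope checkpoint size from parameter count and dtype width.
# PARAM_COUNT is the figure reported for the checkpoint; the float16 entry
# is only a hypothetical comparison.

PARAM_COUNT = 7.25e9
BYTES_PER_PARAM = {"float32": 4, "float16": 2}


def checkpoint_size_gib(n_params: float, dtype: str) -> float:
    """Raw weight size in GiB (2**30 bytes) for the given dtype."""
    return n_params * BYTES_PER_PARAM[dtype] / 2**30


fp32_gib = checkpoint_size_gib(PARAM_COUNT, "float32")
fp16_gib = checkpoint_size_gib(PARAM_COUNT, "float16")

print(f"float32: {fp32_gib:.1f} GiB")
print(f"float16: {fp16_gib:.1f} GiB")
```

the float32 result lands at about 27 when measured in GiB, which matches the size quoted above. loading the weights at half the width would roughly halve the memory needed.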

Evaluation

[Image: evaluation results]

Summary

i am well aware of my model's current limitations. that being said, i had a great time testing it out. i ask nothing but your great expectations for future optimizations and versions.

Citations

special thanks to Dostoevsky himself, cordially

Safetensors: model size 7.25B params, tensor type F32