metadata
language: fa
license: apache-2.0
tags:
- Farsi
Arnavāz (ارنواز)
Model Description: Arnavaz/gpt-arnavaz-beta is gpt2 language model that is fine-tuned using bolbolzaban/gpt2-persian pretrained model. bolbolzaban/gpt2-persian has been trained similar to gpt2-medium with differences in context size, tokenizer and language (Read more).
- Developed by: Rezā Latifi
- Model Type: Transformer-based language model
- Language: Persian (All characters other than the Persian alphabet are replaced with special tokens)
- License: Modified MIT License
- Related Models: bolbolzaban/gpt2-persian, gpt2-medium
- Resources for more information:
How to utilize
Using a pipeline for text generation, Arnavaz can be utilized like this:
from transformers import pipeline, AutoTokenizer, GPT2LMHeadModel, AutoConfig
tokenizer = AutoTokenizer.from_pretrained('Arnavaz/gpt2-arnavaz-beta')
model = GPT2LMHeadModel.from_pretrained('Arnavaz/gpt2-arnavaz-beta')
config = AutoConfig.from_pretrained('Arnavaz/gpt2-arnavaz-beta', max_length=512)
generator = pipeline('text-generation', model, tokenizer=tokenizer, config=config)
def getEloquent(ineloquent):
result = generator(f"[BOS]{ineloquent}[SEP]")[0]['generated_text']
return result[result.find('[SEP]')+5:]
sample = getEloquent('استفاده از کاغذ پاپیروس برای نوشتن کتاب از حدود دو هزار سال قبل از میلاد در مصر رایج شد.')