---
base_model:
- google/mt5-small
datasets:
- syubraj/roman2nepali-transliteration
language:
- ne
- en
library_name: transformers
license: apache-2.0
pipeline_tag: translation
tags:
- nepali
- roman english
- translation
- transliteration
new_version: syubraj/romaneng2nep
---

# Model Card for syubraj/RomanEng2Nep-v2

The model was trained for 8,500 steps on a dataset of fewer than 110k examples.

### Model Description

This is the model card of a 🤗 transformers model that has been pushed to the Hub. The model is fine-tuned from google/mt5-small to transliterate Romanized Nepali into Nepali. This model card has been automatically generated.

- **Model type:** google/mt5-small
- **Language(s) (NLP):** Nepali, English
- **License:** Apache License 2.0
- **Finetuned from model:** google/mt5-small

### Model Sources [optional]

- **Repository:** [More Information Needed]
- **Paper [optional]:** [More Information Needed]
- **Demo [optional]:** [More Information Needed]

## How to Get Started with the Model

Use the code below to get started with the model.

```Python
from transformers import AutoTokenizer, MT5ForConditionalGeneration

checkpoint = "syubraj/RomanEng2Nep-v2"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = MT5ForConditionalGeneration.from_pretrained(checkpoint)

# Maximum input length in tokens; longer inputs are truncated
max_seq_len = 20

def translate(text):
    # Tokenize the input text, truncating to max_seq_len tokens
    inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=max_seq_len)

    # Generate the output token IDs
    translated = model.generate(**inputs)

    # Decode the generated tokens back to text
    translated_text = tokenizer.decode(translated[0], skip_special_tokens=True)
    return translated_text

# Example usage
source_text = "timilai kasto cha?"  # Example Romanized Nepali text
translated_text = translate(source_text)
print(f"Translated Text: {translated_text}")
```

### Training Data

[syubraj/roman2nepali-transliteration](https://huggingface.co/datasets/syubraj/roman2nepali-transliteration)

#### Training Hyperparameters

- **Training regime:**

```Python
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="/content/drive/MyDrive/romaneng2nep_v2/",
    eval_strategy="steps",
    learning_rate=2e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=8,
    weight_decay=0.01,
    save_total_limit=3,
    num_train_epochs=2,
    predict_with_generate=True,
)
```

## Training and Validation Metrics

| Step | Training Loss | Validation Loss | Gen Len |
|------|---------------|-----------------|---------|
| 500 | 21.636200 | 9.776628 | 2.001900 |
| 1000 | 10.103400 | 6.105016 | 2.077900 |
| 1500 | 6.830800 | 5.081259 | 3.811600 |
| 2000 | 6.003100 | 4.702793 | 4.237300 |
| 2500 | 5.690200 | 4.469123 | 4.700000 |
| 3000 | 5.443100 | 4.274406 | 4.808300 |
| 3500 | 5.265300 | 4.121417 | 4.749400 |
| 4000 | 5.128500 | 3.989708 | 4.782300 |
| 4500 | 5.007200 | 3.885391 | 4.805100 |
| 5000 | 4.909600 | 3.787640 | 4.874800 |
| 5500 | 4.836000 | 3.715750 | 4.855500 |
| 6000 | 4.733000 | 3.640963 | 4.962000 |
| 6500 | 4.673500 | 3.587330 | 5.011600 |
| 7000 | 4.623800 | 3.531883 | 5.068300 |
| 7500 | 4.567400 | 3.481622 | 5.108500 |
| 8000 | 4.523200 | 3.445404 | 5.092700 |
| 8500 | 4.464000 | 3.413630 | 5.132700 |
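
The step count and the per-device batch size reported in this card give a rough estimate of how many training examples the run processed. The snippet below is a minimal sketch of that arithmetic, assuming a single device and no gradient accumulation; neither is stated in this card, and the helper name `examples_seen` is hypothetical.

```python
def examples_seen(steps, per_device_batch_size, num_devices=1, grad_accum_steps=1):
    """Estimate how many training examples were processed after `steps` optimizer steps.

    num_devices and grad_accum_steps default to 1; both are assumptions,
    since the card does not report them.
    """
    return steps * per_device_batch_size * num_devices * grad_accum_steps


# Values taken from this card: 8500 steps, per_device_train_batch_size=16
total = examples_seen(steps=8500, per_device_batch_size=16)
print(total)  # 136000
```

Under those assumptions the run processed about 136,000 examples, which is more than one full pass over a dataset of fewer than 110k examples.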