---
base_model:
- google/mt5-small
datasets:
- syubraj/roman2nepali-transliteration
language:
- ne
- en
library_name: transformers
license: apache-2.0
pipeline_tag: translation
tags:
- nepali
- roman english
- translation
- transliteration
new_version: syubraj/romaneng2nep
---
# Model Card for syubraj/RomanEng2Nep-v2
This model was trained for 8,500 steps on a dataset of just under 110k examples.
### Model Description
This model converts Romanized (Latin-script) Nepali into Devanagari Nepali. It is a fine-tune of [google/mt5-small](https://huggingface.co/google/mt5-small) on the [syubraj/roman2nepali-transliteration](https://huggingface.co/datasets/syubraj/roman2nepali-transliteration) dataset.
- **Model type:** mT5 sequence-to-sequence (encoder-decoder) transformer
- **Language(s) (NLP):** Nepali, English (Romanized)
- **License:** Apache 2.0
- **Finetuned from model:** [google/mt5-small](https://huggingface.co/google/mt5-small)
## How to Get Started with the Model
Use the code below to get started with the model.
```Python
from transformers import AutoTokenizer, MT5ForConditionalGeneration

checkpoint = "syubraj/RomanEng2Nep-v2"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = MT5ForConditionalGeneration.from_pretrained(checkpoint)

# Set max sequence length
max_seq_len = 20

def translate(text):
    # Tokenize the input text with a max length of 20
    inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=max_seq_len)
    # Generate translation
    translated = model.generate(**inputs)
    # Decode the translated tokens back to text
    translated_text = tokenizer.decode(translated[0], skip_special_tokens=True)
    return translated_text

# Example usage
source_text = "timilai kasto cha?"  # Example Romanized Nepali text
translated_text = translate(source_text)
print(f"Translated Text: {translated_text}")
```
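To translate several inputs in one call, a batched variant along these lines should work (a sketch reusing the same `tokenizer` and `model`; the second example sentence is illustrative only):

```Python
def translate_batch(texts):
    # Pad the batch to a common length so generate() can run on all inputs at once
    inputs = tokenizer(texts, return_tensors="pt", padding=True,
                       truncation=True, max_length=max_seq_len)
    outputs = model.generate(**inputs)
    return tokenizer.batch_decode(outputs, skip_special_tokens=True)

print(translate_batch(["timilai kasto cha?", "ma ghar jaadai chu"]))  # illustrative inputs
```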
### Training Data
[syubraj/roman2nepali-transliteration](https://huggingface.co/datasets/syubraj/roman2nepali-transliteration)
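To inspect the data yourself, it can be loaded with 🤗 `datasets` (a quick sketch; the split and column names should be checked against the dataset card):

```Python
from datasets import load_dataset

# Load the Romanized-Nepali-to-Nepali pairs used for fine-tuning.
dataset = load_dataset("syubraj/roman2nepali-transliteration")
print(dataset)               # available splits and columns
print(dataset["train"][0])   # one example pair (assumes a "train" split)
```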
#### Training Hyperparameters
- **Training regime:**
```Python
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="/content/drive/MyDrive/romaneng2nep_v2/",
    eval_strategy="steps",
    learning_rate=2e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=8,
    weight_decay=0.01,
    save_total_limit=3,
    num_train_epochs=2,
    predict_with_generate=True,
)
```
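For context, a minimal sketch of how these arguments would typically be passed to a `Seq2SeqTrainer` (the tokenized dataset variables below are placeholders, not taken from this card):

```Python
from transformers import Seq2SeqTrainer, DataCollatorForSeq2Seq

# Sketch only: `tokenized_train` and `tokenized_eval` stand in for pre-tokenized
# splits of the dataset above; they are not defined in this card.
data_collator = DataCollatorForSeq2Seq(tokenizer, model=model)

trainer = Seq2SeqTrainer(
    model=model,
    args=training_args,
    train_dataset=tokenized_train,
    eval_dataset=tokenized_eval,
    tokenizer=tokenizer,
    data_collator=data_collator,
)
trainer.train()
```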
## Training and Validation Metrics
| Step | Training Loss | Validation Loss | Gen Len |
|------|---------------|-----------------|---------|
| 500 | 21.636200 | 9.776628 | 2.001900 |
| 1000 | 10.103400 | 6.105016 | 2.077900 |
| 1500 | 6.830800 | 5.081259 | 3.811600 |
| 2000 | 6.003100 | 4.702793 | 4.237300 |
| 2500 | 5.690200 | 4.469123 | 4.700000 |
| 3000 | 5.443100 | 4.274406 | 4.808300 |
| 3500 | 5.265300 | 4.121417 | 4.749400 |
| 4000 | 5.128500 | 3.989708 | 4.782300 |
| 4500 | 5.007200 | 3.885391 | 4.805100 |
| 5000 | 4.909600 | 3.787640 | 4.874800 |
| 5500 | 4.836000 | 3.715750 | 4.855500 |
| 6000 | 4.733000 | 3.640963 | 4.962000 |
| 6500 | 4.673500 | 3.587330 | 5.011600 |
| 7000 | 4.623800 | 3.531883 | 5.068300 |
| 7500 | 4.567400 | 3.481622 | 5.108500 |
| 8000 | 4.523200 | 3.445404 | 5.092700 |
| 8500 | 4.464000 | 3.413630 | 5.132700 |