FINGU-AI/QWEN2.5-7B-Bnk-3e

+---
+language:
+- ko
+- uz
+- en
+- ru
+- zh
+- ja
+- km
+- my
+- si
+- tl
+- th
+- vi
+- kk
+- bn
+- mn
+- id
+- ne
+- pt
+tags:
+- translation
+- multilingual
+- korean
+- uzbek
+datasets:
+- custom_parallel_corpus
+license: mit
+---
+# QWEN2.5-7B-Bnk-7e
+## Model Description
+QWEN2.5-7B-Bnk-5e is a multilingual translation model built on the Qwen2.5 architecture with 7 billion parameters. It specializes in translating from the languages listed under Supported Languages into Korean and Uzbek.
+## Intended Uses & Limitations
+The model is designed for translating text from various Asian and European languages to Korean and Uzbek. It can be used for tasks such as:
+- Multilingual document translation
+- Cross-lingual information retrieval
+- Language learning applications
+- International communication assistance
+The model can produce inaccurate translations, especially for idiomatic expressions and highly context-dependent content.
+## Training and Evaluation Data
+The model was fine-tuned on a diverse dataset of parallel texts covering the supported languages. Evaluation was performed on held-out test sets for each language pair.
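This card does not describe how the held-out sets were built. As an illustration only, a per-language-pair split could look like the sketch below. The function name `split_by_pair`, the tuple layout, the 10% test fraction, and the seed are all assumptions, not details of the actual training setup.

```python
import random
from collections import defaultdict


def split_by_pair(examples, test_fraction=0.1, seed=0):
    """Split (src_lang, tgt_lang, src_text, tgt_text) tuples per language pair.

    Hypothetical helper: every language pair contributes its own held-out
    examples, so evaluation can be reported pair by pair.
    """
    by_pair = defaultdict(list)
    for ex in examples:
        by_pair[(ex[0], ex[1])].append(ex)

    rng = random.Random(seed)  # fixed seed keeps the split reproducible
    train, test = [], []
    for pair in sorted(by_pair):
        items = by_pair[pair][:]
        rng.shuffle(items)
        # Hold out at least one example when the pair has two or more.
        n_test = max(1, int(len(items) * test_fraction)) if len(items) > 1 else 0
        test.extend(items[:n_test])
        train.extend(items[n_test:])
    return train, test
```

Splitting within each pair, rather than over the pooled corpus, keeps low-resource pairs from ending up with an empty test set.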
+## Training Procedure
+Fine-tuning was performed on the Qwen2.5-7B base model using custom parallel datasets for the supported language pairs.
+## Supported Languages
+The model supports translation from the following languages to Korean and Uzbek:
+- Kazakh (kk)
+- Russian (ru)
+- Thai (th)
+- Chinese (Simplified) (zh)
+- Chinese (Traditional) (zh-tw, zh-hant)
+- Bengali (bn)
+- Mongolian (mn)
+- Indonesian (id)
+- Nepali (ne)
+- English (en)
+- Khmer (km)
+- Portuguese (pt)
+- Sinhala (si)
+- Korean (ko)
+- Tagalog (tl)
+- Burmese (my)
+- Vietnamese (vi)
+- Japanese (ja)
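The language list above can be enforced before a request reaches the model. The following is a minimal sketch, not part of the released code: `SOURCE_LANGS`, `TARGET_LANGS`, and `build_prompt` are hypothetical names, and the prompt wording is assumed to follow the usage example in this card.

```python
# Hypothetical helper: language codes copied from the list in this card.
SOURCE_LANGS = {
    "kk", "ru", "th", "zh", "zh-tw", "zh-hant", "bn", "mn", "id",
    "ne", "en", "km", "pt", "si", "ko", "tl", "my", "vi", "ja",
}
TARGET_LANGS = {"ko", "uz"}


def build_prompt(source_text: str, source_lang: str, target_lang: str) -> str:
    """Validate the language pair and return a translation prompt."""
    src = source_lang.lower()
    tgt = target_lang.lower()
    if src not in SOURCE_LANGS:
        raise ValueError(f"Unsupported source language: {source_lang!r}")
    if tgt not in TARGET_LANGS:
        raise ValueError(f"Unsupported target language: {target_lang!r}")
    if src == tgt:
        raise ValueError("Source and target languages must differ")
    return f"Translate from {src} to {tgt}: {source_text}"
```

Rejecting unsupported pairs early avoids sending the model a request it was not fine-tuned for, where it may still return fluent but unreliable output.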
+## How to Use
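+```python
+from transformers import AutoModelForCausalLM, AutoTokenizer
+
+# Qwen2.5 is a decoder-only architecture, so load it as a causal LM
+# (AutoModelForSeq2SeqLM does not support Qwen2 checkpoints).
+model_name = "FINGU-AI/QWEN2.5-7B-Bnk-5e"
+tokenizer = AutoTokenizer.from_pretrained(model_name)
+model = AutoModelForCausalLM.from_pretrained(model_name)
+
+# Example usage
+source_text = "Hello, how are you?"
+source_lang = "en"
+target_lang = "ko"  # or "uz" for Uzbek
+input_text = f"Translate from {source_lang} to {target_lang}: {source_text}"
+inputs = tokenizer(input_text, return_tensors="pt").to(model.device)
+
+# Cap the number of generated tokens, not the prompt-plus-output length.
+outputs = model.generate(**inputs, max_new_tokens=100)
+
+# A causal LM echoes the prompt, so decode only the newly generated tokens.
+new_tokens = outputs[0][inputs["input_ids"].shape[1]:]
+translated_text = tokenizer.decode(new_tokens, skip_special_tokens=True)
+print(translated_text)
+```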
+## Performance
+## Limitations
+- The model's performance may vary across different language pairs and domains.
+- It may struggle with very colloquial or highly specialized text.
+- The model may not always capture cultural nuances or context-dependent meanings accurately.
+## Ethical Considerations
+- The model should not be used for generating or propagating harmful, biased, or misleading content.
+- Users should be aware of potential biases in the training data that may affect translations.
+- The model's outputs should not be considered as certified translations for official or legal purposes without human verification.
+## Citation
+```bibtex
+@misc{fingu2023qwen25,
+  author = {FINGU AI and AI Team},
+  title = {QWEN2.5-7B-Bnk-7e: A Multilingual Translation Model},
+  year = {2024},
+  publisher = {Hugging Face},
+  journal = {Hugging Face Model Hub},
+  howpublished = {\url{https://huggingface.co/FINGU-AI/QWEN2.5-7B-Bnk-5e}}
+}