---
license: gpl
datasets:
- roneneldan/TinyStoriesInstruct
language:
- ta
- en
library_name: transformers
inference:
  parameters:
    max_new_tokens: 120
    repetition_penalty: 1.4
    temperature: 0.01
widget:
- text: |
    சொற்கள்:
    வீழ்ச்சி, சீட்டு, பிடிவாதம்
    சுருக்கம்:
  example_title: Tamil Story with words 1
- text: |
    சொற்கள்:
    ஓட்டம், பயணம், குழப்பம்
    சுருக்கம்:
  example_title: Tamil Story with words 2
- text: |
    சொற்கள்:
    உதவி, பதிவு, சங்கடம்
    சுருக்கம்:
  example_title: Tamil Story with words 3
- text: |
    சொற்கள்:
    வாக்குறுதி, எலி, பெரியது
    சுருக்கம்:
  example_title: Tamil Story with words 4
- text: |
    Words: prevent, car, broken
    Features: Dialogue, Twist
  example_title: Story in English
- text: |
    சொற்கள்:
    திரும்பு, வாசனை திரவியம், துணிச்சல்
    சுருக்கம்:
  example_title: Tamil Story with words 5
---

## Tamillama_Tiny: A 30M tiny LLaMA model trained to tell stories in Tamil

### TL;DR:

This is an experimental model inspired by the paper [TinyStories: How Small Can Language Models Be and Still Speak Coherent English?](https://arxiv.org/abs/2305.07759).

It extends the same concept to Tamil: a 30M-parameter LLaMA-architecture model that outputs coherent Tamil is presented here.

Additional capabilities included in the model:

1. It is a multilingual model: it can output both English and Tamil stories.
2. It can also translate stories from English to Tamil and vice versa. To see the translation feature, set `max_new_tokens` > 512.
3. The original stories from the TinyStories dataset were translated using [IndicTrans](https://ai4bharat.iitm.ac.in/indic-trans).
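The widget examples on this card all follow the same keyword-conditioned prompt layout: a list of Tamil words under "சொற்கள்:" ("Words:") ending with "சுருக்கம்:" ("Summary:"), or a "Words:" line for English. As an illustration only (the helper function below is not part of the released code), a prompt in that layout can be assembled like this:

```python
def build_story_prompt(words, language="ta"):
    """Assemble a keyword-conditioned prompt in the layout of the widget examples.

    Illustrative helper, not part of the released model code. Tamil prompts
    list the words under "சொற்கள்:" and end with "சுருக்கம்:"; English
    prompts use a single "Words:" line.
    """
    if language == "ta":
        return "சொற்கள்:\n" + ", ".join(words) + "\nசுருக்கம்:"
    return "Words: " + ", ".join(words)

print(build_story_prompt(["வாக்குறுதி", "எலி", "பெரியது"]))
```

The resulting string can be passed directly to the tokenizer in the usage example below.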

For now, this is a toy model for researchers, students, and LLM enthusiasts to play with the linguistic capabilities of the model.

## Weights Release, License and Usage

We release the weights in two formats: Hugging Face Transformers format and GGML format, for use with CTransformers or llama.cpp.

This model is not fit for any practical purpose other than research and experimentation use cases.

Usage:

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("RajuKandasamy/tamillama_tiny_30m")
model = AutoModelForCausalLM.from_pretrained("RajuKandasamy/tamillama_tiny_30m")

prompt = """சொற்கள்:
வாக்குறுதி, எலி, பெரியது
சுருக்கம்:"""
input_ids = tokenizer(prompt, return_tensors="pt").input_ids

generation_output = model.generate(
    input_ids=input_ids, max_new_tokens=256
)
print(tokenizer.decode(generation_output[0]))
```
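The decoded string above echoes the prompt and may include special tokens. A minimal post-processing sketch, assuming the LLaMA-style `<s>`/`</s>` special tokens (the helper below is illustrative, not part of the released code):

```python
def extract_story(decoded, prompt):
    """Strip special tokens and the echoed prompt from a decoded generation.

    Illustrative helper; assumes LLaMA-style "<s>"/"</s>" special tokens.
    """
    # Remove special tokens, then drop the prompt prefix if it was echoed back.
    text = decoded.replace("<s>", "").replace("</s>", "").strip()
    if text.startswith(prompt):
        text = text[len(prompt):].lstrip()
    return text

sample = "<s> சொற்கள்:\nஎலி\nசுருக்கம்: ஒரு நாள்...</s>"
print(extract_story(sample, "சொற்கள்:\nஎலி\nசுருக்கம்:"))
```

Alternatively, passing `skip_special_tokens=True` to `tokenizer.decode` removes the special tokens at decode time.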