---
license: gpl
datasets:
- roneneldan/TinyStoriesInstruct
language:
- ta
- en
library_name: transformers
inference:
  parameters:
    max_new_tokens: 120
    repetition_penalty: 1.4
    temperature: 0.01
widget:
- text: |
    சொற்கள்:
    வீழ்ச்சி, சீட்டு, பிடிவாதம்
    சுருக்கம்:
  example_title: Tamil Story with words 1
- text: |
    சொற்கள்:
    ஓட்டம், பயணம், குழப்பம்
    சுருக்கம்:
  example_title: Tamil Story with words 2
- text: |
    சொற்கள்:
    உதவி, பதிவு, சங்கடம்
    சுருக்கம்:
  example_title: Tamil Story with words 3
- text: |
    சொற்கள்:
    வாக்குறுதி, எலி, பெரியது
    சுருக்கம்:
  example_title: Tamil Story with words 4
- text: |
    Words: prevent, car, broken
    Features: Dialogue, Twist
  example_title: Story in English
- text: |
    சொற்கள்:
    திரும்பு, வாசனை திரவியம், துணிச்சல்
    சுருக்கம்:
  example_title: Tamil Story with words 5
---
## Tamillama_Tiny: A 30M tiny llama model trained to tell stories in Tamil
### TL;DR:
This is an experimental model inspired by the paper https://arxiv.org/abs/2305.07759 - How Small Can Language Models Be and Still Speak Coherent English?
It extends the same concept to Tamil: a 30M-parameter LLaMA-architecture model that outputs coherent Tamil is presented here.
The model also includes some additional experiments:
1. It is a multilingual model: it can output both English and Tamil stories.
2. It can also translate stories from English to Tamil and vice versa. To see the translation feature, set `max_new_tokens` > 512.
3. The original stories from the TinyStories dataset were translated using [IndicTrans](https://ai4bharat.iitm.ac.in/indic-trans).
For now, this is a toy model for researchers, students and LLM enthusiasts to explore its linguistic capability.
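
The prompt layout used by the Tamil widget examples and the translation tip in point 2 can be combined into a minimal sketch. The helper names `build_tamil_prompt` and `generate`, the budget of 600 tokens (any value above 512 should do) and the use of `do_sample=True` are assumptions made for illustration; only the model ID, the prompt layout and the two decoding values come from this card.

```python
from typing import Dict, List

MODEL_ID = "RajuKandasamy/tamillama_tiny_30m"

# Decoding settings copied from the widget configuration in this card.
WIDGET_PARAMS: Dict[str, float] = {"repetition_penalty": 1.4, "temperature": 0.01}


def build_tamil_prompt(words: List[str]) -> str:
    """Lay out the words the way the Tamil widget examples do (hypothetical helper)."""
    return "சொற்கள்:\n" + ", ".join(words) + "\nசுருக்கம்:"


def generate(prompt: str, max_new_tokens: int = 600) -> str:
    """Generate a story; a budget above 512 leaves room for the translation."""
    # Imported lazily so the prompt helper works without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID)
    input_ids = tokenizer(prompt, return_tensors="pt").input_ids
    output = model.generate(
        input_ids=input_ids,
        max_new_tokens=max_new_tokens,
        do_sample=True,  # assumption: temperature only takes effect when sampling
        **WIDGET_PARAMS,
    )
    return tokenizer.decode(output[0], skip_special_tokens=True)
```

Calling `generate(build_tamil_prompt(["வாக்குறுதி", "எலி", "பெரியது"]))` downloads the weights on first use and should return the story followed, if the budget allows, by its translation.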
## Weights Release, License and Usage
We release the weights in two formats: Hugging Face Transformers format, and GGML format for use with CTransformers or llama.cpp.
The model is not fit for any practical purpose other than research and experimentation.
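
For the GGML weights, a rough sketch with CTransformers might look like the following. The file path is a placeholder, since this card does not name the GGML file, and the helper names `load_ggml_model` and `tell_story` are hypothetical. The decoding values mirror the widget configuration above.

```python
from typing import Any

# Placeholder path: the card does not state the GGML file name.
GGML_MODEL_PATH = "path/to/tamillama_tiny_30m.ggml.bin"


def load_ggml_model(model_path: str = GGML_MODEL_PATH) -> Any:
    """Load GGML weights with CTransformers as a LLaMA-architecture model."""
    # Imported lazily so the rest of the sketch works without ctransformers.
    from ctransformers import AutoModelForCausalLM

    return AutoModelForCausalLM.from_pretrained(model_path, model_type="llama")


def tell_story(llm: Any, prompt: str, max_new_tokens: int = 120) -> str:
    """Continue the prompt using the decoding settings from the widget config."""
    return llm(
        prompt,
        max_new_tokens=max_new_tokens,
        temperature=0.01,
        repetition_penalty=1.4,
    )
```

With the weights downloaded locally, `tell_story(load_ggml_model(), "Words: prevent, car, broken\nFeatures: Dialogue, Twist\n")` would produce an English story; `tell_story` accepts any callable with the same keyword arguments, which keeps it easy to try out with a stub.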
Usage:
```python
from transformers import AutoTokenizer, AutoModelForCausalLM

# Load the tokenizer and the model weights from the Hugging Face Hub.
tokenizer = AutoTokenizer.from_pretrained("RajuKandasamy/tamillama_tiny_30m")
model = AutoModelForCausalLM.from_pretrained("RajuKandasamy/tamillama_tiny_30m")

# Prompt format: a "words" header, the words to use, then a "summary" header.
prompt = """சொற்கள்:
வாக்குறுதி, எலி, பெரியது
சுருக்கம்:"""

# Tokenize the prompt and generate up to 256 new tokens.
input_ids = tokenizer(prompt, return_tensors="pt").input_ids
generation_output = model.generate(
    input_ids=input_ids, max_new_tokens=256
)
print(tokenizer.decode(generation_output[0]))
```