# Model Card for ponniyinselvan_1.4b_alpha

This model is fine-tuned on the Ponniyin Selvan Tamil corpus.
## Model Details

The base model is EleutherAI's Pythia 1.4B.

### Model Description

- Finetuned from model: EleutherAI Pythia 1.4B
## Uses

Intended purely for educational and research purposes. Not fit for any kind of practical use.
## Bias, Risks, and Limitations

The bias, risks, and limitations of the base Pythia 1.4B model apply to this model as well.
## How to Get Started with the Model
```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

model_path = "RajuKandasamy/ponniyinselvan_1.4b_alpha"
device = "cuda" if torch.cuda.is_available() else "cpu"

# Load the model and tokenizer from the Hub
model = AutoModelForCausalLM.from_pretrained(model_path).to(device)
tokenizer = AutoTokenizer.from_pretrained(model_path)
model.eval()

# Tamil prompt: "Vandhiyathevan", a character from Ponniyin Selvan
prompt = """வந்தியத்தேவன்"""

input_ids = tokenizer.encode(prompt, return_tensors="pt").to(model.device)
attention_mask = torch.ones_like(input_ids).to(model.device)

print("Thinking ...\n ")
with torch.no_grad():
    output = model.generate(
        input_ids=input_ids,
        attention_mask=attention_mask,
        max_length=256,
        do_sample=True,
        temperature=0.9,
        top_p=0.9,
        top_k=500,
        repetition_penalty=1.2,
        pad_token_id=tokenizer.eos_token_id,
        eos_token_id=tokenizer.eos_token_id,
    )

output_str = tokenizer.decode(output[0], skip_special_tokens=False)
print(output_str)
```
## Training Details

Trained for 10 epochs.
### Training Data

The Ponniyin Selvan Tamil text corpus.
### Training Procedure

Causal language modelling with a custom BPE tokenizer.
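The model card does not include the tokenizer-training script; the following is a minimal sketch of how a custom BPE tokenizer could be built on the corpus with the Hugging Face `tokenizers` library. The corpus file name, vocabulary size, and special tokens below are assumptions for illustration, not values taken from this model.

```python
# Sketch only: assumed corpus file name, vocab size, and special tokens.
from tokenizers import Tokenizer, models, trainers, pre_tokenizers
from transformers import PreTrainedTokenizerFast

# Build a BPE tokenizer with a simple whitespace pre-tokenizer
tokenizer = Tokenizer(models.BPE(unk_token="<unk>"))
tokenizer.pre_tokenizer = pre_tokenizers.Whitespace()

trainer = trainers.BpeTrainer(
    vocab_size=32000,  # assumed value, not confirmed by this model card
    special_tokens=["<unk>", "<s>", "</s>"],
)

# Train on the raw Tamil corpus and save the tokenizer
tokenizer.train(files=["ponniyinselvan.txt"], trainer=trainer)
tokenizer.save("custom_bpe_tokenizer.json")

# Wrap it for use with the transformers library
hf_tokenizer = PreTrainedTokenizerFast(tokenizer_object=tokenizer, unk_token="<unk>")
```

Fine-tuning then proceeds as standard causal language modelling of Pythia 1.4B on the corpus encoded with this tokenizer.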