Gemma-2b-hindi:

Gemma-2b-hindi is a transformer model continually pretrained on 300 billion Hindi tokens, using Gemma-2-2b as the base model.

Hyperparameters:

learning_rate: 2E-4
weight_decay: 0.1
min_lr_ratio: 0.00225
warmup: 0.01
decay: 0.99
rewarmup: 0.01
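The fractions above describe the learning-rate schedule: 1% of steps for linear warmup, decay over the remaining 99% down to min_lr_ratio × peak, with rewarmup presumably used when resuming from the base checkpoint for continual pretraining. A minimal sketch of such a schedule, assuming linear warmup and cosine decay (the decay shape is an assumption; the card only lists the fractions):

```python
import math

# Values from the hyperparameter list above.
PEAK_LR = 2e-4
MIN_LR_RATIO = 0.00225
WARMUP_FRAC = 0.01


def learning_rate(step: int, total_steps: int) -> float:
    """Linear warmup, then cosine decay to MIN_LR_RATIO * PEAK_LR.

    The cosine shape is an illustrative assumption, not confirmed
    by the model card.
    """
    warmup_steps = max(1, int(WARMUP_FRAC * total_steps))
    if step < warmup_steps:
        # Linear ramp from ~0 up to the peak learning rate.
        return PEAK_LR * (step + 1) / warmup_steps
    decay_steps = max(1, total_steps - warmup_steps)
    progress = (step - warmup_steps) / decay_steps
    min_lr = MIN_LR_RATIO * PEAK_LR
    # Cosine anneal from PEAK_LR down to min_lr.
    return min_lr + 0.5 * (PEAK_LR - min_lr) * (1 + math.cos(math.pi * progress))
```

For a 10,000-step run this gives the peak rate at the end of warmup (step 99) and approaches 2e-4 × 0.00225 ≈ 4.5e-7 at the final step.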

Code:

The levanter repository was used to train the model on Google Cloud TPUs.
Research supported with Cloud TPUs from Google's TPU Research Cloud (TRC).
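Levanter runs are driven by YAML configuration files. A hypothetical fragment reflecting the hyperparameters above (field names are illustrative and not verified against the current levanter schema):

```yaml
# Illustrative only -- field names approximate a levanter training config.
optimizer:
  learning_rate: 2E-4
  weight_decay: 0.1
  min_lr_ratio: 0.00225
  warmup: 0.01
  decay: 0.99
  rewarmup: 0.01
```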

Contact:

If you have any queries or issues, reach out to: Kathir

Model size: 2.51B params (Safetensors)
Tensor type: BF16

Model tree for KathirKs/gemma-2b-hindi:

Base model: google/gemma-2-2b
