
bengali-t5-base

bengali-t5-base is a T5-base model trained on the Bengali portion of the mT5 dataset.

The model was trained during the Flax/JAX Community Week, organized by Hugging Face, with TPU usage sponsored by Google.

The model was trained on roughly 11B tokens (batch size 64, sequence length 512, 350k steps).
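The token count follows directly from the three figures in parentheses. A quick check, assuming every sequence in every batch is filled to the full 512 tokens (an assumption; padding would lower the real count):

```python
# Rough token budget from the figures quoted above.
# Assumption: every sequence is packed to the full 512 tokens.
batch_size = 64
seq_len = 512
steps = 350_000

tokens_per_step = batch_size * seq_len
total_tokens = tokens_per_step * steps

print(f"{tokens_per_step:,} tokens per step")
print(f"{total_tokens / 1e9:.2f}B tokens in total")
```

This gives about 11.5B tokens, consistent with the rounded figure above.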

Load tokenizer

>>> import transformers
>>> tokenizer = transformers.AutoTokenizer.from_pretrained("flax-community/bengali-t5-base")
>>> tokenizer.encode("আমি বাংলার গান গাই")
[93, 1912, 814, 5995, 3, 1]
>>> tokenizer.decode([93, 1912, 814, 5995, 3, 1])
'আমি বাংলার গান গাই </s>'

Load model

>>> from transformers import T5Config, FlaxT5ForConditionalGeneration
>>> config = T5Config.from_pretrained("flax-community/bengali-t5-base")
>>> model = FlaxT5ForConditionalGeneration.from_pretrained("flax-community/bengali-t5-base", config=config)

The model was trained with the de-noising objective, following the scripts here and here. Currently, this model has no generation capability. If you want it to generate text, please fine-tune it on the prefix-LM objective mentioned in the paper.
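To make the two objectives concrete, here is a minimal sketch of how training pairs could be built for each. It is not the training script used for this model: the function names `span_corrupt` and `prefix_lm_split` are hypothetical, the example works on whitespace-separated words rather than real subword tokens, the spans are hand-picked rather than sampled, and the `<extra_id_N>` sentinel names follow the usual T5 convention, which is assumed rather than confirmed for this checkpoint's vocabulary.

```python
from typing import List, Sequence, Tuple


def sentinel(i: int) -> str:
    """Name of the i-th sentinel token (usual T5 convention, assumed here)."""
    return f"<extra_id_{i}>"


def span_corrupt(
    tokens: Sequence[str], spans: Sequence[Tuple[int, int]]
) -> Tuple[List[str], List[str]]:
    """De-noising pair: each [start, end) span is replaced by a sentinel.

    `spans` must be sorted, non-empty and non-overlapping.
    """
    inputs: List[str] = []
    targets: List[str] = []
    cursor = 0
    for i, (start, end) in enumerate(spans):
        if not (cursor <= start < end <= len(tokens)):
            raise ValueError(f"bad span {(start, end)}")
        inputs.extend(tokens[cursor:start])  # keep the uncorrupted text
        inputs.append(sentinel(i))           # mark the hole in the input
        targets.append(sentinel(i))          # same sentinel introduces the answer
        targets.extend(tokens[start:end])    # the tokens that were dropped
        cursor = end
    inputs.extend(tokens[cursor:])           # remaining uncorrupted text
    targets.append(sentinel(len(spans)))     # closing sentinel ends the target
    return inputs, targets


def prefix_lm_split(
    tokens: Sequence[str], split: int
) -> Tuple[List[str], List[str]]:
    """Prefix-LM pair: the encoder sees tokens[:split], the decoder predicts the rest."""
    if not 0 < split < len(tokens):
        raise ValueError("split must leave both a prefix and a continuation")
    return list(tokens[:split]), list(tokens[split:])


# Illustration only: whitespace "tokens" and hand-picked spans.
words = "আমি বাংলার গান গাই".split()
corrupted, target = span_corrupt(words, [(1, 2), (3, 4)])
print(corrupted)  # input with sentinels in place of the dropped words
print(target)     # dropped words, each introduced by its sentinel

prefix, continuation = prefix_lm_split(words, 2)
print(prefix, continuation)
```

The de-noising target only contains the dropped spans, which is why a model trained this way does not produce free-running text; the prefix-LM pair instead asks the decoder to continue the input.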

See the TensorBoard log in the Training metrics tab.

Please note that we haven't fine-tuned the model on any downstream task.

Proposal

Participants

Useful links

Model size: 248M params (Safetensors, tensor type F32)