bengali-t5-base

bengali-t5-base is a model trained on the Bengali portion of the mT5 dataset. It uses the T5-base architecture.

The model was trained during the Flax/JAX Community Week, organized by HuggingFace, with TPU usage sponsored by Google.

The model was trained on roughly 11B tokens (batch size 64, sequence length 512 tokens, 350k steps).
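The token count follows from the three numbers in parentheses. The short check below is only a sketch: it assumes every sequence in every batch is filled to the full 512 tokens, so the result is an upper bound rather than an exact count. The variable names are illustrative.

```python
# Rough token budget, assuming every sequence is packed to the full length.
batch_size = 64      # sequences per step
seq_len = 512        # tokens per sequence
steps = 350_000      # training steps

tokens_seen = batch_size * seq_len * steps
print(f"{tokens_seen:,} tokens (~{tokens_seen / 1e9:.1f}B)")
# prints: 11,468,800,000 tokens (~11.5B)
```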

Load tokenizer

>>> import transformers
>>> tokenizer = transformers.AutoTokenizer.from_pretrained("flax-community/bengali-t5-base")
>>> tokenizer.encode("আমি বাংলার গান গাই")
[93, 1912, 814, 5995, 3, 1]
>>> tokenizer.decode([93, 1912, 814, 5995, 3, 1])
'আমি বাংলার গান গাই </s>'

Load model

>>> from transformers import T5Config, FlaxT5ForConditionalGeneration
>>> config = T5Config.from_pretrained("flax-community/bengali-t5-base")
>>> model = FlaxT5ForConditionalGeneration.from_pretrained("flax-community/bengali-t5-base", config=config)

The model was trained on a de-noising objective, following the scripts here and here. Currently, this model has no generation capability. If you want it to generate text, please fine-tune it on the prefix-LM objective described in the paper.
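To make the two objectives concrete, here is a minimal, word-level sketch of how training pairs could be built. It assumes T5-style span corruption with sentinel tokens named `<extra_id_N>` and a closing sentinel at the end of the target, as illustrated in the T5 paper. The functions `span_corrupt` and `prefix_lm_split` are hypothetical helpers written for this illustration. They are not part of the training scripts, and the real scripts work on token IDs with randomly sampled spans.

```python
from typing import List, Sequence, Tuple


def span_corrupt(
    tokens: Sequence[str], spans: Sequence[Tuple[int, int]]
) -> Tuple[List[str], List[str]]:
    """Build a de-noising (input, target) pair.

    `spans` are sorted, non-overlapping, half-open (start, end) index ranges.
    Each span is replaced by a sentinel in the input; the target lists each
    sentinel followed by the tokens it hides, then a closing sentinel.
    """
    inputs: List[str] = []
    targets: List[str] = []
    cursor = 0
    for i, (start, end) in enumerate(spans):
        sentinel = f"<extra_id_{i}>"  # assumed sentinel naming
        inputs.extend(tokens[cursor:start])
        inputs.append(sentinel)
        targets.append(sentinel)
        targets.extend(tokens[start:end])
        cursor = end
    inputs.extend(tokens[cursor:])
    targets.append(f"<extra_id_{len(spans)}>")  # closing sentinel
    return inputs, targets


def prefix_lm_split(
    tokens: Sequence[str], split: int
) -> Tuple[List[str], List[str]]:
    """Build a prefix-LM (input, target) pair: the model sees the prefix
    and is trained to produce the continuation."""
    return list(tokens[:split]), list(tokens[split:])


tokens = "আমি বাংলার গান গাই".split()

# De-noising: hide the second word.
inputs, targets = span_corrupt(tokens, [(1, 2)])
print(inputs)   # ['আমি', '<extra_id_0>', 'গান', 'গাই']
print(targets)  # ['<extra_id_0>', 'বাংলার', '<extra_id_1>']

# Prefix-LM: first two words in, last two words out.
prefix, continuation = prefix_lm_split(tokens, 2)
print(prefix)        # ['আমি', 'বাংলার']
print(continuation)  # ['গান', 'গাই']
```

The contrast shows why the pretrained checkpoint does not generate free text: under de-noising it only learns to fill in short masked spans, whereas the prefix-LM pairs train it to continue a prefix.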

See the TensorBoard log in the Training metrics tab.

Please note that we haven't fine-tuned the model on any downstream task.

Proposal

Project Proposal

Participants

Useful links

Model size: 248M params (Safetensors, tensor type F32)